# I am the Watcher. I am your guide through this vast new twtiverse.
#
# Usage:
# https://watcher.sour.is/api/plain/users View list of users and latest twt date.
# https://watcher.sour.is/api/plain/twt View all twts.
# https://watcher.sour.is/api/plain/mentions?uri=:uri View all mentions for uri.
# https://watcher.sour.is/api/plain/conv/:hash View all twts for a conversation subject.
#
# Options:
# uri Filter to show a specific users twts.
# offset Start index for quey.
# limit Count of items to return (going back in time).
#
# twt range = 1 1
# self = https://watcher.sour.is/conv/zrsuy6a
Beating GPT-4 on HumanEval with a fine-tuned CodeLlama-34B
Hi HN,
We have fine-tuned CodeLlama-34B and CodeLlama-34B-Python on an internal Phind dataset that achieved 67.6% and 69.5% pass@1 on HumanEval, respectively. GPT-4 achieved 67%. To ensure result validity, we applied OpenAI's decontamination methodology to our dataset.
The CodeLlama models released yesterday demonstrate impressive performance on HumanEval.
\\- CodeLlama-34B achieved 48.8% pass@1 on HumanEval
\\- CodeLlama-34B-Python achieved 53.7% pass@1 on H ... ⌘ Read more