Restructure opengpt-x tasks #86
Conversation
Reminder: Don't forget to move the tasks on branches that are not merged yet.

Commits:
- …patibility) ; --limit can now be a fraction of the dataset size
- …-evaluation-harness into organization-tasks
@katrinklug There seem to be inconsistencies with some of the arguments. E.g. you included `no_tokenizer_check` for the "hf" and "gpt2" models but forgot to also add it to the other model types, e.g. "hf-causal", "hf-causal-experimental", "gpt3", etc. I have added a temporary fix in ...thellmann1/workdir/lm_eval_setup/opengptx/lm-evaluation-harness/lm_eval/evaluator.py (from line 67), but this type of arg really belongs to the model args and needs to be handled differently:
```python
# this is just a temporary fix
if isinstance(model, str):
    if model_args is None:
        model_args = ""
    additional_args = {
        "batch_size": batch_size,
        "device": device,
    }
    if model in ("hf", "gpt2"):
        additional_args["no_tokenizer_check"] = no_tokenizer_check
    lm = lm_eval.models.get_model(model).create_from_arg_string(
        model_args,
        additional_args,
    )
```
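One way to avoid the per-model special case (a rough sketch, not a drop-in change: it assumes the relevant model constructors accept `no_tokenizer_check`, and note that older harness versions keep `--model_args` values as strings rather than parsing booleans) would be to pass the flag through `model_args`, since `create_from_arg_string` already parses that string and forwards every key to the model's `__init__`:

```python
# Sketch: treat no_tokenizer_check as a regular model arg instead of
# special-casing it per model type in evaluator.py. create_from_arg_string
# parses the "key=value,..." string and forwards all keys to the model's
# __init__, so any model class that accepts the flag picks it up.
import lm_eval.models

model = "hf-causal"  # works the same for "hf", "gpt2", "gpt3", ...
model_args = "pretrained=gpt2,no_tokenizer_check=True"  # flag lives in model_args

lm = lm_eval.models.get_model(model).create_from_arg_string(
    model_args,
    {"batch_size": 8, "device": "cuda"},
)
```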
@katrinklug #87
I have implemented a patch for this. Can you please apply it? You'll find it in lm_eval_setup/0001-Pass-trust_remote_code-from-model_args-to-AutoTokeni.patch on the lm_eval_setup branch, e.g.:

```
git am ~/path/to/evaluation/repo/lm_eval_setup/0001-Pass-trust_remote_code-from-model_args-to-AutoTokeni.patch
```
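For readers who can't open the patch: judging by its title, it forwards the trust_remote_code value that is already parsed from --model_args to the tokenizer load as well. A rough sketch of the idea (the variable names are assumptions, not the literal diff):

```python
# Sketch of the patch's intent: pass trust_remote_code (already honored
# when loading the model from --model_args) on to AutoTokenizer too, so
# tokenizers that ship custom code can be loaded.
from transformers import AutoModelForCausalLM, AutoTokenizer

pretrained = "some/remote-code-model"  # hypothetical model id
trust_remote_code = True               # as parsed from --model_args

model = AutoModelForCausalLM.from_pretrained(
    pretrained, trust_remote_code=trust_remote_code
)
tokenizer = AutoTokenizer.from_pretrained(
    pretrained,
    trust_remote_code=trust_remote_code,  # the addition the patch makes
)
```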
Restructuring the OpenGPT-X tasks so that they live in their own folder. This gives a better overview of the tasks and makes it easier to incorporate upstream changes.