
Commit 2a5e49f (parent fd0ae5a): Update README.md
1 file changed: README.md (10 additions, 18 deletions)
@@ -13,7 +13,7 @@ This project provides a unified framework to test generative language models on
 - Evaluation with publicly available prompts ensures reproducibility and comparability between papers.
 - Easy support for custom prompts and evaluation metrics.

-The Language Model Evaluation Harness is the backend for 🤗 Hugging Face's popular [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard), has been used in [hundreds of papers](https://scholar.google.com/scholar?oi=bibs&hl=en&authuser=2&cites=15052937328817631261,4097184744846514103,17476825572045927382,18443729326628441434,12854182577605049984) is used internally by dozens of companies including NVIDIA, Cohere, Booz Allen Hamilton, and Mosaic ML.
+The Language Model Evaluation Harness is the backend for 🤗 Hugging Face's popular [Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard), has been used in [hundreds of papers](https://scholar.google.com/scholar?oi=bibs&hl=en&authuser=2&cites=15052937328817631261,4097184744846514103,17476825572045927382,18443729326628441434,12854182577605049984), and is used internally by dozens of companies including NVIDIA, Cohere, Nous Research, Booz Allen Hamilton, and Mosaic ML.

 ## Install

@@ -47,8 +47,7 @@ We also provide a number of optional dependencies for . Extras can be installed
 To evaluate a model hosted on the [HuggingFace Hub](https://huggingface.co/models) (e.g. GPT-J-6B) on `hellaswag` you can use the following command:

 ```bash
-lm_eval \
-    --model hf \
+lm_eval --model hf \
     --model_args pretrained=EleutherAI/gpt-j-6B \
     --tasks hellaswag \
     --device cuda:0 \
@@ -58,8 +57,7 @@ lm_eval \
 Additional arguments can be provided to the model constructor using the `--model_args` flag. Most notably, this supports the common practice of using the `revisions` feature on the Hub to store partially trained checkpoints, or to specify the datatype for running a model:

 ```bash
-lm_eval \
-    --model hf \
+lm_eval --model hf \
     --model_args pretrained=EleutherAI/pythia-160m,revision=step100000,dtype="float" \
     --tasks lambada_openai,hellaswag \
     --device cuda:0 \
@@ -71,8 +69,7 @@ Models that are loaded via both `transformers.AutoModelForCausalLM` (autoregress
 Batch size selection can be automated by setting the `--batch_size` flag to `auto`. This will perform automatic detection of the largest batch size that will fit on your device. On tasks where there is a large difference between the longest and shortest example, it can be helpful to periodically recompute the largest batch size, to gain a further speedup. To do this, append `:N` to the above flag to automatically recompute the largest batch size `N` times. For example, to recompute the batch size 4 times, the command would be:

 ```bash
-lm_eval \
-    --model hf \
+lm_eval --model hf \
     --model_args pretrained=EleutherAI/pythia-160m,revision=step100000,dtype="float" \
     --tasks lambada_openai,hellaswag \
     --device cuda:0 \
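
The hunk above is truncated before the batch-size flag itself. Based on the `:N` syntax described in the paragraph, a recompute-4-times invocation might look like the following sketch (the leading flags are copied from the hunk; the final line is an assumption, since it falls outside the diff context):

```bash
# Sketch: appending ":4" to --batch_size auto asks the harness to
# re-detect the largest fitting batch size 4 times during the run.
lm_eval --model hf \
    --model_args pretrained=EleutherAI/pythia-160m,revision=step100000,dtype="float" \
    --tasks lambada_openai,hellaswag \
    --device cuda:0 \
    --batch_size auto:4
```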
@@ -81,16 +78,15 @@ lm_eval \

 Alternatively, you can use `lm-eval` instead of `lm_eval`.

-> ![Note]
+> [!Note]
 > Just like you can provide a local path to `transformers.AutoModel`, you can also provide a local path to `lm_eval` via `--model_args pretrained=/path/to/model`

 #### Multi-GPU Evaluation with Hugging Face `accelerate`

 To parallelize evaluation of HuggingFace models across multiple GPUs, we leverage the [accelerate 🚀](https://github.com/huggingface/accelerate) library as follows:

 ```
-accelerate launch -m lm_eval \
-    --model hf \
+accelerate launch -m lm_eval --model hf \
     --tasks lambada_openai,arc_easy \
     --batch_size 16
 ```
@@ -115,8 +111,7 @@ accelerate launch --no_python lm-eval --model ...
 We also support vLLM for faster inference on [supported model types](https://docs.vllm.ai/en/latest/models/supported_models.html).

 ```bash
-lm_eval \
-    --model vllm \
+lm_eval --model vllm \
     --model_args pretrained={model_name},tensor_parallel_size={number of GPUs to use},dtype=auto,gpu_memory_utilization=0.8 \
     --tasks lambada_openai \
     --batch_size auto
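
The vLLM command above uses `{...}` template placeholders. A concrete instantiation could look like the sketch below; the model name and GPU count here are hypothetical choices for illustration, not values from the source:

```bash
# Sketch: placeholders filled with example values.
# tensor_parallel_size=2 assumes a 2-GPU machine.
lm_eval --model vllm \
    --model_args pretrained=EleutherAI/pythia-160m,tensor_parallel_size=2,dtype=auto,gpu_memory_utilization=0.8 \
    --tasks lambada_openai \
    --batch_size auto
```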
@@ -177,8 +172,7 @@ If you have a Metal compatible Mac, you can run the eval harness using the MPS b
 To verify the data integrity of the tasks you're performing in addition to running the tasks themselves, you can use the `--check_integrity` flag:

 ```bash
-lm_eval \
-    --model openai \
+lm_eval --model openai \
     --model_args engine=davinci \
     --tasks lambada_openai,hellaswag \
     --check_integrity
@@ -188,8 +182,7 @@ lm_eval \

 For models loaded with the HuggingFace `transformers` library, any arguments provided via `--model_args` get passed to the relevant constructor directly. This means that anything you can do with `AutoModel` can be done with our library. For example, you can pass a local path via `pretrained=` or use models finetuned with [PEFT](https://github.com/huggingface/peft) by taking the call you would run to evaluate the base model and add `,peft=PATH` to the `model_args` argument:
 ```bash
-lm_eval \
-    --model hf \
+lm_eval --model hf \
     --model_args pretrained=EleutherAI/gpt-j-6b,parallelize=True,load_in_4bit=True,peft=nomic-ai/gpt4all-j-lora \
     --tasks openbookqa,arc_easy,winogrande,hellaswag,arc_challenge,piqa,boolq \
     --device cuda:0
@@ -198,8 +191,7 @@ lm_eval \
 [GPTQ](https://github.com/PanQiWei/AutoGPTQ) quantized models can be loaded by specifying their file names in `,gptq=NAME` (or `,gptq=True` for default names) in the `model_args` argument:

 ```bash
-lm_eval \
-    --model hf \
+lm_eval --model hf \
     --model_args pretrained=model-name-or-path,gptq=model.safetensors,gptq_use_triton=True \
     --tasks hellaswag
 ```
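
The paragraph above also mentions `,gptq=True` for default file names; that variant (a sketch, with the model path left as the README's placeholder) would look like:

```bash
# Sketch: gptq=True tells the loader to look for the default
# quantized-weight file names instead of an explicit path.
lm_eval --model hf \
    --model_args pretrained=model-name-or-path,gptq=True \
    --tasks hellaswag
```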
