
Commit dbc89eb

Signed-off-by: SumanthRH <[email protected]>
1 parent 834dbf7 commit dbc89eb

File tree

2 files changed: +8 additions, -2 deletions


skythought/evals/README.md

Lines changed: 7 additions & 1 deletion
````diff
@@ -43,6 +43,12 @@ skythought evaluate --model Qwen/QwQ-32B-Preview --task aime --backend ray --bac
 
 By default, we make use of the configuration in [ray_configs/ray_config.yaml](./ray_configs/ray_config.yaml). You can also customize the following parameters for ray:
 
+- `tensor_parallel_size`: Tensor Parallel Size per replica. Defaults to 4.
+- `accelerator_type`: GPU accelerator type. For more information see the list of available types: https://docs.ray.io/en/latest/ray-core/accelerator-types.html. Defaults to None (uses any GPUs available in the ray cluster)
+- `num_replicas`: Number of model replicas to use for inference. Defaults to 2.
+- `batch_size`: Batch size per model replica for inference.
+- `gpu_memory_utilization`: The fraction of GPU memory to be used for vLLM's model executor. Defaults to 0.9
+- `dtype`: Data type for inference. (Defaults to "auto")
 
 ### Optimized settings for 32B and 7B models
 
@@ -54,7 +60,7 @@ For 32B models, we recommend using the default backend configuration for best pe
 skythought evaluate --model Qwen/QwQ-32B-Preview --task aime24 --backend ray --result-dir ./
 ```
 
-For 7B models, we recommend using `tensor_parallel_size=1` and `num_replicas=8` for best performance. FOr example, the previous command will change to:
+For 7B models, we recommend using `tensor_parallel_size=1` and `num_replicas=8` for best performance. For example, the previous command will change to:
 
 ```shell
 skythought evaluate --model Qwen/Qwen2-7B-Instruct --task math500 --backend ray --backend-args tensor_parallel_size=1,num_replicas=8 --result-dir ./
````
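The `--backend-args` flag in the commands above packs several of the documented ray parameters into a single comma-separated `key=value` string. As an illustration only, with `parse_backend_args` being a hypothetical helper rather than skythought's actual parser, splitting such a string into a dict of overrides might look like:

```python
def parse_backend_args(arg_string):
    """Split a comma-separated key=value string (the --backend-args
    format, e.g. "tensor_parallel_size=1,num_replicas=8") into a dict.
    Hypothetical helper for illustration; values stay as strings."""
    config = {}
    for pair in arg_string.split(","):
        key, _, value = pair.partition("=")
        config[key.strip()] = value.strip()
    return config

overrides = parse_backend_args("tensor_parallel_size=1,num_replicas=8")
print(overrides)  # {'tensor_parallel_size': '1', 'num_replicas': '8'}
```

Each key here corresponds to one of the parameters listed in the README excerpt above (`tensor_parallel_size`, `num_replicas`, and so on), which override the values from `ray_configs/ray_config.yaml`.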

skythought/evals/ray_configs/ray_config.yaml

Lines changed: 1 addition & 1 deletion
```diff
@@ -1,5 +1,5 @@
 llm_engine: vllm # currently only vllm supported
-accelerator_type: H100 # accelerator name as specified here: https://docs.ray.io/en/master/ray-core/accelerator-types.html#accelerator-types
+accelerator_type: null # accelerator name as specified here: https://docs.ray.io/en/master/ray-core/accelerator-types.html#accelerator-types
 engine_kwargs: # vllm engine kwargs
   tensor_parallel_size: 4
   gpu_memory_utilization: 0.9
```
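Combining the defaults listed in the README with the change above, a full `ray_config.yaml` might look like the following sketch. Only the lines shown in the diff are confirmed; the placement of the remaining keys and the `batch_size` value are assumptions for illustration.

```yaml
llm_engine: vllm          # currently only vllm supported
accelerator_type: null    # null = use any GPUs available in the Ray cluster
num_replicas: 2           # assumed top-level key; README default is 2
batch_size: 128           # illustrative value; the README lists no default
engine_kwargs:            # vllm engine kwargs
  tensor_parallel_size: 4
  gpu_memory_utilization: 0.9
  dtype: auto             # assumed to live under engine_kwargs
```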
