
Commit a23fb5b: [release] v0.3.2 (#502)

Parent: f61f77b

3 files changed (+24, -24 lines)


README.md (2 additions, 0 deletions)

````diff
@@ -56,6 +56,8 @@ apptainer pull easyr1.sif docker://hiyouga/verl:ngc-th2.7.1-cu12.6-vllm0.10.0
 apptainer shell --nv --cleanenv --bind /mnt/your_dir:/mnt/your_dir easyr1.sif
 ```
 
+Use `USE_MODELSCOPE_HUB=1` to download models from the ModelScope hub.
+
 ### Hardware Requirements
 
 \* *estimated*
````

assets/baselines.md (21 additions, 23 deletions)

````diff
@@ -1,29 +1,28 @@
 # Baselines
 
-Environment: [hiyouga/verl:ngc-th2.6.0-cu126-vllm0.8.3-flashinfer0.2.2-cxx11abi0](https://hub.docker.com/layers/hiyouga/verl/ngc-th2.6.0-cu126-vllm0.8.3-flashinfer0.2.2-cxx11abi0/images/sha256-335ed6cd1fe73090e458409cfa4394d6abf4cd0503ca44dbafdc28ff72e5ed20)
+Environment: [hiyouga/verl:ngc-th2.7.1-cu12.6-vllm0.10.0](https://hub.docker.com/layers/hiyouga/verl/ngc-th2.7.1-cu12.6-vllm0.10.0/images/sha256-cfc8c1ce3ea52dee0444f3e58e900d0b1d3b6b315deaf5f58c44b5fbb52fa989)
 
-EasyR1 version: [v0.3.0](https://github.com/hiyouga/EasyR1/tree/v0.3.0)
+EasyR1 version: [v0.3.2](https://github.com/hiyouga/EasyR1/tree/v0.3.2)
 
 Welcome to contribute new data points!
 
 ## Algorithm Baselines
 
 ### [Qwen2.5-Instruct](https://huggingface.co/Qwen/Qwen2.5-7B-Instruct) on [Math12k](https://huggingface.co/datasets/hiyouga/math12k)
 
-| Size | Algorithm   | Bits | LR   | KL   | Test Score |
-| ---- | ----------- | ---- | ---- | ---- | ---------- |
-| 7B   | GRPO        | AMP  | 1e-6 | 1e-2 | 0.73->0.79 |
+| Size | Algorithm   | Bits | LR   | KL   | Test Accuracy        |
+| ---- | ----------- | ---- | ---- | ---- | -------------------- |
+| 7B   | GRPO        | AMP  | 1e-6 | 1e-2 | 0.75 -> 0.77 (+0.02) |
 
 ### [Qwen2.5-VL-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct) on [Geometry3k](https://huggingface.co/datasets/hiyouga/geometry3k)
 
-| Size | Algorithm   | Bits | LR   | KL   | Test Score |
-| ---- | ----------- | ---- | ---- | ---- | ---------- |
-| 7B   | GRPO        | AMP  | 1e-6 | 1e-2 | 0.39->0.52 |
-| 7B   | GRPO        | BF16 | 1e-6 | 1e-2 | 0.39->0.52 |
-| 7B   | GRPO        | AMP  | 1e-6 | 1e-3 | 0.39->0.52 |
-| 7B   | RLOO        | AMP  | 1e-6 | 1e-2 | 0.39->0.53 |
-| 3B   | GRPO        | AMP  | 1e-6 | 1e-2 | 0.27->0.44 |
-| 32B  | GRPO        | BF16 | 1e-6 | 1e-2 | 0.46->0.61 |
+| Size | Algorithm   | Bits | LR   | KL   | Test Accuracy        |
+| ---- | ----------- | ---- | ---- | ---- | -------------------- |
+| 7B   | GRPO        | AMP  | 1e-6 | 1e-2 | 0.37 -> 0.48 (+0.11) |
+| 7B   | GRPO        | BF16 | 1e-6 | 1e-2 | 0.37 -> 0.48 (+0.11) |
+| 7B   | DAPO        | AMP  | 1e-6 | 1e-2 | 0.37 -> 0.50 (+0.13) |
+| 3B   | GRPO        | AMP  | 1e-6 | 1e-2 | 0.24 -> 0.38 (+0.14) |
+| 32B  | GRPO        | BF16 | 1e-6 | 1e-2 | 0.50 -> 0.56 (+0.06) |
 
 > [!NOTE]
 > The hyper-parameters not listed are all the same as the default values.
@@ -32,21 +31,20 @@ Welcome to contribute new data points!
 
 ### [Qwen2.5-VL-Instruct](https://huggingface.co/Qwen/Qwen2.5-VL-7B-Instruct) on [Geometry3k](https://huggingface.co/datasets/hiyouga/geometry3k)
 
-| Size | GPU Type      | Bits | Batch Size | vLLM Util | vLLM TP | Peak Mem | Peak VRAM | Throughput | Sec per step | Actor MFU |
-| ---- | ------------- | ---- | ---------- | --------- | ------- | -------- | --------- | ---------- | ------------ | --------- |
-| 3B   | 8 * H100 80GB | AMP  | 4 / 16     | 0.6       | 2       | 120GB    | 35GB      | 1200       | 180s         | 6.3%      |
-| 7B   | 8 * H100 80GB | AMP  | 4 / 16     | 0.6       | 2       | 140GB    | 60GB      | 1200       | 180s         | 13.6%     |
-| 7B   | 8 * H100 80GB | AMP  | 10 / 20    | 0.6       | 2       | 150GB    | 75GB      | 1400       | 170s         | 19.2%     |
-| 7B   | 8 * L20 48GB  | AMP  | 4 / 16     | 0.6       | 2       | 150GB    | 44GB      | 410        | 580s         | 26.5%     |
-| 7B   | 8 * H100 80GB | BF16 | 4 / 16     | 0.6       | 2       | 150GB    | 50GB      | 1280       | 190s         | 13.9%     |
-| 32B  | 8 * H100 80GB | BF16 | 1 / 8      | 0.6       | 8       | 240GB    | 68GB      | 360        | 860s         | 11.2%     |
+| Size | GPU Type      | Bits | Batch Size | vLLM TP | Peak Mem | Peak VRAM | Throughput  | Sec per step | Actor MFU |
+| ---- | ------------- | ---- | ---------- | ------- | -------- | --------- | ----------- | ------------ | --------- |
+| 3B   | 8 * H100 80GB | AMP  | 1 / 2      | 2       | 120GB    | 54GB      | 1800 (+600) | 120s         | 8.1%      |
+| 7B   | 8 * H100 80GB | AMP  | 1 / 2      | 2       | 120GB    | 68GB      | 1600 (+400) | 145s         | 16.0%     |
+| 7B   | 8 * H100 80GB | AMP  | 4 / 8      | 2       | 200GB    | 72GB      | 2000 (+600) | 120s         | 23.2%     |
+| 7B   | 8 * L20 48GB  | AMP  | 1 / 2      | 2       | 120GB    | 42GB      | 410 (+0)    | 580s         | 26.5%     |
+| 7B   | 8 * H100 80GB | BF16 | 1 / 2      | 2       | 120GB    | 58GB      | 1600 (+320) | 145s         | 16.0%     |
+| 32B  | 8 * H100 80GB | BF16 | 1 / 2      | 8       | 260GB    | 72GB      | 620 (+260)  | 530s         | 25.8%     |
 
 - Batch Size: micro_batch_size_per_device_for_update / micro_batch_size_per_device_for_experience
-- vLLM Util: rollout.gpu_memory_utilization
 - vLLM TP: rollout.tensor_parallel_size
 - Peak Mem: Peak CPU memory usage
 - Peak VRAM: Peak GPU memory usage
-- Throughput: Number of tokens per second per GPU by one training step
+- Throughput: Number of tokens per second per GPU by one training step (including the improvement compared to the previous version)
 - Sec per step: Average time per step in seconds
 
 > [!NOTE]
````
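As a sanity check on the "Throughput" and "Sec per step" columns, tokens per second per GPU can be recovered from the wall time of one step and the total tokens it processes. The helper below is our own back-of-envelope sketch (not code from EasyR1), and the example token count is an assumed figure chosen to illustrate the arithmetic:

```python
def tokens_per_sec_per_gpu(tokens_per_step: int, sec_per_step: float, num_gpus: int) -> float:
    """Throughput in the sense used by the table: tokens processed in one
    training step, divided by step wall time and GPU count."""
    return tokens_per_step / (sec_per_step * num_gpus)

# Assumed example: if one step processes ~1.92M tokens across 8 GPUs in 120s,
# that corresponds to 2000 tokens/sec/GPU, the scale of the 7B AMP 4/8 row.
print(tokens_per_sec_per_gpu(1_920_000, 120, 8))  # 2000.0
```

Read the other direction, a row's throughput times its step time and GPU count gives a rough estimate of tokens processed per step.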

verl/__init__.py (1 addition, 1 deletion)

```diff
@@ -21,7 +21,7 @@
 from modelscope.utils.hf_util import patch_hub  # type: ignore
 
 
-__version__ = "0.3.2.dev0"
+__version__ = "0.3.2"
 
 
 if os.getenv("USE_MODELSCOPE_HUB", "0").lower() in ["true", "y", "1"]:
```
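The version bump sits next to the `USE_MODELSCOPE_HUB` switch that the README change documents. The truthy-string check visible in the diff context can be sketched in isolation; the helper name below is ours, not part of verl:

```python
def use_modelscope_hub(env: dict) -> bool:
    # Mirrors the check shown in verl/__init__.py: the flag is enabled
    # when USE_MODELSCOPE_HUB is "true", "y", or "1" (case-insensitive),
    # and defaults to "0" (disabled) when unset.
    return env.get("USE_MODELSCOPE_HUB", "0").lower() in ["true", "y", "1"]

print(use_modelscope_hub({"USE_MODELSCOPE_HUB": "1"}))     # True
print(use_modelscope_hub({"USE_MODELSCOPE_HUB": "True"}))  # True
print(use_modelscope_hub({"USE_MODELSCOPE_HUB": "yes"}))   # False: only "y" matches
print(use_modelscope_hub({}))                              # False: unset defaults to "0"
```

In the real module the same expression is evaluated against `os.environ` via `os.getenv`, so exporting `USE_MODELSCOPE_HUB=1` before launching is enough to route model downloads through the ModelScope hub.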
