
Commit 7e82c63

update link
1 parent 029f00a commit 7e82c63

4 files changed: +14 -14 lines changed


docs/en/programming/vllm.md

Lines changed: 6 additions & 6 deletions

@@ -6,7 +6,7 @@ For now, we enable vLLM to accelerate policy generation.
 
 ## Model Definition
 
-Similar to inheriting `MegatronModule` for implementing [PolicyInference Model](../../../examples/megatron/models/old_policy_inference.py), the vLLM backend can be enabled by inheriting the `VLLMModule` class and implementing the following key modules:
+Similar to inheriting `MegatronModule` for implementing [PolicyInference Model](https://github.com/alibaba/ChatLearn/blob/main/examples/megatron/models/old_policy_inference.py), the vLLM backend can be enabled by inheriting the `VLLMModule` class and implementing the following key modules:
 - model_provider: model definition function.
 - setup: call model_provider to define the model. Optionally, call `load_checkpoint` or others.
 - build_dataset: preprocess the train/eval dataset with the vLLM tokenizer.
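As a rough skeleton of the subclass that the three hooks above sit on, a minimal sketch follows; the class name comes from the hunks below, but the import path, signatures, and method bodies are assumptions, not ChatLearn's actual code:

```python
# Sketch only: import path and bodies are assumed, not copied from ChatLearn.
from chatlearn import VLLMModule  # assumed import location


class VLLMPolicyInference(VLLMModule):
    def model_provider(self):
        # Model definition function: build and return the inference model.
        raise NotImplementedError

    def setup(self):
        # Call model_provider to define the model; optionally load a checkpoint.
        self.model = self.model_provider()

    def build_dataset(self, prompts, is_eval=False):
        # Preprocess the train/eval prompts with the vLLM tokenizer.
        raise NotImplementedError
```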
@@ -48,9 +48,9 @@ class VLLMPolicyInference(VLLMModule):
         pass
 ```
 
-You can refer to [vllm_policy_inference.py](../../../examples/megatron/models/vllm_policy_inference.py), in which build_dataset/_add_request/forward_step/decode_internal are clarified as follows:
+You can refer to [vllm_policy_inference.py](https://github.com/alibaba/ChatLearn/blob/main/examples/megatron/models/vllm_policy_inference.py), in which build_dataset/_add_request/forward_step/decode_internal are clarified as follows:
 
-- build_dataset: using the `tokenizer`, you only need to return prompt_ids and the prompt string. The [VLLMPromptPipeline](../../../examples/megatron/data/prompt_dataset.py#141) used in `build_dataset` is shown below:
+- build_dataset: using the `tokenizer`, you only need to return prompt_ids and the prompt string. The [VLLMPromptPipeline](https://github.com/alibaba/ChatLearn/blob/main/examples/megatron/data/prompt_dataset.py#141) used in `build_dataset` is shown below:
 ```python
 class VLLMPromptPipeline(PromptPipeline):
     def __init__(self, prompts: List[str], max_prompt_length: int, tokenizer=None):
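The hunk cuts off at the constructor; a minimal sketch of how the body might continue, assuming `PromptPipeline` behaves like a torch `Dataset` (the truncation side and item format are guesses, not the repo's exact logic):

```python
from typing import List


class VLLMPromptPipeline(PromptPipeline):  # PromptPipeline: dataset base class from the repo
    def __init__(self, prompts: List[str], max_prompt_length: int, tokenizer=None):
        # Keep only the last max_prompt_length tokens of each encoded prompt,
        # paired with the original prompt string.
        self.prompts = [
            (tokenizer.encode(prompt)[-max_prompt_length:], prompt)
            for prompt in prompts
        ]

    def __getitem__(self, index):
        prompt_ids, prompt_str = self.prompts[index]
        # Per the doc: only prompt_ids and the prompt string are needed.
        return {"input_ids": prompt_ids, "prompt": prompt_str}

    def __len__(self):
        return len(self.prompts)
```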
@@ -108,7 +108,7 @@ class VLLMPolicyInference(VLLMModule):
         return self._forward_step(data, iteration, eval_mode=False)
 ```
 
-- decode_internal: refer to [examples](../../../examples/megatron/models/vllm_policy_inference.py#L119) for more details. The param `batched_outputs` has the format List[RequestOutput], in which [RequestOutput](https://github.com/vllm-project/vllm/blob/v0.5.1/vllm/outputs.py#L67) includes the following key attributes:
+- decode_internal: refer to [examples](https://github.com/alibaba/ChatLearn/blob/main/examples/megatron/models/vllm_policy_inference.py#L119) for more details. The param `batched_outputs` has the format List[RequestOutput], in which [RequestOutput](https://github.com/vllm-project/vllm/blob/v0.5.1/vllm/outputs.py#L67) includes the following key attributes:
 
 | Attribute | Type | Comment |
 |:------:|:-----:|:-----:|
@@ -140,7 +140,7 @@ policy:
     ...
 ```
 
-Or you can refer to [llama2 model yaml](../../../examples/megatron/configs/llama2/vllm_rlhf.yaml).
+Or you can refer to [llama2 model yaml](https://github.com/alibaba/ChatLearn/blob/main/examples/megatron/configs/llama2/vllm_rlhf.yaml).
 
 ## hyperparameter configuration yaml
 
@@ -186,4 +186,4 @@ Hyperparameter for vLLM can be divided into 5 parts:
 - Others: `includes` specifies model structure.
 
 
-You can refer to [vLLM Hyperparameter Configuration](../../../examples/megatron/configs/llama2/vllm_policy_inference.yaml) for details.
+You can refer to [vLLM Hyperparameter Configuration](https://github.com/alibaba/ChatLearn/blob/main/examples/megatron/configs/llama2/vllm_policy_inference.yaml) for details.
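Before the next file, a hedged sketch of what a decode_internal over `List[RequestOutput]` can look like; `prompt_token_ids` and `outputs[0].token_ids` are real RequestOutput/CompletionOutput attributes in vLLM v0.5.1, while the returned dict keys are illustrative assumptions:

```python
def decode_internal(self, batched_outputs):
    """batched_outputs: List[vllm.RequestOutput] (see the attribute table above)."""
    prompt_ids, response_ids = [], []
    for request_output in batched_outputs:
        # Token ids of the prompt that was fed to vLLM.
        prompt_ids.append(request_output.prompt_token_ids)
        # outputs[0] is the first sampled completion; token_ids holds the
        # generated tokens (outputs[0].text would give the detokenized string).
        response_ids.append(request_output.outputs[0].token_ids)
    # Illustrative output format; the real contract is defined in
    # examples/megatron/models/vllm_policy_inference.py#L119.
    return {"prompt_ids": prompt_ids, "response_ids": response_ids}
```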

docs/en/tutorial/ems.md

Lines changed: 1 addition & 1 deletion

@@ -26,4 +26,4 @@ Alternatively, it can also be configured in the training script using environment variables
 - PPO policy model: `export free_memory_ppo_policy=True`
 - PPO value model: `export free_memory_ppo_value=True`
 
-A complete example can be found in the [llama2 configuration](../../../examples/megatron/configs/llama2/rlhf.yaml).
+A complete example can be found in the [llama2 configuration](https://github.com/alibaba/ChatLearn/blob/main/examples/megatron/configs/llama2/rlhf.yaml).
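The same two switches can also be set from Python before launching training; a minimal equivalent of the `export` lines above, using only the two variables shown in this hunk:

```python
import os

# Equivalent to `export free_memory_ppo_policy=True` and
# `export free_memory_ppo_value=True` in the training script.
os.environ["free_memory_ppo_policy"] = "True"
os.environ["free_memory_ppo_value"] = "True"
```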

docs/zh/programming/vllm.md

Lines changed: 6 additions & 6 deletions

@@ -6,7 +6,7 @@ ChatLearn supports vLLM for cross-machine distributed inference, supporting vllm and the training backend
 
 ## Model Definition
 
-Similar to inheriting `MegatronModule` to implement the [PolicyInference model](../../../examples/megatron/models/old_policy_inference.py), a PolicyInference model that should run generation on the vLLM backend needs to inherit from the `VLLMModule` parent class and implement the following key modules:
+Similar to inheriting `MegatronModule` to implement the [PolicyInference model](https://github.com/alibaba/ChatLearn/blob/main/examples/megatron/models/old_policy_inference.py), a PolicyInference model that should run generation on the vLLM backend needs to inherit from the `VLLMModule` parent class and implement the following key modules:
 - model_provider: model definition function.
 - setup: calls model_provider to define the model; optionally load_checkpoint and so on, as needed.
 - build_dataset: calls the vLLM tokenizer to process the data and produce the prompt dataset.
@@ -48,9 +48,9 @@ class VLLMPolicyInference(VLLMModule):
         pass
 ```
 
-For an example, see [vllm_policy_inference.py](../../../examples/megatron/models/vllm_policy_inference.py); build_dataset, _add_request, forward_step, and decode_internal are explained below:
+For an example, see [vllm_policy_inference.py](https://github.com/alibaba/ChatLearn/blob/main/examples/megatron/models/vllm_policy_inference.py); build_dataset, _add_request, forward_step, and decode_internal are explained below:
 
-- build_dataset: the tokenizer call only needs to return prompt_ids and the prompt str; the concrete logic of the [VLLMPromptPipeline](../../../examples/megatron/data/prompt_dataset.py#141) used in build_dataset is as follows:
+- build_dataset: the tokenizer call only needs to return prompt_ids and the prompt str; the concrete logic of the [VLLMPromptPipeline](https://github.com/alibaba/ChatLearn/blob/main/examples/megatron/data/prompt_dataset.py#141) used in build_dataset is as follows:
 ```python
 class VLLMPromptPipeline(PromptPipeline):
     def __init__(self, prompts: List[str], max_prompt_length: int, tokenizer=None):
@@ -108,7 +108,7 @@ class VLLMPolicyInference(VLLMModule):
         return self._forward_step(data, iteration, eval_mode=False)
 ```
 
-- decode_internal: see [examples](../../../examples/megatron/models/vllm_policy_inference.py#L119) for a reference implementation. The parameter batched_outputs has the format List[RequestOutput], where [RequestOutput](https://github.com/vllm-project/vllm/blob/v0.5.1/vllm/outputs.py#L67) contains the following important attributes:
+- decode_internal: see [examples](https://github.com/alibaba/ChatLearn/blob/main/examples/megatron/models/vllm_policy_inference.py#L119) for a reference implementation. The parameter batched_outputs has the format List[RequestOutput], where [RequestOutput](https://github.com/vllm-project/vllm/blob/v0.5.1/vllm/outputs.py#L67) contains the following important attributes:
 
 | Attribute | Type | Meaning |
 |:------:|:-----:|:-----:|
@@ -138,7 +138,7 @@ policy:
     model_config_file: vllm_policy_inference.yaml
     ...
 ```
-You can also refer to the example [llama2 model configuration](../../../examples/megatron/configs/llama2/vllm_rlhf.yaml).
+You can also refer to the example [llama2 model configuration](https://github.com/alibaba/ChatLearn/blob/main/examples/megatron/configs/llama2/vllm_rlhf.yaml).
 
 ## Hyperparameter Configuration
 
@@ -182,4 +182,4 @@ vLLM hyperparameters can be divided into five parts:
 - tokenizer: the directory the vLLM tokenizer loads from; see [LLama2-7B-hf](https://huggingface.co/meta-llama/Llama-2-7b)
 - Others: `includes` specifies the remaining parameters such as the model structure;
 
-See the [vLLM hyperparameter configuration](../../../examples/megatron/configs/llama2/vllm_policy_inference.yaml) for details.
+See the [vLLM hyperparameter configuration](https://github.com/alibaba/ChatLearn/blob/main/examples/megatron/configs/llama2/vllm_policy_inference.yaml) for details.
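To connect the pieces above, a hedged sketch of a build_dataset that returns the VLLMPromptPipeline described in this file; the signature, the `self.tokenizer` attribute, and the max_prompt_length value are illustrative assumptions, not the repo's exact code:

```python
def build_dataset(self, prompts, is_eval=False):
    # Tokenize prompts into (prompt_ids, prompt string) pairs for generation.
    max_prompt_length = 512  # illustrative value; take it from your config
    return VLLMPromptPipeline(prompts, max_prompt_length, self.tokenizer)
```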

docs/zh/tutorial/ems.md

Lines changed: 1 addition & 1 deletion

@@ -29,4 +29,4 @@ policy:
 - ppo_policy model: `export free_memory_ppo_policy=True`
 - ppo_value model: `export free_memory_ppo_value=True`
 
-A complete example can be found in the [llama2 configuration](../../../examples/megatron/configs/llama2/rlhf.yaml).
+A complete example can be found in the [llama2 configuration](https://github.com/alibaba/ChatLearn/blob/main/examples/megatron/configs/llama2/rlhf.yaml).
