Commit 5b4dae2

committed
update readme
1 parent 2b798c0 commit 5b4dae2

2 files changed: +21 -18 lines changed

README.md (+15 -13)
@@ -58,6 +58,8 @@ Currently, the features supported by OpenRL include:

 - Reinforcement learning training support for natural language tasks (such as dialogue)

+- Support [DeepSpeed](https://github.com/microsoft/DeepSpeed)
+
 - Support [Arena](https://openrl-docs.readthedocs.io/en/latest/arena/index.html), which allows convenient evaluation of
 various agents (even submissions for [JiDi](https://openrl-docs.readthedocs.io/en/latest/arena/index.html#performing-local-evaluation-of-agents-submitted-to-the-jidi-platform-using-openrl)) in a competitive environment.

@@ -160,19 +162,19 @@ Here we provide a table for the comparison of OpenRL and existing popular RL lib
 OpenRL employs a modular design and high-level abstraction, allowing users to accomplish training for various tasks
 through a unified and user-friendly interface.

-| Library | NLP/RLHF | Multi-agent | Self-Play Training | Offline RL | Bilingual Document |
-|:------------------------------------------------------------------:|:------------------:|:--------------------:|:--------------------:|:------------------:|:------------------:|
-| **[OpenRL](https://github.com/OpenRL-Lab/openrl)** | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
-| [Stable Baselines3](https://github.com/DLR-RM/stable-baselines3) | :x: | :x: | :x: | :x: | :x: |
-| [Ray/RLlib](https://github.com/ray-project/ray/tree/master/rllib/) | :x: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :x: |
-| [DI-engine](https://github.com/opendilab/DI-engine/) | :x: | :heavy_check_mark: | not fully supported | :heavy_check_mark: | :heavy_check_mark: |
-| [Tianshou](https://github.com/thu-ml/tianshou) | :x: | not fully supported | not fully supported | :heavy_check_mark: | :heavy_check_mark: |
-| [MARLlib](https://github.com/Replicable-MARL/MARLlib) | :x: | :heavy_check_mark: | not fully supported | :x: | :x: |
-| [MAPPO Benchmark](https://github.com/marlbenchmark/on-policy) | :x: | :heavy_check_mark: | :x: | :x: | :x: |
-| [RL4LMs](https://github.com/allenai/RL4LMs) | :heavy_check_mark: | :x: | :x: | :x: | :x: |
-| [trlx](https://github.com/CarperAI/trlx) | :heavy_check_mark: | :x: | :x: | :x: | :x: |
-| [trl](https://github.com/huggingface/trl) | :heavy_check_mark: | :x: | :x: | :x: | :x: |
-| [TimeChamber](https://github.com/inspirai/TimeChamber) | :x: | :x: | :heavy_check_mark: | :x: | :x: |
+| Library | NLP/RLHF | Multi-agent | Self-Play Training | Offline RL | [DeepSpeed](https://github.com/microsoft/DeepSpeed) |
+|:------------------------------------------------------------------:|:------------------:|:--------------------:|:--------------------:|:------------------:|:--------------------:|
+| **[OpenRL](https://github.com/OpenRL-Lab/openrl)** | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
+| [Stable Baselines3](https://github.com/DLR-RM/stable-baselines3) | :x: | :x: | :x: | :x: | :x: |
+| [Ray/RLlib](https://github.com/ray-project/ray/tree/master/rllib/) | :x: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :x: |
+| [DI-engine](https://github.com/opendilab/DI-engine/) | :x: | :heavy_check_mark: | not fully supported | :heavy_check_mark: | :x: |
+| [Tianshou](https://github.com/thu-ml/tianshou) | :x: | not fully supported | not fully supported | :heavy_check_mark: | :x: |
+| [MARLlib](https://github.com/Replicable-MARL/MARLlib) | :x: | :heavy_check_mark: | not fully supported | :x: | :x: |
+| [MAPPO Benchmark](https://github.com/marlbenchmark/on-policy) | :x: | :heavy_check_mark: | :x: | :x: | :x: |
+| [RL4LMs](https://github.com/allenai/RL4LMs) | :heavy_check_mark: | :x: | :x: | :x: | :x: |
+| [trlx](https://github.com/CarperAI/trlx) | :heavy_check_mark: | :x: | :x: | :x: | :heavy_check_mark: |
+| [trl](https://github.com/huggingface/trl) | :heavy_check_mark: | :x: | :x: | :x: | :heavy_check_mark: |
+| [TimeChamber](https://github.com/inspirai/TimeChamber) | :x: | :x: | :heavy_check_mark: | :x: | :x: |

 ## Installation

README_zh.md (+6 -5)
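@@ -51,6 +51,7 @@ OpenRL is developed on PyTorch, with the goal of providing the reinforcement learning research community with a
 - Support for offline reinforcement learning training from expert data
 - Support for self-play training
 - Support for reinforcement learning training on natural language tasks (such as dialogue tasks)
+- Support for [DeepSpeed](https://github.com/microsoft/DeepSpeed)
 - Support for the [Arena](https://openrl-docs.readthedocs.io/zh/latest/arena/index.html) feature, which makes it convenient to evaluate various agents (even agents submitted to the [JiDi platform](https://openrl-docs.readthedocs.io/zh/latest/arena/index.html#openrl)) in multi-agent competitive environments.
 - Support for importing models and data from [Hugging Face](https://huggingface.co/). Support for loading [Stable-baselines3 models](https://openrl-docs.readthedocs.io/zh/latest/sb3/index.html) from Hugging Face for testing and training.
 - Provides a [detailed tutorial](https://openrl-docs.readthedocs.io/zh/latest/custom_env/index.html) on integrating user-defined environments into OpenRL.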
@@ -128,18 +129,18 @@ OpenRL-Lab will continue to maintain and update OpenRL; everyone is welcome to join our [open-source

 Here we provide a table comparing OpenRL with other commonly used reinforcement learning libraries. OpenRL adopts a modular design and high-level abstraction, so that users can complete training for a wide range of tasks through a unified, easy-to-use interface.

-| RL library | NLP tasks/RLHF | Multi-agent training | Self-play training | Offline RL | Bilingual documentation |
+| RL library | NLP tasks/RLHF | Multi-agent training | Self-play training | Offline RL | [DeepSpeed](https://github.com/microsoft/DeepSpeed) |
 |:------------------------------------------------------------------:|:------------------:|:--------------------:|:--------------------:|:------------------:|:------------------:|
 | **[OpenRL](https://github.com/OpenRL-Lab/openrl)** | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |
 | [Stable Baselines3](https://github.com/DLR-RM/stable-baselines3) | :x: | :x: | :x: | :x: | :x: |
 | [Ray/RLlib](https://github.com/ray-project/ray/tree/master/rllib/) | :x: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :x: |
-| [DI-engine](https://github.com/opendilab/DI-engine/) | :x: | :heavy_check_mark: | not fully supported | :heavy_check_mark: | :heavy_check_mark: |
-| [Tianshou](https://github.com/thu-ml/tianshou) | :x: | not fully supported | not fully supported | :heavy_check_mark: | :heavy_check_mark: |
+| [DI-engine](https://github.com/opendilab/DI-engine/) | :x: | :heavy_check_mark: | not fully supported | :heavy_check_mark: | :x: |
+| [Tianshou](https://github.com/thu-ml/tianshou) | :x: | not fully supported | not fully supported | :heavy_check_mark: | :x: |
 | [MARLlib](https://github.com/Replicable-MARL/MARLlib) | :x: | :heavy_check_mark: | not fully supported | :x: | :x: |
 | [MAPPO Benchmark](https://github.com/marlbenchmark/on-policy) | :x: | :heavy_check_mark: | :x: | :x: | :x: |
 | [RL4LMs](https://github.com/allenai/RL4LMs) | :heavy_check_mark: | :x: | :x: | :x: | :x: |
-| [trlx](https://github.com/CarperAI/trlx) | :heavy_check_mark: | :x: | :x: | :x: | :x: |
-| [trl](https://github.com/huggingface/trl) | :heavy_check_mark: | :x: | :x: | :x: | :x: |
+| [trlx](https://github.com/CarperAI/trlx) | :heavy_check_mark: | :x: | :x: | :x: | :heavy_check_mark: |
+| [trl](https://github.com/huggingface/trl) | :heavy_check_mark: | :x: | :x: | :x: | :heavy_check_mark: |
 | [TimeChamber](https://github.com/inspirai/TimeChamber) | :x: | :x: | :heavy_check_mark: | :x: | :x: |

 ## Installation
