This repository was archived by the owner on Dec 11, 2022. It is now read-only.

Commit 2697142

Release 1.0.0 (#382)

* Updating README
* Shortening test cycles

1 parent 718597c

5 files changed (+46, -38 lines)

.circleci/config.yml (+10, -9)
@@ -731,18 +731,19 @@ workflows:
       - functional_tests:
           requires:
             - build_base
-      - functional_test_doom:
-          requires:
-            - build_doom_env
-            - functional_tests
-      - functional_test_mujoco:
-          requires:
-            - build_mujoco_env
-            - functional_test_doom
+      # - functional_test_doom:
+      #     requires:
+      #       - build_doom_env
+      #       - functional_tests
+      # - functional_test_mujoco:
+      #     requires:
+      #       - build_mujoco_env
+      #       - functional_test_doom
       - golden_test_gym:
           requires:
             - build_gym_env
-            - functional_test_mujoco
+            # - functional_test_mujoco
+            - functional_tests
       - golden_test_doom:
           requires:
             - build_doom_env
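This hunk is the "Shortening test cycles" half of the commit: the Doom and MuJoCo functional tests are commented out of the workflow, so `golden_test_gym` now gates on `functional_tests` directly instead of waiting for the whole doom-then-mujoco chain. A rough sketch of what that does to the longest `requires` chain, hard-coding only the edges visible in the hunk (illustrative Python, not repo code):

```python
# Compare the deepest requires-chain before and after this commit,
# using only the job dependencies shown in the diff above.
def longest_chain(requires, job):
    """Number of jobs on the deepest dependency chain ending at `job`."""
    deps = requires.get(job, [])
    return 1 + max((longest_chain(requires, d) for d in deps), default=0)

before = {
    "functional_tests": ["build_base"],
    "functional_test_doom": ["build_doom_env", "functional_tests"],
    "functional_test_mujoco": ["build_mujoco_env", "functional_test_doom"],
    "golden_test_gym": ["build_gym_env", "functional_test_mujoco"],
}
after = {
    "functional_tests": ["build_base"],
    "golden_test_gym": ["build_gym_env", "functional_tests"],
}

print(longest_chain(before, "golden_test_gym"))  # 5 jobs deep
print(longest_chain(after, "golden_test_gym"))   # 3 jobs deep
```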

CONTRIBUTING.md (+1, -1)
@@ -54,7 +54,7 @@ Coach is released as two pypi packages:
 
 Each pypi package release has a GitHub release and tag with the same version number. The numbers are of the X.Y.Z format, where
 
-X - zero in the near future, may change when Coach is feature complete
+X - currently one, will be incremented on major API changes
 
 Y - major releases with new features
 
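Read together with the setup.py change below (0.12.1 to 1.0.0), the new wording makes X a live major-version digit rather than a placeholder zero. A minimal sketch of how such X.Y.Z strings order, assuming plain integer components (a hypothetical helper, not Coach tooling):

```python
# X.Y.Z versions compare correctly once parsed as integer tuples.
def parse(version):
    return tuple(int(part) for part in version.split("."))

assert parse("1.0.0") > parse("0.12.1")   # the bump made in this commit
assert parse("0.12.1") > parse("0.12.0")  # patch releases sort above older ones
```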

README.md (+32, -27)
@@ -29,20 +29,23 @@ coach -p CartPole_DQN -r
 * [Release 0.9.0](https://ai.intel.com/reinforcement-learning-coach-carla-qr-dqn/)
 * [Release 0.10.0](https://ai.intel.com/introducing-reinforcement-learning-coach-0-10-0/)
 * [Release 0.11.0](https://ai.intel.com/rl-coach-data-science-at-scale)
-* Release 0.12.0 (current release)
+* [Release 0.12.0](https://github.com/NervanaSystems/coach/releases/tag/v0.12.0)
+* Release 1.0.0 (current release)
 
-Contacting the Coach development team is also possible through the email [coach@intel.com](coach@intel.com)
+Contacting the Coach development team is also possible over [email](mailto:coach@intel.com)
 
 ## Table of Contents
 
 - [Coach](#coach)
-  * [Overview](#overview)
   * [Benchmarks](#benchmarks)
-  * [Documentation](#documentation)
   * [Installation](#installation)
-  * [Usage](#usage)
-    + [Running Coach](#running-coach)
-    + [Running Coach Dashboard (Visualization)](#running-coach-dashboard-visualization)
+  * [Getting Started](#getting-started)
+  * [Tutorials and Documentation](#tutorials-and-documentation)
+  * [Basic Usage](#basic-usage)
+    * [Running Coach](#running-coach)
+    * [Running Coach Dashboard (Visualization)](#running-coach-dashboard-visualization)
+  * [Distributed Multi-Node Coach](#distributed-multi-node-coach)
+  * [Batch Reinforcement Learning](#batch-reinforcement-learning)
   * [Supported Environments](#supported-environments)
   * [Supported Algorithms](#supported-algorithms)
   * [Citation](#citation)
@@ -52,13 +55,6 @@ Contacting the Coach development team is also possible through the email [coach@
 
 One of the main challenges when building a research project, or a solution based on a published algorithm, is getting a concrete and reliable baseline that reproduces the algorithm's results, as reported by its authors. To address this problem, we are releasing a set of [benchmarks](benchmarks) that shows Coach reliably reproduces many state of the art algorithm results.
 
-## Documentation
-
-Framework documentation, algorithm description and instructions on how to contribute a new agent/environment can be found [here](https://nervanasystems.github.io/coach/).
-
-Jupyter notebooks demonstrating how to run Coach from command line or as a library, implement an algorithm, or integrate an environment can be found [here](https://github.com/NervanaSystems/coach/tree/master/tutorials).
-
-
 ## Installation
 
 Note: Coach has only been tested on Ubuntu 16.04 LTS, and with Python 3.5.
@@ -113,9 +109,16 @@ If a GPU is present, Coach's pip package will install tensorflow-gpu, by default
 
 In addition to OpenAI Gym, several other environments were tested and are supported. Please follow the instructions in the Supported Environments section below in order to install more environments.
 
-## Usage
+## Getting Started
+
+### Tutorials and Documentation
+[Jupyter notebooks demonstrating how to run Coach from command line or as a library, implement an algorithm, or integrate an environment](https://github.com/NervanaSystems/coach/tree/master/tutorials).
+
+[Framework documentation, algorithm description and instructions on how to contribute a new agent/environment](https://nervanasystems.github.io/coach/).
+
+### Basic Usage
 
-### Running Coach
+#### Running Coach
 
 To allow reproducing results in Coach, we defined a mechanism called _preset_.
 There are several available presets under the `presets` directory.
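The preset mechanism itself is unchanged by the renaming above; presets are still invoked as `coach -p <preset>`, as in the `coach -p CartPole_DQN -r` example visible in the first hunk header. A minimal sketch of driving that CLI from Python (assumes `rl-coach` is installed; `-r` enables rendering per the README's own example):

```python
import subprocess

# Launch the CartPole_DQN preset with rendering, mirroring the README's
# `coach -p CartPole_DQN -r` example.
subprocess.run(["coach", "-p", "CartPole_DQN", "-r"], check=True)
```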
@@ -167,17 +170,7 @@ It is easy to create new presets for different levels or environments by followi
 
 More usage examples can be found [here](https://github.com/NervanaSystems/coach/blob/master/tutorials/0.%20Quick%20Start%20Guide.ipynb).
 
-### Distributed Multi-Node Coach
-
-As of release 0.11.0, Coach supports horizontal scaling for training RL agents on multiple nodes. In release 0.11.0 this was tested on the ClippedPPO and DQN agents.
-For usage instructions please refer to the documentation [here](https://nervanasystems.github.io/coach/dist_usage.html).
-
-### Batch Reinforcement Learning
-
-Training and evaluating an agent from a dataset of experience, where no simulator is available, is supported in Coach.
-There are [example](https://github.com/NervanaSystems/coach/blob/master/rl_coach/presets/CartPole_DDQN_BatchRL.py) [presets](https://github.com/NervanaSystems/coach/blob/master/rl_coach/presets/Acrobot_DDQN_BCQ_BatchRL.py) and a [tutorial](https://github.com/NervanaSystems/coach/blob/master/tutorials/4.%20Batch%20Reinforcement%20Learning.ipynb).
-
-### Running Coach Dashboard (Visualization)
+#### Running Coach Dashboard (Visualization)
 Training an agent to solve an environment can be tricky, at times.
 
 In order to debug the training process, Coach outputs several signals, per trained algorithm, in order to track algorithmic performance.
@@ -195,6 +188,17 @@ dashboard
 <img src="img/dashboard.gif" alt="Coach Design" style="width: 800px;"/>
 
 
+### Distributed Multi-Node Coach
+
+As of release 0.11.0, Coach supports horizontal scaling for training RL agents on multiple nodes. In release 0.11.0 this was tested on the ClippedPPO and DQN agents.
+For usage instructions please refer to the documentation [here](https://nervanasystems.github.io/coach/dist_usage.html).
+
+### Batch Reinforcement Learning
+
+Training and evaluating an agent from a dataset of experience, where no simulator is available, is supported in Coach.
+There are [example](https://github.com/NervanaSystems/coach/blob/master/rl_coach/presets/CartPole_DDQN_BatchRL.py) [presets](https://github.com/NervanaSystems/coach/blob/master/rl_coach/presets/Acrobot_DDQN_BCQ_BatchRL.py) and a [tutorial](https://github.com/NervanaSystems/coach/blob/master/tutorials/4.%20Batch%20Reinforcement%20Learning.ipynb).
+
+
 ## Supported Environments
 
 * *OpenAI Gym:*
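The Distributed Multi-Node and Batch RL sections are moved below the Dashboard section verbatim rather than rewritten. For the Batch RL workflow they describe, the linked example presets run through the same CLI; a hedged sketch, assuming `rl-coach` is installed and using the preset names taken from the links above:

```python
import subprocess

# Train and evaluate from a stored dataset using the Batch RL example
# presets referenced above (no simulator interaction during training).
for preset in ("CartPole_DDQN_BatchRL", "Acrobot_DDQN_BCQ_BatchRL"):
    subprocess.run(["coach", "-p", preset], check=True)
```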
@@ -285,6 +289,7 @@ dashboard
 * [Generalized Advantage Estimation (GAE)](https://arxiv.org/abs/1506.02438) ([code](rl_coach/agents/actor_critic_agent.py#L86))
 * [Sample Efficient Actor-Critic with Experience Replay (ACER)](https://arxiv.org/abs/1611.01224) | **Multi Worker Single Node** ([code](rl_coach/agents/acer_agent.py))
 * [Soft Actor-Critic (SAC)](https://arxiv.org/abs/1801.01290) ([code](rl_coach/agents/soft_actor_critic_agent.py))
+* [Twin Delayed Deep Deterministic Policy Gradient](https://arxiv.org/pdf/1802.09477.pdf) ([code](rl_coach/agents/td3_agent.py))
 
 ### General Agents
 * [Direct Future Prediction (DFP)](https://arxiv.org/abs/1611.01779) | **Multi Worker Single Node** ([code](rl_coach/agents/dfp_agent.py))

rl_coach/tests/pytest.ini (+2)
@@ -5,3 +5,5 @@ markers =
     integration_test: long test that checks that the complete framework is running correctly
 filterwarnings =
     ignore::DeprecationWarning
+norecursedirs =
+    *mxnet*
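The new `norecursedirs` entry keeps pytest from descending into any directory whose basename matches `*mxnet*` during test collection; pytest matches these patterns with fnmatch-style globbing. A quick illustration of the pattern's behavior (the directory names here are made up):

```python
from fnmatch import fnmatch

# pytest applies norecursedirs patterns to directory basenames.
for name in ["agents", "mxnet", "architectures_mxnet", "memories"]:
    print(name, "skipped" if fnmatch(name, "*mxnet*") else "collected")
```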

setup.py (+1, -1)
@@ -85,7 +85,7 @@
 
 setup(
     name='rl-coach' if not slim_package else 'rl-coach-slim',
-    version='0.12.1',
+    version='1.0.0',
     description='Reinforcement Learning Coach enables easy experimentation with state of the art Reinforcement Learning algorithms.',
     url='https://github.com/NervanaSystems/coach',
     author='Intel AI Lab',
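The only change here is the version bump; the `slim_package` switch that picks between the `rl-coach` and `rl-coach-slim` distribution names is untouched. A small post-install sanity check, sketched under the assumption of Python 3.8+ for `importlib.metadata`:

```python
from importlib.metadata import version, PackageNotFoundError

# Whichever of the two distributions is installed should report the
# version set in setup.py (expected: 1.0.0 after this release).
for dist in ("rl-coach", "rl-coach-slim"):
    try:
        print(dist, version(dist))
    except PackageNotFoundError:
        pass  # the other flavor is not installed
```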
