Commit
Merge branch 'Farama-Foundation:master' into SB3-supersuit-bugfix
elliottower authored Jul 20, 2023
2 parents 7452542 + 6a20a32 commit 5fc45f1
Showing 12 changed files with 418 additions and 26 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/linux-test.yml
@@ -17,7 +17,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
-python-version: ['3.7', '3.8', '3.9', '3.10', '3.11']
+python-version: ['3.8', '3.9', '3.10', '3.11']
steps:
- uses: actions/checkout@v3
# - uses: openrndr/[email protected]
5 changes: 3 additions & 2 deletions .github/workflows/linux-tutorials-test.yml
@@ -18,7 +18,7 @@ jobs:
strategy:
fail-fast: false
matrix:
-python-version: ['3.7', '3.8', '3.9', '3.10', '3.11']
+python-version: ['3.8', '3.9', '3.10', '3.11']
tutorial: ['Tianshou', 'EnvironmentCreation', 'CleanRL', 'SB3/kaz', 'SB3/waterworld', 'SB3/connect_four', 'SB3/test'] # TODO: add back RLlib once it is fixed
steps:
- uses: actions/checkout@v3
@@ -33,5 +33,6 @@ jobs:
cd tutorials/${{ matrix.tutorial }}
pip install -r requirements.txt
pip uninstall -y pettingzoo
-pip install -e $root_dir
+pip install -e $root_dir[testing]
AutoROM -v
for f in *.py; do xvfb-run -a -s "-screen 0 1024x768x24" python "$f"; done
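The workflow step above runs every tutorial script in the matrix directory under `xvfb-run` and fails the job if any script errors. A rough local equivalent can be sketched in Python (a hypothetical helper for illustration — the real CI additionally wraps each script in `xvfb-run -a -s "-screen 0 1024x768x24"` for a virtual display):

```python
import glob
import subprocess
import sys


def run_tutorial_scripts(directory, wrapper=None):
    """Run every .py script in `directory`, optionally under a wrapper
    command (e.g. ["xvfb-run", "-a"]). Returns the scripts that exited
    with status 0."""
    passed = []
    for script in sorted(glob.glob(f"{directory}/*.py")):
        cmd = (wrapper or []) + [sys.executable, script]
        result = subprocess.run(cmd)  # inherits stdout/stderr, like the CI log
        if result.returncode == 0:
            passed.append(script)
    return passed
```

Unlike the shell loop, this collects results instead of aborting on the first failure; a CI-faithful variant would raise as soon as `returncode` is nonzero.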
4 changes: 2 additions & 2 deletions .pre-commit-config.yaml
@@ -50,7 +50,7 @@ repos:
rev: v3.3.2
hooks:
- id: pyupgrade
-args: ["--py37-plus"]
+args: ["--py38-plus"]
- repo: https://github.com/pycqa/pydocstyle
rev: 6.3.0
hooks:
@@ -60,7 +60,7 @@ repos:
- --explain
- --convention=google
- --count
-# TODO: Remove ignoring rules D101, D102, D103, D105
+# TODO: Remove ignoring rules D101, D102, D103, D105 (add docstrings to all public methods)
- --add-ignore=D100,D107,D101,D102,D103,D105
exclude: "__init__.py$|^pettingzoo.test|^docs"
additional_dependencies: ["tomli"]
2 changes: 1 addition & 1 deletion README.md
@@ -26,7 +26,7 @@ This does not include dependencies for all families of environments (some enviro

To install the dependencies for one family, use `pip install pettingzoo[atari]`, or use `pip install pettingzoo[all]` to install all dependencies.

-We support Python 3.7, 3.8, 3.9 and 3.10 on Linux and macOS. We will accept PRs related to Windows, but do not officially support it.
+We support Python 3.8, 3.9, 3.10 and 3.11 on Linux and macOS. We will accept PRs related to Windows, but do not officially support it.

## Getting started

2 changes: 1 addition & 1 deletion docs/content/basic_usage.md
@@ -11,7 +11,7 @@ This does not include dependencies for all families of environments (some enviro

To install the dependencies for one family, use `pip install pettingzoo[atari]`, or use `pip install pettingzoo[all]` to install all dependencies.

-We support Python 3.7, 3.8, 3.9 and 3.10 on Linux and macOS. We will accept PRs related to Windows, but do not officially support it.
+We support Python 3.8, 3.9, 3.10 and 3.11 on Linux and macOS. We will accept PRs related to Windows, but do not officially support it.

## Initializing Environments

26 changes: 26 additions & 0 deletions docs/tutorials/cleanrl/advanced_PPO.md
@@ -0,0 +1,26 @@
---
title: "CleanRL: Advanced PPO"
---

# CleanRL: Advanced PPO

This tutorial shows how to train [PPO](https://docs.cleanrl.dev/rl-algorithms/ppo/) agents on [Atari](https://pettingzoo.farama.org/environments/atari/) environments ([Parallel](https://pettingzoo.farama.org/api/parallel/)).
This is a full training script including CLI, logging and integration with [TensorBoard](https://www.tensorflow.org/tensorboard) and [WandB](https://wandb.ai/) for experiment tracking.

This tutorial is mirrored from [CleanRL](https://github.com/vwxyzjn/cleanrl)'s examples. Full documentation and experiment results can be found at [https://docs.cleanrl.dev/rl-algorithms/ppo/#ppo_pettingzoo_ma_ataripy](https://docs.cleanrl.dev/rl-algorithms/ppo/#ppo_pettingzoo_ma_ataripy).

## Environment Setup
To follow this tutorial, you will need to install the dependencies shown below. It is recommended to use a newly-created virtual environment to avoid dependency conflicts.
```{eval-rst}
.. literalinclude:: ../../../tutorials/CleanRL/requirements.txt
:language: text
```

Then, install ROMs using [AutoROM](https://github.com/Farama-Foundation/AutoROM), or specify the path to your Atari rom using the `rom_path` argument (see [Common Parameters](/environments/atari/#common-parameters)).

## Code
The following code should run without any issues. The comments are designed to help you understand how to use PettingZoo with CleanRL. If you have any questions, please feel free to ask in the [Discord server](https://discord.gg/nhvKkYa6qX), or create an issue on [CleanRL's GitHub](https://github.com/vwxyzjn/cleanrl/issues).
```{eval-rst}
.. literalinclude:: ../../../tutorials/CleanRL/cleanrl_advanced.py
:language: python
```
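The tutorial script included above drives its environments through PettingZoo's Parallel API, in which all agents submit actions simultaneously and receive per-agent rewards and termination flags each step. A minimal sketch of that interaction loop, using a toy stand-in environment (hypothetical — the real script uses Atari environments and a learned policy rather than a fixed action):

```python
import random


class ToyParallelEnv:
    """Toy stand-in mimicking the shape of PettingZoo's Parallel API
    (hypothetical; only the reset/step signatures matter here)."""

    def __init__(self, num_agents=2, max_steps=10):
        self.agents = [f"player_{i}" for i in range(num_agents)]
        self.max_steps = max_steps
        self._step = 0

    def reset(self, seed=None):
        if seed is not None:
            random.seed(seed)
        self._step = 0
        observations = {a: 0 for a in self.agents}
        infos = {a: {} for a in self.agents}
        return observations, infos

    def step(self, actions):
        self._step += 1
        done = self._step >= self.max_steps
        observations = {a: self._step for a in self.agents}
        rewards = {a: random.random() for a in self.agents}
        terminations = {a: done for a in self.agents}
        truncations = {a: False for a in self.agents}
        infos = {a: {} for a in self.agents}
        return observations, rewards, terminations, truncations, infos


def rollout(env):
    """Parallel-API interaction loop: every live agent acts each step."""
    observations, infos = env.reset(seed=42)
    total_reward = 0.0
    agents = list(env.agents)
    while agents:
        actions = {a: 0 for a in agents}  # placeholder policy
        observations, rewards, terminations, truncations, infos = env.step(actions)
        total_reward += sum(rewards.values())
        # Drop agents that have terminated or been truncated
        agents = [a for a in agents if not (terminations[a] or truncations[a])]
    return total_reward
```

The dict-keyed-by-agent convention (`observations`, `rewards`, `terminations`, `truncations`, `infos`) is what distinguishes the Parallel API from the turn-based AEC API.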
2 changes: 1 addition & 1 deletion docs/tutorials/cleanrl/implementing_PPO.md
@@ -4,7 +4,7 @@ title: "CleanRL: Implementing PPO"

# CleanRL: Implementing PPO

-This tutorial shows how to train a [PPO](https://docs.cleanrl.dev/rl-algorithms/ppo/) agennt on the [Pistonball](https://pettingzoo.farama.org/environments/butterfly/pistonball/) environment ([Parallel](https://pettingzoo.farama.org/api/parallel/)).
+This tutorial shows how to train [PPO](https://docs.cleanrl.dev/rl-algorithms/ppo/) agents on the [Pistonball](https://pettingzoo.farama.org/environments/butterfly/pistonball/) environment ([Parallel](https://pettingzoo.farama.org/api/parallel/)).

## Environment Setup
To follow this tutorial, you will need to install the dependencies shown below. It is recommended to use a newly-created virtual environment to avoid dependency conflicts.
9 changes: 6 additions & 3 deletions docs/tutorials/cleanrl/index.md
@@ -6,7 +6,9 @@ title: "CleanRL"

This tutorial shows how to use [CleanRL](https://github.com/vwxyzjn/cleanrl) to implement a training algorithm from scratch and train it on the Pistonball environment.

-* [Implementing PPO](/tutorials/cleanrl/implementing_PPO.md): _Implement and train an agent using PPO_
+* [Implementing PPO](/tutorials/cleanrl/implementing_PPO.md): _Train an agent using a simple PPO implementation_
+
+* [Advanced PPO](/tutorials/cleanrl/advanced_PPO.md): _CleanRL's official PPO example, with CLI, TensorBoard and WandB integration_


## CleanRL Overview
@@ -16,14 +18,14 @@ This tutorial shows how to use [CleanRL](https://github.com/vwxyzjn/cleanrl) to

See the [documentation](https://docs.cleanrl.dev/) for more information.

-## Official examples using PettingZoo:
+## Examples using PettingZoo:

* [PPO PettingZoo Atari example](https://docs.cleanrl.dev/rl-algorithms/ppo/#ppo_pettingzoo_ma_ataripy)


## WandB Integration

-A key feature is its tight integration with [Weights & Biases](https://wandb.ai/) (WandB): for experiment tracking, hyperparameter tuning, and benchmarking.
+A key feature is CleanRL's tight integration with [Weights & Biases](https://wandb.ai/) (WandB) for experiment tracking, hyperparameter tuning, and benchmarking.
The [Open RL Benchmark](https://github.com/openrlbenchmark/openrlbenchmark) allows users to view public leaderboards for many tasks, including videos of agents' performance across training timesteps.


@@ -38,4 +40,5 @@ The [Open RL Benchmark](https://github.com/openrlbenchmark/openrlbenchmark) allo
:caption: CleanRL
implementing_PPO
+advanced_PPO
```
12 changes: 0 additions & 12 deletions docs/tutorials/sb3/index.md
@@ -34,25 +34,13 @@ For non-visual environments, we use [MLP](https://stable-baselines3.readthedocs.
```




## Stable-Baselines Overview

[Stable-Baselines3](https://stable-baselines3.readthedocs.io/en/master/) (SB3) is a library providing reliable implementations of reinforcement learning algorithms in [PyTorch](https://pytorch.org/). It provides a clean and simple interface, giving you access to off-the-shelf state-of-the-art model-free RL algorithms. It allows training of RL agents with only a few lines of code.

For more information, see the [Stable-Baselines3 v1.0 Blog Post](https://araffin.github.io/post/sb3/)


-[//]: # (```{eval-rst})
-
-[//]: # (.. warning::)
-
-[//]: # ()
-[//]: # ( Note: SB3 is designed for single-agent RL and does not plan on natively supporting multi-agent PettingZoo environments. These tutorials are only intended for demonstration purposes, to show how SB3 can be adapted to work in multi-agent settings.)
-
-[//]: # (```)


```{figure} https://raw.githubusercontent.com/DLR-RM/stable-baselines3/master/docs/_static/img/logo.png
:alt: SB3 Logo
:width: 80%
3 changes: 1 addition & 2 deletions pyproject.toml
@@ -8,15 +8,14 @@ build-backend = "setuptools.build_meta"
name = "pettingzoo"
description = "Gymnasium for multi-agent reinforcement learning."
readme = "README.md"
-requires-python = ">= 3.7"
+requires-python = ">= 3.8"
authors = [{ name = "Farama Foundation", email = "[email protected]" }]
license = { text = "MIT License" }
keywords = ["Reinforcement Learning", "game", "RL", "AI", "gymnasium"]
classifiers = [
"Development Status :: 4 - Beta", # change to `5 - Production/Stable` when ready
"License :: OSI Approved :: MIT License",
"Programming Language :: Python :: 3",
-"Programming Language :: Python :: 3.7",
"Programming Language :: Python :: 3.8",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",