Commit
Merge branch 'Farama-Foundation:master' into SB3-supersuit-bugfix
elliottower authored Jul 20, 2023
2 parents 7452542 + 6a20a32 commit 5fc45f1
Showing 12 changed files with 418 additions and 26 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/linux-test.yml
@@ -17,7 +17,7 @@ jobs:
runs-on: ubuntu-latest
strategy:
matrix:
-python-version: ['3.7', '3.8', '3.9', '3.10', '3.11']
+python-version: ['3.8', '3.9', '3.10', '3.11']
steps:
- uses: actions/checkout@v3
# - uses: openrndr/[email protected]
5 changes: 3 additions & 2 deletions .github/workflows/linux-tutorials-test.yml
@@ -18,7 +18,7 @@ jobs:
strategy:
fail-fast: false
matrix:
-python-version: ['3.7', '3.8', '3.9', '3.10', '3.11']
+python-version: ['3.8', '3.9', '3.10', '3.11']
tutorial: ['Tianshou', 'EnvironmentCreation', 'CleanRL', 'SB3/kaz', 'SB3/waterworld', 'SB3/connect_four', 'SB3/test'] # TODO: add back RLlib once it is fixed
steps:
- uses: actions/checkout@v3
@@ -33,5 +33,6 @@ jobs:
cd tutorials/${{ matrix.tutorial }}
pip install -r requirements.txt
pip uninstall -y pettingzoo
-pip install -e $root_dir
+pip install -e $root_dir[testing]
AutoROM -v
for f in *.py; do xvfb-run -a -s "-screen 0 1024x768x24" python "$f"; done
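The workflow step above runs every tutorial script in the matrix directory under `xvfb-run` and fails the job if any script errors. A rough local equivalent can be sketched in Python (a hypothetical helper for illustration — the real CI additionally wraps each script in `xvfb-run -a -s "-screen 0 1024x768x24"` for a virtual display):

```python
import glob
import subprocess
import sys


def run_tutorial_scripts(directory, wrapper=None):
    """Run every .py script in `directory`, optionally under a wrapper
    command (e.g. ["xvfb-run", "-a"]). Returns the scripts that exited
    with status 0."""
    passed = []
    for script in sorted(glob.glob(f"{directory}/*.py")):
        cmd = (wrapper or []) + [sys.executable, script]
        result = subprocess.run(cmd)  # inherits stdout/stderr, like the CI log
        if result.returncode == 0:
            passed.append(script)
    return passed
```

Unlike the shell loop, this collects results instead of aborting on the first failure; a CI-faithful variant would raise as soon as `returncode` is nonzero.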
4 changes: 2 additions & 2 deletions .pre-commit-config.yaml
@@ -50,7 +50,7 @@ repos:
rev: v3.3.2
hooks:
- id: pyupgrade
-args: ["--py37-plus"]
+args: ["--py38-plus"]
- repo: https://github.com/pycqa/pydocstyle
rev: 6.3.0
hooks:
@@ -60,7 +60,7 @@ repos:
- --explain
- --convention=google
- --count
-# TODO: Remove ignoring rules D101, D102, D103, D105
+# TODO: Remove ignoring rules D101, D102, D103, D105 (add docstrings to all public methods)
- --add-ignore=D100,D107,D101,D102,D103,D105
exclude: "__init__.py$|^pettingzoo.test|^docs"
additional_dependencies: ["tomli"]
2 changes: 1 addition & 1 deletion README.md
@@ -26,7 +26,7 @@ This does not include dependencies for all families of environments (some enviro

To install the dependencies for one family, use `pip install pettingzoo[atari]`, or use `pip install pettingzoo[all]` to install all dependencies.

-We support Python 3.7, 3.8, 3.9 and 3.10 on Linux and macOS. We will accept PRs related to Windows, but do not officially support it.
+We support Python 3.8, 3.9, 3.10 and 3.11 on Linux and macOS. We will accept PRs related to Windows, but do not officially support it.

## Getting started

2 changes: 1 addition & 1 deletion docs/content/basic_usage.md
@@ -11,7 +11,7 @@ This does not include dependencies for all families of environments (some enviro

To install the dependencies for one family, use `pip install pettingzoo[atari]`, or use `pip install pettingzoo[all]` to install all dependencies.

-We support Python 3.7, 3.8, 3.9 and 3.10 on Linux and macOS. We will accept PRs related to Windows, but do not officially support it.
+We support Python 3.8, 3.9, 3.10 and 3.11 on Linux and macOS. We will accept PRs related to Windows, but do not officially support it.

## Initializing Environments

26 changes: 26 additions & 0 deletions docs/tutorials/cleanrl/advanced_PPO.md
@@ -0,0 +1,26 @@
---
title: "CleanRL: Advanced PPO"
---

# CleanRL: Advanced PPO

This tutorial shows how to train [PPO](https://docs.cleanrl.dev/rl-algorithms/ppo/) agents on [Atari](https://pettingzoo.farama.org/environments/atari/) environments ([Parallel](https://pettingzoo.farama.org/api/parallel/)).
This is a full training script including CLI, logging and integration with [TensorBoard](https://www.tensorflow.org/tensorboard) and [WandB](https://wandb.ai/) for experiment tracking.

This tutorial is mirrored from [CleanRL](https://github.com/vwxyzjn/cleanrl)'s examples. Full documentation and experiment results can be found at [https://docs.cleanrl.dev/rl-algorithms/ppo/#ppo_pettingzoo_ma_ataripy](https://docs.cleanrl.dev/rl-algorithms/ppo/#ppo_pettingzoo_ma_ataripy).

## Environment Setup
To follow this tutorial, you will need to install the dependencies shown below. It is recommended to use a newly-created virtual environment to avoid dependency conflicts.
```{eval-rst}
.. literalinclude:: ../../../tutorials/CleanRL/requirements.txt
:language: text
```

Then, install ROMs using [AutoROM](https://github.com/Farama-Foundation/AutoROM), or specify the path to your Atari rom using the `rom_path` argument (see [Common Parameters](/environments/atari/#common-parameters)).

## Code
The following code should run without any issues. The comments are designed to help you understand how to use PettingZoo with CleanRL. If you have any questions, please feel free to ask in the [Discord server](https://discord.gg/nhvKkYa6qX), or create an issue on [CleanRL's GitHub](https://github.com/vwxyzjn/cleanrl/issues).
```{eval-rst}
.. literalinclude:: ../../../tutorials/CleanRL/cleanrl_advanced.py
:language: python
```
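The tutorial script included above drives its environments through PettingZoo's Parallel API, in which all agents submit actions simultaneously and receive per-agent rewards and termination flags each step. A minimal sketch of that interaction loop, using a toy stand-in environment (hypothetical — the real script uses Atari environments and a learned policy rather than a fixed action):

```python
import random


class ToyParallelEnv:
    """Toy stand-in mimicking the shape of PettingZoo's Parallel API
    (hypothetical; only the reset/step signatures matter here)."""

    def __init__(self, num_agents=2, max_steps=10):
        self.agents = [f"player_{i}" for i in range(num_agents)]
        self.max_steps = max_steps
        self._step = 0

    def reset(self, seed=None):
        if seed is not None:
            random.seed(seed)
        self._step = 0
        observations = {a: 0 for a in self.agents}
        infos = {a: {} for a in self.agents}
        return observations, infos

    def step(self, actions):
        self._step += 1
        done = self._step >= self.max_steps
        observations = {a: self._step for a in self.agents}
        rewards = {a: random.random() for a in self.agents}
        terminations = {a: done for a in self.agents}
        truncations = {a: False for a in self.agents}
        infos = {a: {} for a in self.agents}
        return observations, rewards, terminations, truncations, infos


def rollout(env):
    """Parallel-API interaction loop: every live agent acts each step."""
    observations, infos = env.reset(seed=42)
    total_reward = 0.0
    agents = list(env.agents)
    while agents:
        actions = {a: 0 for a in agents}  # placeholder policy
        observations, rewards, terminations, truncations, infos = env.step(actions)
        total_reward += sum(rewards.values())
        # Drop agents that have terminated or been truncated
        agents = [a for a in agents if not (terminations[a] or truncations[a])]
    return total_reward
```

The dict-keyed-by-agent convention (`observations`, `rewards`, `terminations`, `truncations`, `infos`) is what distinguishes the Parallel API from the turn-based AEC API.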
2 changes: 1 addition & 1 deletion docs/tutorials/cleanrl/implementing_PPO.md
@@ -4,7 +4,7 @@ title: "CleanRL: Implementing PPO"

# CleanRL: Implementing PPO

-This tutorial shows how to train a [PPO](https://docs.cleanrl.dev/rl-algorithms/ppo/) agennt on the [Pistonball](https://pettingzoo.farama.org/environments/butterfly/pistonball/) environment ([Parallel](https://pettingzoo.farama.org/api/parallel/)).
+This tutorial shows how to train [PPO](https://docs.cleanrl.dev/rl-algorithms/ppo/) agents on the [Pistonball](https://pettingzoo.farama.org/environments/butterfly/pistonball/) environment ([Parallel](https://pettingzoo.farama.org/api/parallel/)).

## Environment Setup
To follow this tutorial, you will need to install the dependencies shown below. It is recommended to use a newly-created virtual environment to avoid dependency conflicts.
9 changes: 6 additions & 3 deletions docs/tutorials/cleanrl/index.md
@@ -6,7 +6,9 @@ title: "CleanRL"

This tutorial shows how to use [CleanRL](https://github.com/vwxyzjn/cleanrl) to implement a training algorithm from scratch and train it on the Pistonball environment.

-* [Implementing PPO](/tutorials/cleanrl/implementing_PPO.md): _Implement and train an agent using PPO_
+* [Implementing PPO](/tutorials/cleanrl/implementing_PPO.md): _Train an agent using a simple PPO implementation_
+
+* [Advanced PPO](/tutorials/cleanrl/advanced_PPO.md): _CleanRL's official PPO example, with CLI, TensorBoard and WandB integration_


## CleanRL Overview
@@ -16,14 +18,14 @@ This tutorial shows how to use [CleanRL](https://github.com/vwxyzjn/cleanrl) to

See the [documentation](https://docs.cleanrl.dev/) for more information.

-## Official examples using PettingZoo:
+## Examples using PettingZoo:

* [PPO PettingZoo Atari example](https://docs.cleanrl.dev/rl-algorithms/ppo/#ppo_pettingzoo_ma_ataripy)


## WandB Integration

-A key feature is its tight integration with [Weights & Biases](https://wandb.ai/) (WandB): for experiment tracking, hyperparameter tuning, and benchmarking.
+A key feature is CleanRL's tight integration with [Weights & Biases](https://wandb.ai/) (WandB) for experiment tracking, hyperparameter tuning, and benchmarking.
The [Open RL Benchmark](https://github.com/openrlbenchmark/openrlbenchmark) allows users to view public leaderboards for many tasks, including videos of agents' performance across training timesteps.


@@ -38,4 +40,5 @@ The [Open RL Benchmark](https://github.com/openrlbenchmark/openrlbenchmark) allo
:caption: CleanRL
implementing_PPO
+advanced_PPO
```
12 changes: 0 additions & 12 deletions docs/tutorials/sb3/index.md
@@ -34,25 +34,13 @@ For non-visual environments, we use [MLP](https://stable-baselines3.readthedocs.
```




## Stable-Baselines Overview

[Stable-Baselines3](https://stable-baselines3.readthedocs.io/en/master/) (SB3) is a library providing reliable implementations of reinforcement learning algorithms in [PyTorch](https://pytorch.org/). It provides a clean and simple interface, giving you access to off-the-shelf state-of-the-art model-free RL algorithms. It allows training of RL agents with only a few lines of code.

For more information, see the [Stable-Baselines3 v1.0 Blog Post](https://araffin.github.io/post/sb3/)


-[//]: # (```{eval-rst})
-
-[//]: # (.. warning::)
-
-[//]: # ()
-[//]: # ( Note: SB3 is designed for single-agent RL and does not plan on natively supporting multi-agent PettingZoo environments. These tutorials are only intended for demonstration purposes, to show how SB3 can be adapted to work in multi-agent settings.)
-
-[//]: # (```)


```{figure} https://raw.githubusercontent.com/DLR-RM/stable-baselines3/master/docs/_static/img/logo.png
:alt: SB3 Logo
:width: 80%
3 changes: 1 addition & 2 deletions pyproject.toml
@@ -8,15 +8,14 @@ build-backend = "setuptools.build_meta"
name = "pettingzoo"
description = "Gymnasium for multi-agent reinforcement learning."
readme = "README.md"
-requires-python = ">= 3.7"
+requires-python = ">= 3.8"
authors = [{ name = "Farama Foundation", email = "[email protected]" }]
license = { text = "MIT License" }
keywords = ["Reinforcement Learning", "game", "RL", "AI", "gymnasium"]
classifiers = [
"Development Status :: 4 - Beta", # change to `5 - Production/Stable` when ready
"License :: OSI Approved :: MIT License",
"Programming Language :: Python :: 3",
-"Programming Language :: Python :: 3.7",
"Programming Language :: Python :: 3.8",
"Programming Language :: Python :: 3.9",
"Programming Language :: Python :: 3.10",