Commit 6af0601

j0m0k0 and araffin authored
Update LunarLander and LunarLanderContinuous Environments from v2 to v3 in the Documentation (#2143)
* Update environment versions in the examples documentation: version 2 of the LunarLander and LunarLanderContinuous environments is deprecated by gymnasium.
* Update integrations.rst to support version 3 of LunarLander
* Update changelog.rst to include documentation changes for the LunarLander and LunarLanderContinuous env versions
* Update docs/misc/changelog.rst
* Fix for newer mypy version
* Downgrade ale-py for gymnasium<1

Co-authored-by: Antonin RAFFIN <[email protected]>
1 parent ef03d33 commit 6af0601
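
For context on the motivation: recent gymnasium releases register the v3 LunarLander IDs, which is why the docs move away from v2. A quick, illustrative check (not part of this commit; assumes gymnasium is installed, and the exact release in which v2 disappears may vary):

    import gymnasium as gym

    # Recent gymnasium releases register the v3 ID; the v2 ID is deprecated/removed,
    # so gym.make("LunarLander-v2") warns or fails there.
    print("LunarLander-v3" in gym.registry)  # True on recent gymnasium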

File tree: 5 files changed (+10, -8 lines)


.github/workflows/ci.yml

Lines changed: 2 additions & 1 deletion
@@ -49,7 +49,8 @@ jobs:
         run: |
           uv pip install --system gymnasium==${{ matrix.gymnasium-version }}
           uv pip install --system "numpy<2"
-        # Only run for python 3.10, downgrade gym to 0.29.1, numpy<2
+          uv pip install --system "ale-py==0.10.1"
+        # Only run for python 3.10, downgrade gym to 0.29.1, numpy<2, ale-py==0.10.1
         if: matrix.gymnasium-version != '1.0.0'
       - name: Lint with ruff
         run: |

docs/guide/examples.rst

Lines changed: 3 additions & 3 deletions
@@ -71,7 +71,7 @@ In the following example, we will train, save and load a DQN model on the Lunar


 # Create environment
-env = gym.make("LunarLander-v2", render_mode="rgb_array")
+env = gym.make("LunarLander-v3", render_mode="rgb_array")

 # Instantiate the agent
 model = DQN("MlpPolicy", env, verbose=1)
@@ -289,7 +289,7 @@ If your callback returns False, training is aborted early.
 os.makedirs(log_dir, exist_ok=True)

 # Create and wrap the environment
-env = gym.make("LunarLanderContinuous-v2")
+env = gym.make("LunarLanderContinuous-v3")
 env = Monitor(env, log_dir)

 # Add some action noise for exploration
@@ -816,7 +816,7 @@ Bonus: Make a GIF of a Trained Agent

 from stable_baselines3 import A2C

-model = A2C("MlpPolicy", "LunarLander-v2").learn(100_000)
+model = A2C("MlpPolicy", "LunarLander-v3").learn(100_000)

 images = []
 obs = model.env.reset()
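
The first hunk above belongs to the DQN quickstart example. As a minimal sketch of the updated train/save/load flow (assuming gymnasium[box2d] and stable-baselines3 are installed; the timestep count is illustrative, not taken from the docs):

    import gymnasium as gym
    from stable_baselines3 import DQN

    # Create the v3 environment and train briefly
    env = gym.make("LunarLander-v3", render_mode="rgb_array")
    model = DQN("MlpPolicy", env, verbose=1)
    model.learn(total_timesteps=10_000)
    model.save("dqn_lunar")

    # Reload from disk and query the policy for a single observation
    del model
    model = DQN.load("dqn_lunar", env=env)
    obs, info = env.reset()
    action, _states = model.predict(obs, deterministic=True)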

docs/guide/integrations.rst

Lines changed: 3 additions & 3 deletions
@@ -73,11 +73,11 @@ Installation

 # Download model and save it into the logs/ folder
 # Only use TRUST_REMOTE_CODE=True with HF models that can be trusted (here the SB3 organization)
-TRUST_REMOTE_CODE=True python -m rl_zoo3.load_from_hub --algo a2c --env LunarLander-v2 -orga sb3 -f logs/
+TRUST_REMOTE_CODE=True python -m rl_zoo3.load_from_hub --algo a2c --env LunarLander-v3 -orga sb3 -f logs/
 # Test the agent
-python -m rl_zoo3.enjoy --algo a2c --env LunarLander-v2 -f logs/
+python -m rl_zoo3.enjoy --algo a2c --env LunarLander-v3 -f logs/
 # Push model, config and hyperparameters to the hub
-python -m rl_zoo3.push_to_hub --algo a2c --env LunarLander-v2 -f logs/ -orga sb3 -m "Initial commit"
+python -m rl_zoo3.push_to_hub --algo a2c --env LunarLander-v3 -f logs/ -orga sb3 -m "Initial commit"

docs/misc/changelog.rst

Lines changed: 1 addition & 0 deletions
@@ -37,6 +37,7 @@ Documentation:
 ^^^^^^^^^^^^^^
 - Clarify ``evaluate_policy`` documentation
 - Added doc about training exceeding the `total_timesteps` parameter
+- Updated ``LunarLander`` and ``LunarLanderContinuous`` environment versions to v3 (@j0m0k0)


 Release 2.6.0 (2025-03-24)

stable_baselines3/common/policies.py

Lines changed: 1 addition & 1 deletion
@@ -381,7 +381,7 @@ def predict(
         # Remove batch dimension if needed
         if not vectorized_env:
             assert isinstance(actions, np.ndarray)
-            actions = actions.squeeze(axis=0)
+            actions = actions.squeeze(axis=0)  # type: ignore[assignment]

         return actions, state  # type: ignore[return-value]
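
The policies.py hunk only adds a type-ignore comment: per the commit message it silences a complaint from a newer mypy release, and the runtime behaviour is unchanged. As a small standalone illustration of what the changed line does (not taken from the commit):

    import numpy as np

    # predict() may receive a single observation that was given a leading batch axis;
    # squeeze(axis=0) drops that axis again before the action is returned.
    batched_action = np.array([[0.25, -0.5]])       # shape (1, 2)
    single_action = batched_action.squeeze(axis=0)  # shape (2,)
    print(single_action.shape)                      # (2,)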
