Commit 9a530d7

Merge branch 'master' into fix/close-method-memory-leak
2 parents f9c53e4 + 8fccf7f commit 9a530d7

63 files changed: 578 additions and 537 deletions

.github/workflows/ci.yml

Lines changed: 5 additions & 4 deletions
@@ -20,17 +20,17 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        python-version: ["3.9", "3.10", "3.11", "3.12"]
+        python-version: ["3.10", "3.11", "3.12", "3.13"]
         include:
           # Default version
           - gymnasium-version: "1.0.0"
           # Add a new config to test gym<1.0
           - python-version: "3.10"
             gymnasium-version: "0.29.1"
     steps:
-      - uses: actions/checkout@v3
+      - uses: actions/checkout@v6
       - name: Set up Python ${{ matrix.python-version }}
-        uses: actions/setup-python@v4
+        uses: actions/setup-python@v6
         with:
           python-version: ${{ matrix.python-version }}
       - name: Install dependencies
@@ -40,7 +40,8 @@ jobs:
           pip install uv
           # cpu version of pytorch
           # See https://github.com/astral-sh/uv/issues/1497
-          uv pip install --system torch==2.3.1+cpu --index https://download.pytorch.org/whl/cpu
+          # Need Pytorch 2.9+ for Python 3.13
+          uv pip install --system torch==2.9.1+cpu --index https://download.pytorch.org/whl/cpu
 
           uv pip install --system .[extra,tests,docs]
           # Use headless version

README.md

Lines changed: 1 addition & 1 deletion
@@ -103,7 +103,7 @@ It provides a minimal number of features compared to SB3 but can be much faster
 **Note:** Stable-Baselines3 supports PyTorch >= 2.3
 
 ### Prerequisites
-Stable Baselines3 requires Python 3.9+.
+Stable Baselines3 requires Python 3.10+.
 
 #### Windows
 

docs/guide/install.rst

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ Installation
 Prerequisites
 -------------
 
-Stable-Baselines3 requires python 3.9+ and PyTorch >= 2.3
+Stable-Baselines3 requires python 3.10+ and PyTorch >= 2.3
 
 Windows
 ~~~~~~~

docs/misc/changelog.rst

Lines changed: 50 additions & 9 deletions
@@ -3,23 +3,22 @@
 Changelog
 ==========
 
-Release 2.7.1a3 (WIP)
+
+Release 2.8.0a2 (WIP)
 --------------------------
 
 Breaking Changes:
 ^^^^^^^^^^^^^^^^^
+- Removed support for Python 3.9, please upgrade to Python >= 3.10
+- Set ``strict=True`` for every call to ``zip(...)``
 
 New Features:
 ^^^^^^^^^^^^^
-- ``RolloutBuffer`` and ``DictRolloutBuffer`` now uses the actual observation / action space ``dtype`` (instead of float32), this should save memory (@Trenza1ore)
+- Added official support for Python 3.13
 
 Bug Fixes:
 ^^^^^^^^^^
-- Fixed env checker to properly handle ``Sequence`` observation spaces when nested inside composite spaces (``Dict``, ``Tuple``, ``OneOf``) (@copilot)
-- Update env checker to warn users when using Graph space (@dhruvmalik007).
-- Fixed memory leak in ``VecVideoRecorder`` where ``recorded_frames`` stayed in memory due to reference in the moviepy clip (@copilot)
-- Remove double space in `StopTrainingOnRewardThreshold` callback message (@sea-bass)
-- Add close method to BaseAlgorithm to prevent memory leaks in sequential training loops (#1966)
+- Fixed saving and loading of Torch compiled models (using ``th.compile()``) by updating ``get_parameters()``
 
 `SB3-Contrib`_
 ^^^^^^^^^^^^^^
@@ -32,9 +31,51 @@ Bug Fixes:
 
 Deprecations:
 ^^^^^^^^^^^^^
+- ``zip_strict()`` is not needed anymore since Python 3.10, please use ``zip(..., strict=True)`` instead
 
 Others:
 ^^^^^^^
+- Updated to Python 3.10+ annotations
+- Removed some unused variables (@unexploredtest)
+- Improved type hints for distributions
+- Simplified zip file loading by removing Python 3.6 workaround and enabling ``weights_only=True`` (PyTorch 2.x)
+- Sped up saving/loading tests
+
+Documentation:
+^^^^^^^^^^^^^^
+
+
+Release 2.7.1 (2025-12-05)
+--------------------------
+
+.. warning::
+
+  Stable-Baselines3 (SB3) v2.7.1 will be the last one supporting Python 3.9 (end of life in October 2025).
+  We highly recommended you to upgrade to Python >= 3.10.
+
+
+Breaking Changes:
+^^^^^^^^^^^^^^^^^
+
+New Features:
+^^^^^^^^^^^^^
+- ``RolloutBuffer`` and ``DictRolloutBuffer`` now uses the actual observation / action space ``dtype`` (instead of float32), this should save memory (@Trenza1ore)
+
+Bug Fixes:
+^^^^^^^^^^
+- Fixed env checker to properly handle ``Sequence`` observation spaces when nested inside composite spaces (``Dict``, ``Tuple``, ``OneOf``) (@copilot)
+- Update env checker to warn users when using Graph space (@dhruvmalik007).
+- Fixed memory leak in ``VecVideoRecorder`` where ``recorded_frames`` stayed in memory due to reference in the moviepy clip (@copilot)
+- Remove double space in `StopTrainingOnRewardThreshold` callback message (@sea-bass)
+- Add close method to BaseAlgorithm to prevent memory leaks in sequential training loops (#1966)
+
+`SB3-Contrib`_
+^^^^^^^^^^^^^^
+- Fixed tensorboard log name for ``MaskablePPO``
+
+`SBX`_ (SB3 + Jax)
+^^^^^^^^^^^^^^^^^^
+- Added ``CnnPolicy`` to PPO
 
 Documentation:
 ^^^^^^^^^^^^^^
@@ -47,7 +88,7 @@ Documentation:
 - Updated link to paper of community project DeepNetSlice (@AlexPasqua)
 - Added example usage of Tensorflow JS
 - Included exact versions in ONNX JS and example project
-- Made step 2 (`pip install`) of `CONTRIBUTING.md` more robust
+- Made step 2 (`pip install`) of `CONTRIBUTING.md` more robust
 
 
 Release 2.7.0 (2025-07-25)
@@ -1904,4 +1945,4 @@ And all the contributors:
 @DavyMorgan @luizapozzobon @Bonifatius94 @theSquaredError @harveybellini @DavyMorgan @FieteO @jonasreiher @npit @WeberSamuel @troiganto
 @lutogniew @lbergmann1 @lukashass @BertrandDecoster @pseudo-rnd-thoughts @stefanbschneider @kyle-he @PatrickHelm @corentinlger
 @marekm4 @stagoverflow @rushitnshah @markscsmith @NickLucche @cschindlbeck @peteole @jak3122 @will-maclean
-@brn-dev @jmacglashan @kplers @MarcDcls @chrisgao99 @pstahlhofen @akanto @Trenza1ore @JonathanColetti
+@brn-dev @jmacglashan @kplers @MarcDcls @chrisgao99 @pstahlhofen @akanto @Trenza1ore @JonathanColetti @unexploredtest

pyproject.toml

Lines changed: 2 additions & 2 deletions
@@ -1,8 +1,8 @@
 [tool.ruff]
 # Same as Black.
 line-length = 127
-# Assume Python 3.9
-target-version = "py39"
+# Assume Python 3.10
+target-version = "py310"
 
 [tool.ruff.lint]
 # See https://beta.ruff.rs/docs/rules/

setup.py

Lines changed: 2 additions & 2 deletions
@@ -135,7 +135,7 @@
     long_description=long_description,
     long_description_content_type="text/markdown",
     version=__version__,
-    python_requires=">=3.9",
+    python_requires=">=3.10",
     # PyPI package information.
     project_urls={
         "Code": "https://github.com/DLR-RM/stable-baselines3",
@@ -147,10 +147,10 @@
     },
     classifiers=[
         "Programming Language :: Python :: 3",
-        "Programming Language :: Python :: 3.9",
         "Programming Language :: Python :: 3.10",
         "Programming Language :: Python :: 3.11",
         "Programming Language :: Python :: 3.12",
+        "Programming Language :: Python :: 3.13",
     ],
 )
 

stable_baselines3/a2c/a2c.py

Lines changed: 10 additions & 10 deletions
@@ -1,4 +1,4 @@
-from typing import Any, ClassVar, Optional, TypeVar, Union
+from typing import Any, ClassVar, TypeVar
 
 import torch as th
 from gymnasium import spaces
@@ -65,9 +65,9 @@ class A2C(OnPolicyAlgorithm):
 
     def __init__(
         self,
-        policy: Union[str, type[ActorCriticPolicy]],
-        env: Union[GymEnv, str],
-        learning_rate: Union[float, Schedule] = 7e-4,
+        policy: str | type[ActorCriticPolicy],
+        env: GymEnv | str,
+        learning_rate: float | Schedule = 7e-4,
         n_steps: int = 5,
         gamma: float = 0.99,
         gae_lambda: float = 1.0,
@@ -78,15 +78,15 @@ def __init__(
         use_rms_prop: bool = True,
         use_sde: bool = False,
         sde_sample_freq: int = -1,
-        rollout_buffer_class: Optional[type[RolloutBuffer]] = None,
-        rollout_buffer_kwargs: Optional[dict[str, Any]] = None,
+        rollout_buffer_class: type[RolloutBuffer] | None = None,
+        rollout_buffer_kwargs: dict[str, Any] | None = None,
         normalize_advantage: bool = False,
         stats_window_size: int = 100,
-        tensorboard_log: Optional[str] = None,
-        policy_kwargs: Optional[dict[str, Any]] = None,
+        tensorboard_log: str | None = None,
+        policy_kwargs: dict[str, Any] | None = None,
         verbose: int = 0,
-        seed: Optional[int] = None,
-        device: Union[th.device, str] = "auto",
+        seed: int | None = None,
+        device: th.device | str = "auto",
         _init_setup_model: bool = True,
     ):
         super().__init__(
