Commit 03d72d5

Fix missing references, update changelog
1 parent b1b247b commit 03d72d5

3 files changed, +10 -5 lines changed

docs/misc/changelog.rst

Lines changed: 2 additions & 1 deletion
@@ -38,6 +38,7 @@ Documentation:
 ^^^^^^^^^^^^^^
 - Added Decisions and Dragons to resources. (@jmacglashan)
 - Updated PyBullet example, now compatible with Gymnasium
+- Added link to policies for ``policy_kwargs`` parameter (@kplers)
 
 Release 2.4.0 (2024-11-18)
 --------------------------
@@ -1738,4 +1739,4 @@ And all the contributors:
 @DavyMorgan @luizapozzobon @Bonifatius94 @theSquaredError @harveybellini @DavyMorgan @FieteO @jonasreiher @npit @WeberSamuel @troiganto
 @lutogniew @lbergmann1 @lukashass @BertrandDecoster @pseudo-rnd-thoughts @stefanbschneider @kyle-he @PatrickHelm @corentinlger
 @marekm4 @stagoverflow @rushitnshah @markscsmith @NickLucche @cschindlbeck @peteole @jak3122 @will-maclean
-@brn-dev @jmacglashan
+@brn-dev @jmacglashan @kplers

docs/modules/a2c.rst

Lines changed: 4 additions & 2 deletions
@@ -78,7 +78,7 @@ Train a A2C agent on ``CartPole-v1`` using 4 environments.
 
 A2C is meant to be run primarily on the CPU, especially when you are not using a CNN. To improve CPU utilization, try turning off the GPU and using ``SubprocVecEnv`` instead of the default ``DummyVecEnv``:
 
-.. code-block::
+.. code-block:: python
 
     from stable_baselines3 import A2C
     from stable_baselines3.common.env_util import make_vec_env
@@ -88,7 +88,7 @@ Train a A2C agent on ``CartPole-v1`` using 4 environments.
     env = make_vec_env("CartPole-v1", n_envs=8, vec_env_cls=SubprocVecEnv)
     model = A2C("MlpPolicy", env, device="cpu")
     model.learn(total_timesteps=25_000)
-
+
 For more information, see :ref:`Vectorized Environments <vec_env>`, `Issue #1245 <https://github.com/DLR-RM/stable-baselines3/issues/1245>`_ or the `Multiprocessing notebook <https://colab.research.google.com/github/Stable-Baselines-Team/rl-colab-notebooks/blob/sb3/multiprocessing_rl.ipynb>`_.
 
 
@@ -165,6 +165,8 @@ Parameters
     :inherited-members:
 
 
+.. _a2c_policies:
+
 A2C Policies
 -------------
 

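For context, a self-contained version of the multiprocessing snippet touched by this hunk is sketched below. The ``SubprocVecEnv`` import and the ``if __name__ == "__main__":`` guard are not visible in the hunk and are added here as assumptions; the guard matters because ``SubprocVecEnv`` starts worker processes, which re-import the main module on platforms that use spawn.

.. code-block:: python

    from stable_baselines3 import A2C
    from stable_baselines3.common.env_util import make_vec_env
    from stable_baselines3.common.vec_env import SubprocVecEnv  # import not shown in the hunk

    if __name__ == "__main__":  # guard assumed; protects top-level code when workers are spawned
        # 8 CartPole environments, each running in its own process
        env = make_vec_env("CartPole-v1", n_envs=8, vec_env_cls=SubprocVecEnv)
        # keep the model on CPU, as the docs recommend for non-CNN policies
        model = A2C("MlpPolicy", env, device="cpu")
        model.learn(total_timesteps=25_000)
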
docs/modules/ppo.rst

Lines changed: 4 additions & 2 deletions
@@ -92,7 +92,7 @@ Train a PPO agent on ``CartPole-v1`` using 4 environments.
 
 PPO is meant to be run primarily on the CPU, especially when you are not using a CNN. To improve CPU utilization, try turning off the GPU and using ``SubprocVecEnv`` instead of the default ``DummyVecEnv``:
 
-.. code-block::
+.. code-block:: python
 
     from stable_baselines3 import PPO
     from stable_baselines3.common.env_util import make_vec_env
@@ -102,7 +102,7 @@ Train a PPO agent on ``CartPole-v1`` using 4 environments.
     env = make_vec_env("CartPole-v1", n_envs=8, vec_env_cls=SubprocVecEnv)
     model = PPO("MlpPolicy", env, device="cpu")
    model.learn(total_timesteps=25_000)
-
+
 For more information, see :ref:`Vectorized Environments <vec_env>`, `Issue #1245 <https://github.com/DLR-RM/stable-baselines3/issues/1245#issuecomment-1435766949>`_ or the `Multiprocessing notebook <https://colab.research.google.com/github/Stable-Baselines-Team/rl-colab-notebooks/blob/sb3/multiprocessing_rl.ipynb>`_.
 
 Results
@@ -178,6 +178,8 @@ Parameters
     :inherited-members:
 
 
+.. _ppo_policies:
+
 PPO Policies
 -------------
 

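The ``.. _ppo_policies:`` label added here (like ``a2c_policies`` above) gives the ``policy_kwargs`` parameter documentation a target to link to, per the changelog entry. As a rough illustration of what that parameter does, not taken from this commit, ``policy_kwargs`` forwards keyword arguments to the policy class; the ``net_arch`` value below is an arbitrary example, not a recommendation.

.. code-block:: python

    from stable_baselines3 import PPO

    # policy_kwargs is forwarded to the policy class (see the PPO Policies section)
    model = PPO("MlpPolicy", "CartPole-v1", policy_kwargs=dict(net_arch=[64, 64]))
    model.learn(total_timesteps=10_000)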