Add link to PR

araffin · araffin · commit 5f3ed27789df · 2025-08-27T12:20:22.000+02:00
diff --git a/docs/modules/a2c.rst b/docs/modules/a2c.rst
@@ -93,7 +93,7 @@ Train a A2C agent on ``CartPole-v1`` using 4 environments.
 
 .. note::
 
-  **Using gSDE (Generalized State-Dependent Exploration) during inference:**
+  Using gSDE (Generalized State-Dependent Exploration) during inference (see `PR #1767 <https://github.com/DLR-RM/stable-baselines3/pull/1767>`_):
 
   When using A2C models trained with ``use_sde=True``, the automatic noise resetting that occurs during training (controlled by ``sde_sample_freq``) does not happen when using ``model.predict()`` for inference. This results in deterministic behavior even when ``deterministic=False``.
 
diff --git a/docs/modules/ppo.rst b/docs/modules/ppo.rst
@@ -107,7 +107,7 @@ Train a PPO agent on ``CartPole-v1`` using 4 environments.
 
 .. note::
 
-  **Using gSDE (Generalized State-Dependent Exploration) during inference:**
+  Using gSDE (Generalized State-Dependent Exploration) during inference (see `PR #1767 <https://github.com/DLR-RM/stable-baselines3/pull/1767>`_):
 
   When using PPO models trained with ``use_sde=True``, the automatic noise resetting that occurs during training (controlled by ``sde_sample_freq``) does not happen when using ``model.predict()`` for inference. This results in deterministic behavior even when ``deterministic=False``.
 
diff --git a/docs/modules/sac.rst b/docs/modules/sac.rst
@@ -95,7 +95,7 @@ This example is only to demonstrate the use of the library and its functions, an
 
 .. note::
 
-  **Using gSDE (Generalized State-Dependent Exploration) during inference:**
+  Using gSDE (Generalized State-Dependent Exploration) during inference (see `PR #1767 <https://github.com/DLR-RM/stable-baselines3/pull/1767>`_):
 
   When using SAC models trained with ``use_sde=True``, the automatic noise resetting that occurs during training (controlled by ``sde_sample_freq``) does not happen when using ``model.predict()`` for inference. This results in deterministic behavior even when ``deterministic=False``.
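
The note repeated in all three files can be illustrated with a short sketch of resetting the noise by hand during inference. It assumes a model trained with ``use_sde=True`` whose policy exposes a ``reset_noise()`` method taking the number of environments as its first positional argument. The helper name ``predict_with_sde_reset`` and its parameters are hypothetical and not part of Stable-Baselines3.

```python
from typing import Any, Optional, Tuple


def predict_with_sde_reset(
    model: Any,
    obs: Any,
    step: int,
    sde_sample_freq: int,
    n_envs: int = 1,
    deterministic: bool = False,
) -> Tuple[Any, Optional[Any]]:
    """Call model.predict(), resampling the gSDE noise every sde_sample_freq steps.

    Hypothetical helper (not part of Stable-Baselines3). It assumes that
    model.policy.reset_noise(n_envs) exists, which should only be the case
    for models trained with use_sde=True.
    """
    if sde_sample_freq > 0:
        # Resample on steps 0, k, 2k, ... where k = sde_sample_freq.
        should_reset = step % sde_sample_freq == 0
    else:
        # A non-positive frequency is treated as "sample once, at the first step".
        should_reset = step == 0

    # With deterministic=True the noise is not used, so there is nothing to reset.
    if should_reset and not deterministic:
        model.policy.reset_noise(n_envs)

    return model.predict(obs, deterministic=deterministic)
```

In an inference loop, the helper would replace the direct call to ``model.predict()``, with ``step`` being the loop counter and ``n_envs`` matching the number of environments in the vectorized env.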