Include running_mean and running_var when updating target networks (#1004)
* Include `running_mean` and `running_var` when updating target networks in DQN, SAC, TD3.
* Update stable_baselines3/common/utils.py
Co-authored-by: Antonin RAFFIN
* Precompute batch norm parameters in `_setup_model` and directly copy them in the target update.
* Fix `DictReplayBuffer.next_observations` type (#1013)
* Fix DictReplayBuffer.next_observations type
* Update changelog
Co-authored-by: Antonin RAFFIN
* Fixed missing verbose parameter passing (#1011)
Co-authored-by: Quentin Gallouédec
* Support for `device=auto` buffers and set it as default value (#1009)
* Default device is "auto" for buffer + auto device support in BufferBaseClass
* Update docstring
* Update tests
* Unify tests
* Update changelog
* Fix tests on CUDA device
Co-authored-by: Antonin RAFFIN
* Update test
* Add comments and update tests
* Bump version
* Remove one extra space to conform code style.
* Update docstrings
Co-authored-by: Antonin RAFFIN
Co-authored-by: Quentin Gallouédec
Co-authored-by: Burak Demirbilek
docs/misc/changelog.rst (+3 -2)

@@ -3,7 +3,7 @@
 Changelog
 ==========
 
-Release 1.6.1a1 (WIP)
+Release 1.6.1a2 (WIP)
 ---------------------------
 
 Breaking Changes:
@@ -23,6 +23,7 @@ Bug Fixes:
 - Fixed division by zero error when computing FPS when a small number of time has elapsed in operating systems with low-precision timers.
 - Added multidimensional action space support (@qgallouedec)
 - Fixed missing verbose parameter passing in the ``EvalCallback`` constructor (@burakdmb)
+- Fixed the issue that when updating the target network in DQN, SAC, TD3, the ``running_mean`` and ``running_var`` properties of batch norm layers are not updated (@honglu2875)
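
The fix described in this changelog entry can be illustrated with a minimal PyTorch sketch. It is not the code from this commit: the helper names `get_parameters_by_name` and `polyak_update`, the `"running_"` name filter, and the toy network are assumptions made for illustration. The underlying point is that `running_mean` and `running_var` are registered as buffers, so `parameters()` does not return them and a soft update over `parameters()` alone leaves the target network's batch norm statistics stale. The sketch collects the statistics once, as the commit message describes for `_setup_model`, and copies them over with `tau=1.0` on each target update.

```python
import torch
import torch.nn as nn


def get_parameters_by_name(model: nn.Module, included_names):
    """Collect state_dict tensors whose name contains any of the given substrings.

    The returned tensors share storage with the module's parameters and buffers,
    so they can be collected once and reused on every target update.
    """
    return [
        tensor
        for name, tensor in model.state_dict().items()
        if any(key in name for key in included_names)
    ]


def polyak_update(params, target_params, tau: float) -> None:
    """In-place soft update: target <- (1 - tau) * target + tau * source."""
    with torch.no_grad():
        for param, target_param in zip(params, target_params):
            target_param.data.mul_(1.0 - tau)
            target_param.data.add_(param.data, alpha=tau)


def make_net() -> nn.Sequential:
    # Hypothetical toy Q-network with one batch norm layer.
    return nn.Sequential(nn.Linear(4, 8), nn.BatchNorm1d(8), nn.ReLU(), nn.Linear(8, 2))


torch.manual_seed(0)
q_net, q_net_target = make_net(), make_net()
q_net_target.load_state_dict(q_net.state_dict())

# Precompute references to the batch norm running statistics (done once).
bn_stats = get_parameters_by_name(q_net, ["running_"])
bn_stats_target = get_parameters_by_name(q_net_target, ["running_"])

# A forward pass in train mode updates the online network's running statistics in place.
q_net.train()
q_net(torch.randn(32, 4) + 3.0)

# Target update: soft update for the learnable parameters,
# hard copy (tau=1.0) for the batch norm running statistics.
polyak_update(q_net.parameters(), q_net_target.parameters(), tau=0.005)
polyak_update(bn_stats, bn_stats_target, tau=1.0)
```

Copying the statistics with `tau=1.0` keeps the target network's normalization identical to the online network's. Averaging them with the same `tau` as the weights would be a different design choice, and this sketch does not do that.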