Description
🚀 Feature
ReplayBufferSamples, RolloutBufferSamples, DictReplayBufferSamples, and DictRolloutBufferSamples are currently NamedTuples. As such, subclassing them to add new fields is not supported. Converting them to dataclasses would make such subclassing possible.
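To illustrate the limitation, here is a minimal sketch using only the standard library. The class names and the trimmed two-field layout are hypothetical stand-ins, not the actual SB3 definitions, and plain lists stand in for tensors.

```python
from typing import NamedTuple


class RolloutSamples(NamedTuple):
    # Hypothetical, trimmed-down stand-in for a buffer sample type
    observations: list
    actions: list


class MaskedRolloutSamples(RolloutSamples):
    # This annotation is accepted by Python but does NOT become a tuple field
    action_masks: list


# The subclass still only knows about the parent's fields
subclass_fields = MaskedRolloutSamples._fields

# Passing a third value therefore fails at construction time
try:
    MaskedRolloutSamples([0.1], [1], [True])
    extra_field_accepted = True
except TypeError:
    extra_field_accepted = False
```

In this sketch `subclass_fields` contains only `observations` and `actions`, and the three-argument call raises `TypeError`, which is why a subclass that needs an extra field ends up redefining the whole tuple and silencing the type checker.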
Motivation
Some RL algorithms require additional fields from the replay/rollout buffer.
An example is action masking. SB3 Contrib implements MaskableRolloutBufferSamples but has to add type-ignore comments in several methods as a workaround for the unsupported subclassing of NamedTuples (see here). If RolloutBufferSamples were a dataclass, such a workaround would not be necessary, and the solution would be cleaner.
There are many other examples in novel research methods. This refactor would improve modularity and make it easier to implement new algorithms (e.g., in SB3 contrib).
Pitch
ReplayBufferSamples, RolloutBufferSamples, DictReplayBufferSamples, and DictRolloutBufferSamples should be refactored into dataclasses.
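A rough sketch of what this could look like, assuming a simplified field list typed as `Any` (the real classes hold tensors and may have more fields). The `frozen=True` choice is my assumption to keep the samples immutable like the current NamedTuples.

```python
from dataclasses import dataclass, fields
from typing import Any


@dataclass(frozen=True)
class RolloutBufferSamples:
    # Simplified, illustrative field list (not the full SB3 definition)
    observations: Any
    actions: Any
    advantages: Any
    returns: Any


@dataclass(frozen=True)
class MaskableRolloutBufferSamples(RolloutBufferSamples):
    # The extra field is appended after the inherited ones
    action_masks: Any


sample = MaskableRolloutBufferSamples(
    observations=[0.1, 0.2],
    actions=[1],
    advantages=[0.5],
    returns=[1.0],
    action_masks=[True, False],
)

# Inherited and new fields are all real dataclass fields
field_names = [f.name for f in fields(sample)]
```

One trade-off worth noting: dataclass instances do not support tuple unpacking or integer indexing, so any code that currently unpacks the samples positionally would need to switch to attribute access.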
Alternatives
Nothing comes to mind other than dataclasses.
Additional context
No response
Checklist
- I have checked that there is no similar issue in the repo
- If I'm requesting a new feature, I have proposed alternatives