
[Feature Request] Simplify the logic of sampling from rollout_buffer #303

@G1NO3

Description

🚀 Feature

In sb3_contrib/common/recurrent/buffers.py, RecurrentRolloutBuffer._get_samples() takes a rather convoluted approach to drawing samples from the rollout buffer. A fixed-length training sequence (len=batch_size) is split into two variable-length parts, which are then padded so that both can be fed to the LSTM at the same time. The padding can introduce extra meaningless noise and makes debugging quite hard; I am working on designing a new LSTM and spent a whole night just understanding the sampling logic in your code. A more intuitive and widely used alternative is to randomly pick fixed-length sequences from the rollout buffer and not split the sampled sequences any further, as sketched below.
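For reference, here is a minimal sketch of what I mean. This is not the existing SB3-Contrib API: the function name, the array shapes, and the choice to reject windows that cross an episode boundary are all my own assumptions for illustration.

```python
import numpy as np

def sample_fixed_length_sequences(observations, episode_starts, seq_len, n_seqs, rng=None):
    """Randomly pick `n_seqs` fixed-length windows from a rollout buffer.

    `observations` has shape (buffer_size, n_envs, *obs_shape) and
    `episode_starts` has shape (buffer_size, n_envs). Windows that would
    cross an episode boundary are rejected and re-drawn, so no padding
    is ever needed.
    """
    rng = np.random.default_rng() if rng is None else rng
    buffer_size, n_envs = episode_starts.shape
    sequences = []
    while len(sequences) < n_seqs:
        env = rng.integers(n_envs)
        start = rng.integers(buffer_size - seq_len + 1)
        # An episode starting strictly inside the window would mix two
        # episodes in one LSTM sequence, so skip such windows.
        if episode_starts[start + 1 : start + seq_len, env].any():
            continue
        sequences.append(observations[start : start + seq_len, env])
    # Shape: (n_seqs, seq_len, *obs_shape) -- ready to feed to the LSTM.
    return np.stack(sequences)
```

Because every sampled sequence has the same length and never spans an episode boundary, the batch can be stacked directly, with no splitting, padding, or masking.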

Motivation

No response

Pitch

No response

Alternatives

No response

Additional context

No response

Checklist

  • I have checked that there is no similar issue in the repo
  • If I'm requesting a new feature, I have proposed alternatives

Metadata

Labels: enhancement (new feature or request)
