Open
Description
I implemented the algorithm in another RL library by adopting this code. However, when I tried it on other tasks (e.g., Pendulum, CartPole-v1) with the observation inputs left unchanged, the performance was unbelievably bad. It seems the preprocessing plays an indispensable role in this algorithm, which should not be the case.
I wonder if you have tried this algorithm on other generic tasks.