Open
Description
Several recent papers on policy collapse and loss of plasticity in reinforcement learning suggest that the default Adam betas in PyTorch (b1=0.9, b2=0.999) are not ideal and are pretty much arbitrary. I noticed that the same defaults are used here.
This paper suggests using b1=b2 for better results.
This is more of a discussion than an issue, though. My testing seems to agree with the paper (for reference, I used b1=b2=0.9), both on my own env and on gym envs such as CartPole. However, I do not know how relevant this is outside of RL.
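
To make the discussion a bit more concrete, here is a minimal pure-Python sketch of the standard Adam moment updates on a single scalar parameter. It compares the normalized step `m_hat / sqrt(v_hat)` for the default betas and for b1=b2=0.9. The gradient trace (a long quiet stretch followed by one spike) and the helper name `adam_step_sizes` are made up for illustration, and this is not necessarily the argument the paper makes.

```python
import math


def adam_step_sizes(grads, b1, b2, eps=1e-8):
    """Return the normalized Adam step m_hat / (sqrt(v_hat) + eps) per gradient.

    The learning rate is left out, so each value is the step in units of lr.
    """
    m = 0.0  # first-moment (mean) estimate
    v = 0.0  # second-moment (uncentered variance) estimate
    steps = []
    for t, g in enumerate(grads, start=1):
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        m_hat = m / (1 - b1 ** t)  # bias correction
        v_hat = v / (1 - b2 ** t)
        steps.append(m_hat / (math.sqrt(v_hat) + eps))
    return steps


# Hypothetical gradient trace: small gradients for a long time, then a spike.
grads = [0.01] * 1000 + [1.0]

default_steps = adam_step_sizes(grads, b1=0.9, b2=0.999)
equal_steps = adam_step_sizes(grads, b1=0.9, b2=0.9)

print("largest step, b1=0.9, b2=0.999:", max(abs(s) for s in default_steps))
print("largest step, b1=b2=0.9:       ", max(abs(s) for s in equal_steps))
```

With b1=b2, both moment estimates average over the same weights, so the normalized step cannot exceed 1 in magnitude. With the defaults, the slower second-moment estimate lags behind the spike and the step exceeds 1 on this trace. In PyTorch, the equal-betas setting would be passed as `torch.optim.Adam(params, betas=(0.9, 0.9))`.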