This repository has been archived by the owner on Sep 2, 2024. It is now read-only.
Thank you for your fast reply.
Have you ever thought of applying DFP to actor-critic algorithms?
I am now considering extending it to work in continuous action spaces.
If that is possible, I guess it would work with the DDPG algorithm.
In my research, I have used the gradients of f with respect to the actions to
update/train the actor network.
However, I am not sure whether this works (in my implementation, it does not).
Could you give me some advice?
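For concreteness, here is a minimal sketch of the update I have in mind: a DFP-like network F(s, a) predicts future measurements for a state-action pair, and the actor is trained by ascending the gradient of the goal-weighted prediction g·F(s, actor(s)) with respect to the action, analogous to the DDPG actor update. All module names, sizes, and the goal vector below are illustrative assumptions, not actual code from the paper or this repository:

```python
# Hypothetical sketch: DDPG-style actor update driven by a DFP-like
# measurement predictor. Dimensions and architectures are placeholders.
import torch
import torch.nn as nn

STATE_DIM, ACTION_DIM, MEAS_DIM = 8, 2, 3

class Actor(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 64), nn.ReLU(),
            nn.Linear(64, ACTION_DIM), nn.Tanh())  # bounded continuous actions

    def forward(self, s):
        return self.net(s)

class MeasurementPredictor(nn.Module):
    """DFP-like critic: predicts future measurement changes for (s, a)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + ACTION_DIM, 64), nn.ReLU(),
            nn.Linear(64, MEAS_DIM))

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))

actor, predictor = Actor(), MeasurementPredictor()
opt = torch.optim.Adam(actor.parameters(), lr=1e-3)

states = torch.randn(32, STATE_DIM)        # dummy batch of states
goal = torch.tensor([1.0, 0.5, -0.5])      # goal vector g over measurements

# Actor loss: negative goal-weighted predicted measurements, so the
# gradient of F w.r.t. the action flows back into the actor (as in DDPG,
# where the critic's Q-gradient plays this role).
actions = actor(states)
loss = -(predictor(states, actions) @ goal).mean()
opt.zero_grad()
loss.backward()
opt.step()
```

In this sketch the predictor would be trained separately by regression on observed future measurements, exactly as in DFP; only the actor update is the DDPG-style addition.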
Wonchul Kim
2017-09-05 16:32 GMT+09:00 dosovits <[email protected]>:
We also quickly tried putting DFP in actor-critic, and it didn't work amazingly well. We only looked very briefly at this, though, and don't have a very good understanding of what might be going on.
When I read the paper, it says that DFP works in discrete action spaces.
Is it also possible in continuous action spaces?