I generated a pursuit game on a 10 × 10 map. There is one predator controlled by my own A2C model and two prey controlled by a random actor. During training, the predator's total reward per episode converges to zero and never rises above zero. Does this mean the predator never chooses to attack any prey? How can the predator get a positive reward?
You can look at how the reward for the predator is defined. As I remember, predators should get a positive reward when they attack. Also, when they surround the prey, they can attack it and get a positive reward. I'm not sure my understanding is correct, but I hope this helps.
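One way to check whether the predator ever receives a positive reward is to log its per-step rewards and split them into positive and negative parts, rather than only watching the episode total. Below is a minimal sketch in plain Python. The helper name `summarize_rewards` and the sample values are hypothetical and are not taken from the environment; the sketch assumes you already collect the predator's reward at every step of an episode.

```python
from typing import Dict, Sequence


def summarize_rewards(step_rewards: Sequence[float]) -> Dict[str, float]:
    """Split one episode's per-step rewards into positive and negative parts.

    Hypothetical helper: `step_rewards` is assumed to be the list of rewards
    the predator received at each step of a single episode.
    """
    positive = [r for r in step_rewards if r > 0]
    negative = [r for r in step_rewards if r < 0]
    return {
        "total": sum(step_rewards),
        "positive_sum": sum(positive),
        "negative_sum": sum(negative),
        "positive_steps": len(positive),
        "negative_steps": len(negative),
    }


# Made-up rewards for illustration only, not values from the real game.
episode = [0.0, -0.25, 0.0, 1.0, -0.25]
print(summarize_rewards(episode))
```

If `positive_steps` stays at zero across many episodes, the predator is never being rewarded for an attack, and a total that approaches zero from below would only mean it has learned to avoid penalties.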
With just one predator on the map, can that predator get a positive reward? In other words, can a predator attack any prey when it is the only predator on the map?