I have a question regarding the method `reset_game` in `Base_Agent`. The first few lines read:

    def reset_game(self):
        """Resets the game information so we are ready to play a new episode"""
        self.environment.seed(self.config.seed)
        self.state = self.environment.reset()

I am concerned about the seeding. If I understand correctly, `reset_game` is called every time an episode is completed.
Assume we implement the seed method in our environment like this:
This is actually the method used in `Bit_Flipping_Environment`.
If we were to actually use `self.np_random` for resetting the environment, we would see the same initial state over and over again. At least that is the behaviour I appear to be experiencing.
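To make the concern concrete, here is a minimal sketch rather than the repository's code. It assumes a `seed` method that builds a fresh generator from the given seed, and a `reset` that draws the initial state from that generator. `ToyEnv`, its `rng` attribute, and the helper function are hypothetical names, and the standard-library `random.Random` stands in for `np_random`.

```python
import random


class ToyEnv:
    """Hypothetical environment; random.Random stands in for np_random."""

    def __init__(self, n_bits=8):
        self.n_bits = n_bits
        self.rng = random.Random()

    def seed(self, seed=None):
        # Replaces the generator, so its stream restarts from the beginning.
        self.rng = random.Random(seed)
        return [seed]

    def reset(self):
        # The initial state is drawn from the environment's own generator.
        return [self.rng.randint(0, 1) for _ in range(self.n_bits)]


def initial_states_reseeding_every_episode(env, seed, episodes):
    states = []
    for _ in range(episodes):
        env.seed(seed)  # mirrors the seed call at the top of reset_game
        states.append(env.reset())
    return states


states = initial_states_reseeding_every_episode(ToyEnv(), seed=1, episodes=5)
print(all(s == states[0] for s in states))  # True: every episode starts identically
```

Under these assumptions, reseeding with the same value before each reset makes every episode begin from the same state.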
The environments implemented in this repository, e.g. `Bit_Flipping_Environment`, seem to circumvent this issue by not using `self.np_random` at all. Instead, the `random` module is used. In fact, I don't quite understand why `np_random` is a member of `Bit_Flipping_Environment` at all.
Correct me if I'm wrong, but doesn't this make the use of seeds completely pointless (because `random` is not seeded)?
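The point about `random` not being seeded can be checked in isolation. The sketch below is only an illustration: `private_rng` is a hypothetical stand-in for an environment-owned generator such as `np_random`. It shows that seeding and using a separate generator object leaves the stream behind the module-level `random` functions unchanged.

```python
import random

# Record what the module-level generator yields right after random.seed(0).
random.seed(0)
expected = random.random()

# Start again from the same global seed, but this time seed and use a
# separate generator object before drawing from the module-level one.
random.seed(0)
private_rng = random.Random(12345)  # hypothetical environment-owned generator
private_rng.random()
observed = random.random()

print(expected == observed)  # True: the private generator has no effect on `random`
```

So an environment that seeds only its own generator but draws from the `random` module would not become reproducible through `seed`.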
I would have expected the environment's `seed` method to be called exactly once per run. Calling it once per episode simply doesn't make sense to me.
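For comparison, a sketch of the seed-once-per-run pattern described here, under the same assumptions as the issue text (the environment draws its initial state from its own generator). `ToyEnv` and `run` are hypothetical names, and `random.Random` again stands in for `np_random`.

```python
import random


class ToyEnv:
    """Hypothetical environment; random.Random stands in for np_random."""

    def __init__(self, n_bits=8):
        self.n_bits = n_bits
        self.rng = random.Random()

    def seed(self, seed=None):
        self.rng = random.Random(seed)
        return [seed]

    def reset(self):
        return [self.rng.randint(0, 1) for _ in range(self.n_bits)]


def run(seed, episodes):
    env = ToyEnv()
    env.seed(seed)  # seeded once, before the first episode
    # Each reset continues the same random stream instead of restarting it.
    return [env.reset() for _ in range(episodes)]


first_run = run(seed=1, episodes=5)
second_run = run(seed=1, episodes=5)
print(first_run == second_run)  # True: the same seed reproduces the whole run
```

Seeding once keeps whole runs reproducible while still letting initial states differ from episode to episode within a run.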
Please correct me if I misunderstood anything but this doesn't seem right to me.
Best,
Markus