Hey Tao,
I am trying to implement your rationale model in PyTorch right now, and I keep running into the problem that after a couple of iterations z becomes all ones. This obviously makes the encoder quite strong, but it is not what I want.
The generator's loss function is cost(x, y, z) * log p(z|x). While the first term is large, log p(z|x) becomes zero, since the model learns to predict all ones with 100% probability and log(1) = 0. The overall loss (and its gradient) for the generator therefore becomes zero, leading to this all-ones phenomenon.
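For concreteness, here is a minimal PyTorch sketch of the collapse I am seeing (not your repo's code; the sequence length, logit value, and constant cost are made-up stand-ins). Once the generator's sigmoid saturates at p(z_t = 1 | x) ≈ 1, log p(z|x) ≈ 0 and the gradient through the saturated sigmoid vanishes as well:

```python
# Minimal sketch of the failure mode described above (assumed shapes/values,
# not the actual repo code).
import torch

torch.manual_seed(0)

seq_len = 10
# Generator output that has saturated, so p(z_t = 1 | x) is ~1 for every token.
logits = torch.full((seq_len,), 10.0, requires_grad=True)
probs = torch.sigmoid(logits)

# Sample the binary rationale mask z; with saturated probs it is all ones.
z = torch.bernoulli(probs.detach())

# log p(z|x) under independent per-token Bernoulli selections.
logpz = (z * torch.log(probs) + (1 - z) * torch.log1p(-probs)).sum()

cost = torch.tensor(5.0)  # stand-in for cost(x, y, z) from the encoder

# Generator loss: cost(x, y, z) * log p(z|x).
loss = cost * logpz
loss.backward()

print(f"z = {z.tolist()}")                             # all ones
print(f"logpz = {logpz.item():.6f}")                   # ~0, since log(1) = 0
print(f"loss = {loss.item():.6f}")                     # ~0
print(f"grad norm = {logits.grad.norm().item():.6f}")  # ~0: saturated sigmoid
```

With a sigmoid parameterization, the gradient of log p(z|x) with respect to each logit is z_t - p_t, so once p_t saturates at 1 with z_t = 1, the gradient vanishes together with logpz and the generator never leaves the all-ones solution.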
How did you address this in your code?