No _loss_tracker on train_on_batch because the model is compiled multiple times. Possible bug. #20474
Comments
Any update on this bug?
Hi @TheMGGdev and @mohammad-rababah - You are getting the error because of the order of operations: the discriminator is compiled first and the combined model is compiled afterwards, which is when the problem appears. You can compile the discriminator after the combined model compile instead; that will resolve your error.
Attached a gist for your reference.
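In code, the suggested ordering looks roughly like this (a minimal sketch that assumes the discriminator and combined models from the original script; optimizer, loss, and metric choices are illustrative, not taken from the gist):

```python
# Suggested workaround (as this comment reads): compile the combined model
# first, then the discriminator, so that no later compile() call clears the
# discriminator's trainer state (its _loss_tracker in particular).
discriminator.trainable = False          # freeze the discriminator inside `combined`
combined.compile(optimizer="adam", loss="binary_crossentropy")

# Compiling the discriminator afterwards keeps its _loss_tracker intact,
# but see the next comment for why this reordering causes a different problem.
discriminator.compile(optimizer="adam",
                      loss="binary_crossentropy",
                      metrics=["accuracy"])
```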
The error has nothing to do with that. There are training setups in which you have a model that is a combination of several models and you do not want to train one of them, as in this example with GANs. Here you have a generator, a discriminator, and a combined model (generator + discriminator). When you create the combined model, which is the one you use to train the generator, you want the discriminator not to train, so you set discriminator.trainable = False. Explained more simply: the discriminator is trained on its own with its own train_on_batch calls, while the generator is trained through the combined model, inside which the discriminator must stay frozen (see the sketch after this comment).
The code works perfectly in other versions and it comes from an example in this repo: https://github.com/eriklindernoren/Keras-GAN/blob/master/cgan/cgan.py. The real problem is that in the new versions of Keras, compiling the combined model clears all the trainer metrics, so the combined compile deletes the discriminator's metrics, as explained in pull request #20473. Putting the discriminator compile after the combined compile instead of before does not solve the problem but creates another one, because now, during generator training, both the generator and the discriminator will be trained. Hope it helps :)
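For context, the intended split of responsibilities in that training loop looks like this (a simplified sketch in the style of the linked cgan.py; the generator, discriminator, and combined models are assumed to be built elsewhere, e.g. as in the reconstruction further down):

```python
import numpy as np

def train_gan_step(generator, discriminator, combined, real_imgs, latent_dim):
    """One training step in the style of the linked cgan.py example."""
    batch_size = real_imgs.shape[0]

    # 1) Train the discriminator alone, on real and on generated images.
    noise = np.random.normal(0.0, 1.0, (batch_size, latent_dim))
    fake_imgs = generator.predict(noise, verbose=0)
    d_loss_real = discriminator.train_on_batch(real_imgs, np.ones((batch_size, 1)))
    d_loss_fake = discriminator.train_on_batch(fake_imgs, np.zeros((batch_size, 1)))

    # 2) Train the generator through the combined model. Only the generator's
    #    weights are meant to change here, which is why discriminator.trainable
    #    is set to False before `combined` is compiled. Per the comment above,
    #    compiling the discriminator after `combined` instead means this step
    #    also updates the discriminator.
    g_loss = combined.train_on_batch(noise, np.ones((batch_size, 1)))
    return d_loss_real, d_loss_fake, g_loss
```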
@TheMGGdev looks like we just need a unit test to go along with your change #20473
The same GAN code works perfectly in Keras 3.3 but does not work in Keras 3.6. There is an error on train_on_batch that I believe is caused by a bug introduced by a change in Keras 3.6.
The code is this:
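(The full script from the report is not included above; the following is a minimal reconstruction following the structure of the linked cgan.py. Layer sizes, optimizers, and the toy data are illustrative, not the author's exact code.)

```python
import numpy as np
import keras
from keras import layers

latent_dim = 100

# Discriminator: classifies 28x28 images as real (1) or fake (0).
discriminator = keras.Sequential([
    keras.Input(shape=(28, 28, 1)),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
discriminator.compile(optimizer="adam",
                      loss="binary_crossentropy",
                      metrics=["accuracy"])

# Generator: maps a latent vector to a 28x28 image.
generator = keras.Sequential([
    keras.Input(shape=(latent_dim,)),
    layers.Dense(28 * 28, activation="tanh"),
    layers.Reshape((28, 28, 1)),
])

# Combined model: generator + frozen discriminator, used to train the generator.
discriminator.trainable = False
z = keras.Input(shape=(latent_dim,))
validity = discriminator(generator(z))
combined = keras.Model(z, validity)
combined.compile(optimizer="adam", loss="binary_crossentropy")

# One training step: this is where the reported error appears in Keras 3.6.
batch_size = 32
real_imgs = np.random.rand(batch_size, 28, 28, 1).astype("float32")
noise = np.random.normal(0.0, 1.0, (batch_size, latent_dim)).astype("float32")
fake_imgs = generator.predict(noise, verbose=0)

# Per the report, this call fails in Keras 3.6 because
# discriminator._loss_tracker has been reset to None by combined.compile().
discriminator.train_on_batch(real_imgs, np.ones((batch_size, 1)))
discriminator.train_on_batch(fake_imgs, np.zeros((batch_size, 1)))
combined.train_on_batch(noise, np.ones((batch_size, 1)))
```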
The error says that self._loss_tracker is None (so self._loss_tracker.update_state fails), when it should be metrics_module.Mean(name="loss") because the model has been compiled.
Print statements that I added to the code show that after compiling the discriminator, and before compiling the combined model, these attributes have their expected values. However, after compiling the combined model:
discriminator.compiled -> True
discriminator.optimizer -> <keras.src.optimizers.adam.Adam object at 0x77ecf6bb0a00>
discriminator.train_function -> None
discriminator.train_step -> <bound method TensorFlowTrainer.train_step of >
discriminator.metrics -> []
discriminator._loss_tracker -> None
discriminator._jit_compile -> True
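The prints were along these lines (an illustrative reconstruction using the attribute names from the output above; it assumes discriminator and combined are built as in the earlier sketch, with the debug calls placed around combined.compile, and is not the author's exact script):

```python
def dump_trainer_state(model, name):
    # Prints the trainer-related attributes shown in the output above.
    print(f"{name}.compiled ->", model.compiled)
    print(f"{name}.optimizer ->", model.optimizer)
    print(f"{name}.train_function ->", model.train_function)
    print(f"{name}.train_step ->", model.train_step)
    print(f"{name}.metrics ->", model.metrics)
    print(f"{name}._loss_tracker ->", model._loss_tracker)
    print(f"{name}._jit_compile ->", model._jit_compile)

dump_trainer_state(discriminator, "discriminator")  # before combined.compile(): _loss_tracker is a Mean metric
combined.compile(optimizer="adam", loss="binary_crossentropy")
dump_trainer_state(discriminator, "discriminator")  # after combined.compile(): _loss_tracker -> None
```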
So the problem is that compiling the combined model erases the discriminator's metrics (the loss tracker, etc.), which it should not, and still reports the discriminator as compiled even though its compiled state has effectively been undone. I believe the bug lies in a change introduced in Keras 3.6 to compile() in keras/src/trainers/trainer.py: the function self._clear_previous_trainer_metrics not only clears the metrics of the combined model but also those of the discriminator, which leaves the discriminator without its proper metrics.
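To make the described behaviour concrete, here is a rough paraphrase of the effect being reported (this is not the actual Keras source, only an illustration):

```python
# NOT the real implementation - an illustration of the reported effect:
# when compile() runs on the outer (combined) model, sub-models that are
# themselves trainers get their trainer state reset, including _loss_tracker,
# so an already compiled discriminator can no longer run train_on_batch.
def clear_previous_trainer_metrics(outer_model):
    for layer in outer_model.layers:
        if not hasattr(layer, "_loss_tracker"):
            continue  # plain layers carry no trainer state
        layer._loss_tracker = None  # the discriminator loses its loss tracker here
        # ...other compile-time metric state is cleared the same way...
```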
My pull request addressing this possible bug is #20473.
I tried the code with the three backends and it happens every time.
I hope it helps! :)