Conversation


@bit-soham bit-soham commented Mar 5, 2024

Hello Sir, I noticed a small issue in the code while watching your videos, and I hope I've provided a good solution for it.
Problem: In a Google Colab notebook, if I have Value objects initialized and then, without reinitializing them, call the backward function twice from a different cell, the newly calculated gradients are accumulated on top of the previous ones.
So I am simply resetting the grads to 0 inside build_topo, so that any node not yet visited has its gradient set back to 0.0 (a sketch of the idea follows below).
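
For reference, here is a minimal sketch of that idea, assuming a micrograd-style `Value` class (the `build_topo` / `_backward` names follow the video's code; the actual diff in this PR may differ): reset each node's grad the first time `build_topo` visits it, so a repeated `backward()` call starts from clean gradients.

```python
# Minimal micrograd-style Value; only __mul__ is shown to keep the sketch short.
class Value:
    def __init__(self, data, _children=()):
        self.data = data
        self.grad = 0.0
        self._backward = lambda: None
        self._prev = set(_children)

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def _backward():
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad
        out._backward = _backward
        return out

    def backward(self):
        topo, visited = [], set()
        def build_topo(v):
            if v not in visited:
                visited.add(v)
                v.grad = 0.0          # reset stale gradients as the graph is (re)traversed
                for child in v._prev:
                    build_topo(child)
                topo.append(v)
        build_topo(self)
        self.grad = 1.0
        for v in reversed(topo):
            v._backward()

# Usage: calling backward() twice now yields the same gradients, not doubled ones.
a, b = Value(2.0), Value(3.0)
c = a * b
c.backward(); c.backward()
print(a.grad, b.grad)   # 3.0 2.0 on both calls
```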

@minh-nguyenhoang

That's a feature rather than a bug. Most modern deep learning frameworks do this on purpose: in a training loop you may want to accumulate gradients over several minibatches before actually updating the parameters, because you can't fit all the minibatches into a single step. If you want to support that, you should leave the leaf nodes' grads alone and only reset the grads of the non-leaf nodes to 0.
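
For example, the usual gradient-accumulation idiom in a framework like PyTorch relies on exactly this additive behaviour. A hedged sketch (the model, data, and `accum_steps` below are placeholder assumptions, not anything from this PR):

```python
import torch

# Placeholder model and synthetic minibatches, purely for illustration.
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)
loader = [(torch.randn(8, 10), torch.randn(8, 1)) for _ in range(8)]
accum_steps = 4  # accumulate gradients over this many minibatches

for step, (x, y) in enumerate(loader):
    loss = torch.nn.functional.mse_loss(model(x), y)
    (loss / accum_steps).backward()      # .grad fields are ADDED to, not overwritten
    if (step + 1) % accum_steps == 0:
        optimizer.step()                 # update with the accumulated gradient
        optimizer.zero_grad()            # only now reset the leaf (parameter) grads
```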


bit-soham commented Mar 10, 2024

Thank you for your explanation. I had clearly misunderstood.


