Backend PyTorch: Add L1 and L1+L2 regularizers #1905
We are unifying the regularization for TensorFlow and Paddle, see #1894. Do you think we can also bring the PyTorch regularization into the same unified code?
Unfortunately, this can be a bit difficult in PyTorch. The implementation options seem unnecessarily complicated to me.
    )

def train_step(inputs, targets, auxiliary_vars):
    def closure():
        losses = outputs_losses_train(inputs, targets, auxiliary_vars)[1]
        total_loss = torch.sum(losses)
        if l1_factor:
For L1+L2 regularization, this might not be the correct way: weight_decay in the optimizers is not implemented as an L2 loss term. For now, we should support either L1 or L2, not L1+L2.
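To make the distinction concrete, here is a minimal sketch rather than the code in this PR. It assumes plain `torch.optim.SGD`; the `l1_penalty` helper, the toy `Linear` model, and the factor values are hypothetical. The first part adds an explicit L1 term to the loss inside a closure. The second part shows that `weight_decay` changes the update through the gradient while the loss value never contains a penalty.

```python
import torch


def l1_penalty(params, l1_factor):
    # Hypothetical helper: l1_factor times the sum of |w| over all parameters.
    return l1_factor * sum(p.abs().sum() for p in params)


torch.manual_seed(0)
net = torch.nn.Linear(3, 1)  # toy model, for illustration only
x = torch.randn(8, 3)
y = torch.randn(8, 1)
l1_factor = 1e-2  # assumed value

# L1 only: no weight_decay on the optimizer, the penalty lives in the loss.
opt = torch.optim.SGD(net.parameters(), lr=0.1)


def closure():
    opt.zero_grad()
    data_loss = torch.nn.functional.mse_loss(net(x), y)
    total_loss = data_loss + l1_penalty(net.parameters(), l1_factor)
    total_loss.backward()
    return total_loss


loss_value = opt.step(closure)  # step() returns the closure's loss

# weight_decay: SGD adds weight_decay * w to the gradient before the update.
w = torch.nn.Parameter(torch.tensor([2.0]))
opt_wd = torch.optim.SGD([w], lr=0.1, weight_decay=0.5)
loss = (w * 0.0).sum()  # zero data loss, so the raw gradient is zero
loss.backward()
opt_wd.step()
# w moved to 2.0 - 0.1 * (0.0 + 0.5 * 2.0), yet loss stayed exactly 0.
```

Because the decay is applied inside the optimizer step, the reported loss does not include it, which is one reason mixing it with an explicit L1 loss term is not the same as a combined L1+L2 loss.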
Fixed.
I didn't notice earlier that you implemented the NysNewtonCG optimizer. Should I add L1 regularization in
Continuation of the PR #1884