[torch_xla2] Wire torch_xla2.compiled function with torch AutogradFunction #8587

Open
qihqi opened this issue Jan 17, 2025 · 0 comments
qihqi commented Jan 17, 2025

🚀 Feature

Currently, if we wrap a model with torch_xla2.compile and try to train it using a traditional torch training loop, similar to https://github.com/pytorch/xla/blob/master/experimental/torch_xla2/examples/basic_training.py, you will notice that it doesn't work.

The reason is that the compile wrapper JittableModule will eventually call a jax.jit-compiled callable, and torch doesn't know how to compute gradients of that callable.

The solution is to create a torch.autograd.Function subclass on the fly, with backward defined to call jax.vjp, similar to this tutorial: https://pytorch.org/tutorials/beginner/examples_autograd/two_layer_net_custom_function.html

The result would be that a model wrapped with torch_xla2.compile is still trainable.
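
A minimal sketch of the wiring, not the actual torch_xla2 implementation: it uses plain numpy round-trips between torch and jax for illustration (the real code would convert torch_xla2 tensors to jax arrays directly), and the helper name `make_autograd_fn` is hypothetical.

```python
import jax
import jax.numpy as jnp
import torch


def make_autograd_fn(jax_fn):
    """Wrap a jax-jittable callable so torch autograd can backprop through it."""

    class JaxFunction(torch.autograd.Function):
        @staticmethod
        def forward(ctx, *args):
            # Illustration only: convert torch tensors to jax arrays via numpy.
            jax_args = [jnp.asarray(a.detach().cpu().numpy()) for a in args]
            # jax.vjp returns the primal output together with a function that
            # maps output cotangents back to input cotangents (the backward pass).
            out, vjp_fn = jax.vjp(jax_fn, *jax_args)
            ctx.vjp_fn = vjp_fn
            return torch.from_numpy(jax.device_get(out).copy())

        @staticmethod
        def backward(ctx, grad_out):
            cotangent = jnp.asarray(grad_out.detach().cpu().numpy())
            grads = ctx.vjp_fn(cotangent)
            return tuple(torch.from_numpy(jax.device_get(g).copy()) for g in grads)

    return JaxFunction.apply


# Usage: gradients flow through the jitted jax function via the custom Function.
linear = jax.jit(lambda w, x: jnp.tanh(x @ w))
fn = make_autograd_fn(linear)

w = torch.randn(4, 3, requires_grad=True)
x = torch.randn(2, 4, requires_grad=True)
fn(w, x).sum().backward()
print(w.grad.shape, x.grad.shape)  # torch.Size([4, 3]) torch.Size([2, 4])
```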

Motivation

Having both the forward and backward passes compiled with jax.jit makes them faster to run.
