See #1520, which discusses and adds support for some details of PyTorch's logsigmoid grad implementation.
I think we can avoid creating the extra CUDA buffer tensor (even though it's empty) by implementing a Thunder grad formula that doesn't use it, and then adding a special grad formula, one that does include the buffer tensor, for logsigmoid when it's executed by the PyTorch executor.
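For context, the identity a buffer-free formula would rely on is d/dx log σ(x) = σ(−x) = 1 − σ(x). Here is a minimal sketch in plain PyTorch of what such a grad formula computes; the helper name `logsigmoid_grad` and the standalone form are illustrative only, and Thunder's actual grad-registration mechanism is not shown:

```python
import torch
import torch.nn.functional as F

def logsigmoid_grad(x: torch.Tensor, grad_out: torch.Tensor) -> torch.Tensor:
    # d/dx log(sigmoid(x)) = sigmoid(-x) = 1 - sigmoid(x),
    # computed without the auxiliary buffer tensor that PyTorch's
    # native logsigmoid backward takes (the buffer is empty on CUDA).
    return grad_out * torch.sigmoid(-x)

# Sanity check against autograd.
x = torch.randn(8, requires_grad=True)
F.logsigmoid(x).backward(torch.ones_like(x))
manual = logsigmoid_grad(x.detach(), torch.ones_like(x))
print(torch.allclose(x.grad, manual))  # True
```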
cc @apaz-cli