Hello,
It seems that neither the Triton nor the PyTorch implementation used for the efficiency experiments includes an attention bias (i.e., an additive term on the attention logits; see the sketch below) in its attention operation.
Am I understanding this correctly? If so, I'm curious why attention bias is left out of these efficiency experiments.
Is it because the bias is not considered important for these measurements, or is there another reason?
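For reference, a minimal sketch of what I mean by an additive attention bias, written in plain PyTorch. The shapes and the `bias` tensor here are hypothetical illustrations, not taken from the repository's code:

```python
import torch
import torch.nn.functional as F

# Hypothetical shapes: batch 2, 4 heads, sequence length 128, head dim 64.
B, H, L, D = 2, 4, 128, 64
q = torch.randn(B, H, L, D)
k = torch.randn(B, H, L, D)
v = torch.randn(B, H, L, D)

# Hypothetical additive bias on the attention logits,
# e.g. a relative-position or pairwise term.
bias = torch.randn(H, L, L)

# Without bias: softmax(q k^T / sqrt(D)) v
scores = q @ k.transpose(-2, -1) / D**0.5
out_no_bias = scores.softmax(dim=-1) @ v

# With bias: softmax(q k^T / sqrt(D) + bias) v
out_with_bias = (scores + bias).softmax(dim=-1) @ v

# PyTorch's fused SDPA also supports this: a float attn_mask
# is added to the logits before the softmax.
out_sdpa = F.scaled_dot_product_attention(q, k, v, attn_mask=bias)
```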
Thanks.