I'm observing performance regressions for BERT and BART model inference with JAX mainline compared to jax-v0.4.34, on both x86 and arm64 CPU platforms. The performance drop is around 50%. I have root-caused it to the following PR (and commit), which raised the minimum buffer alignment from 16 to 64 to match Eigen.
I have tested two AWS EC2 instances:
c7i.xlarge (x86 architecture) and
c8g.xlarge (arm64 architecture)
I'm able to restore the performance by switching the alignment back to 16.
Could you please let me know which scenarios this change was meant to help?
commit 7db61fe79207e457c89886048afa21484d827590
Author: Adam Banaś <[email protected]>
Date: Thu Oct 10 04:06:07 2024 -0700
[XLA:CPU] Change the minimum alignment of buffers to match Eigen
PiperOrigin-RevId: 684386915
Hi @snadampal, can you provide the script you're using to reproduce?
One reason for such a significant performance drop is that additional buffer copies are performed (instead of views). This can happen if XLA receives unaligned buffers.
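To make the alignment point concrete, here is a minimal, standard-library-only sketch of what "aligned to N bytes" means for a buffer's start address. The helper names (`is_aligned`, `address_of`, `aligned_view`) are hypothetical and only for illustration; this is not how XLA performs its check. It only shows that an address can satisfy a 16-byte requirement while failing a 64-byte one, which is the situation where an extra copy might be needed.

```python
import ctypes


def is_aligned(address: int, alignment: int) -> bool:
    """Return True if `address` is a multiple of `alignment` (in bytes)."""
    return address % alignment == 0


def address_of(view: memoryview) -> int:
    """Return the start address of a writable, contiguous buffer."""
    return ctypes.addressof(ctypes.c_char.from_buffer(view))


def aligned_view(size: int, alignment: int) -> memoryview:
    """Over-allocate, then slice so the view starts on an `alignment` boundary.

    Hypothetical helper: it only illustrates how a caller could hand over
    memory that already meets a stricter alignment requirement.
    """
    raw = bytearray(size + alignment)
    base = address_of(memoryview(raw))
    offset = (-base) % alignment  # bytes to skip to reach the next boundary
    return memoryview(raw)[offset:offset + size]


if __name__ == "__main__":
    # An address such as 80 is 16-byte aligned but not 64-byte aligned.
    print(is_aligned(80, 16), is_aligned(80, 64))

    view = aligned_view(256, 64)
    addr = address_of(view)
    print(is_aligned(addr, 64), is_aligned(addr, 16))
```

For a NumPy array, the equivalent start address is available as `arr.ctypes.data`, so the same modulo check can be applied to the host arrays passed into the model to see whether they meet the 64-byte boundary.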
To answer your question, there were two reasons for integrating this change:
PR: #16505