README.md (+4 -4)
@@ -46,7 +46,7 @@ As shown in the figure below, there are three usage methods based on the flash_attn
 2. For A100, L40, and other hardware that supports FA v2, ring_flash_attn uses FA v2.
-3. For hardware such as NPUs that does not support FA, use torch to implement attention computation. In this case, there is no need to install `flash_attn`, and you should apply `LongContextAttention(ring_impl_type="basic", attn_type=FlashAttentionImpl.TORCH)`.
+3. For hardware such as NPUs that does not support FA, use torch to implement attention computation. In this case, there is no need to install `flash_attn`, and you should apply `LongContextAttention(ring_impl_type="basic", attn_type=AttnType.TORCH)`.
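For context, the change above swaps the enum passed to `attn_type` from `FlashAttentionImpl.TORCH` to `AttnType.TORCH` for the pure-torch attention backend. Below is a minimal usage sketch of the corrected call; the import paths (`yunchang`, `yunchang.kernels`), the `set_seq_parallel_pg` helper, its signature, and the forward-call signature are assumptions inferred from the identifiers in the diff, not confirmed by it.

```python
# Sketch only: import paths and the set_seq_parallel_pg helper below are
# assumptions inferred from the identifiers in the diff; verify against
# the repo before relying on them.
import torch
import torch.distributed as dist
from yunchang import LongContextAttention, set_seq_parallel_pg
from yunchang.kernels import AttnType

# LongContextAttention is sequence-parallel, so a process group must exist;
# degrees of 1 keep this sketch runnable on a single rank.
dist.init_process_group(backend="gloo",
                        init_method="tcp://127.0.0.1:29500",
                        rank=0, world_size=1)
set_seq_parallel_pg(sp_ulysses_degree=1, sp_ring_degree=1,
                    rank=dist.get_rank(), world_size=dist.get_world_size())

# AttnType.TORCH computes attention with plain torch ops, so flash_attn
# need not be installed and FA-unsupported hardware (e.g. NPUs) can run it.
attn = LongContextAttention(ring_impl_type="basic", attn_type=AttnType.TORCH)

# Assumed tensor layout: (batch, local_seq_len, num_heads, head_dim).
q, k, v = (torch.randn(2, 1024, 8, 64) for _ in range(3))
out = attn(q, k, v, causal=True)  # output matches the shape of q
```

On hardware that does support FA v2 (item 2 in the hunk), the same constructor would presumably take an FA-backed `attn_type` instead; the diff does not show that enum value, so it is not reproduced here.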