@Ginray Ginray commented Nov 27, 2025

Summary

This PR is the first step in adapting Liger Kernel to Ascend NPU: it adds NPU device support. For details, see Section 2.1, "Device Support Integration", of [RFC] Native Ascend NPU Support for Liger Kernel.

Details

Key Modifications:

  1. Add the installation method for NPU and adapt the basic utility functions to it.
  2. On NPU, import math functions directly from triton.language.math to avoid errors caused by interfaces that do not exist on that backend.
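
The import strategy in item 2 can be sketched as a small fallback helper. This is only an illustration, not the code in this PR: `import_first_available` and `load_tanh` are hypothetical names, and the candidate module paths (in particular `triton.language.extra.libdevice`) are assumptions about what a given Triton build exposes.

```python
import importlib
from typing import Any, Sequence


def import_first_available(name: str, module_paths: Sequence[str]) -> Any:
    """Return attribute `name` from the first module in `module_paths` that provides it."""
    for path in module_paths:
        try:
            module = importlib.import_module(path)
        except ImportError:
            # Module (or one of its parents) is missing on this backend; try the next one.
            continue
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"{name!r} not found in any of: {list(module_paths)}")


def load_tanh(on_npu: bool) -> Any:
    """Hypothetical helper: prefer triton.language.math on NPU (paths are assumptions)."""
    if on_npu:
        paths = ["triton.language.math"]
    else:
        paths = ["triton.language.extra.libdevice", "triton.language.math"]
    return import_first_available("tanh", paths)
```

Resolving the import once, at module load, keeps the kernels themselves free of device checks.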

Testing Done

Verification Status:
We verified the changes on an Atlas 800T A3; basic test cases such as test_softmax and test_swiglu pass. We will continue to improve NPU support in follow-up work.

  • Hardware Type:
  • run make test to ensure correctness
  • run make checkstyle to ensure code style
  • run make test-convergence to ensure convergence

Comment on lines +81 to +85
try:
    if hasattr(torch, "npu") and getattr(torch.npu, "amp", None) is not None:
        return torch.npu.amp.custom_fwd, torch.npu.amp.custom_bwd
except Exception:
    pass
Collaborator

What exception could it possibly be?
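
One way to address this would be to drop the broad `except Exception` and resolve the attributes explicitly. The following is a minimal sketch, not the PR's code: `resolve_npu_amp_decorators` is a hypothetical name, and the module is passed in as a parameter so that the lookup can be exercised without an NPU installed.

```python
from types import SimpleNamespace
from typing import Any, Callable, Optional, Tuple


def resolve_npu_amp_decorators(torch_module: Any) -> Optional[Tuple[Callable, Callable]]:
    """Return (custom_fwd, custom_bwd) from torch_module.npu.amp, or None if unavailable."""
    # getattr with a default only absorbs AttributeError, so any other
    # failure (for example during lazy backend init) surfaces to the caller.
    npu = getattr(torch_module, "npu", None)
    amp = getattr(npu, "amp", None)
    fwd = getattr(amp, "custom_fwd", None)
    bwd = getattr(amp, "custom_bwd", None)
    if callable(fwd) and callable(bwd):
        return fwd, bwd
    return None


# Stand-in objects for illustration only; real code would pass the torch module.
fake_torch_with_npu = SimpleNamespace(
    npu=SimpleNamespace(amp=SimpleNamespace(custom_fwd=lambda f: f, custom_bwd=lambda f: f))
)
fake_torch_without_npu = SimpleNamespace()
```

With this shape, a missing `npu`, `amp`, or decorator yields `None`, and the caller can fall back to another backend instead of silently swallowing an unrelated error.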

Comment on lines +41 to +48
def get_npu_multi_processor_count() -> int:
    """Return a heuristic multi-processor count for NPU."""
    NPU_MULTI_PROCESSOR_COUNT = 48
    if is_npu_available():
        return NPU_MULTI_PROCESSOR_COUNT
    # Reasonable default to avoid division by zero
    return 1

Collaborator

Is torch.npu going to support get_device_properties so that we can get these numbers programmatically? If so, I suggest using that method and keeping the magic number only as a fallback. WDYT?
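
A rough sketch of that suggestion, under stated assumptions: `resolve_multi_processor_count` is a hypothetical helper, and the attribute name `multi_processor_count` is borrowed from the CUDA device properties object; whether the NPU properties object exposes it is exactly the open question here. The value 48 is the heuristic from the diff above.

```python
from types import SimpleNamespace
from typing import Any, Optional

# Heuristic taken from the diff under review; used only when no count is reported.
NPU_MULTI_PROCESSOR_COUNT_FALLBACK = 48


def resolve_multi_processor_count(
    props: Optional[Any], fallback: int = NPU_MULTI_PROCESSOR_COUNT_FALLBACK
) -> int:
    """Prefer a count reported by a device-properties object, else use the fallback."""
    # "multi_processor_count" is an assumed attribute name (it is the CUDA one).
    count = getattr(props, "multi_processor_count", None)
    if isinstance(count, int) and not isinstance(count, bool) and count > 0:
        return count
    return fallback


# Stand-in properties objects for illustration; real code would pass whatever
# the backend's get_device_properties call returns, if it exists.
props_with_count = SimpleNamespace(name="npu", multi_processor_count=24)
props_without_count = SimpleNamespace(name="npu")
```

Keeping the lookup separate from the device query makes the fallback path testable on machines without an NPU.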
