Add FlagOS backend support for Taichi #8785

Open
GWinfinity wants to merge 2 commits into taichi-dev:master from GWinfinity:master

GWinfinity commented Feb 2, 2026

This commit adds support for FlagOS (a unified AI chip software stack) to Taichi, enabling Taichi programs to run on AI chips including MLU (Cambricon), Ascend (Huawei), DCU (Hygon), and GCU (Enflame).

Major changes:

  • Add flagos architecture definition and RHI device layer
  • Implement FlagOS code generator for LLVM IR generation
  • Add FlagOS program implementation with kernel compiler and launcher
  • Update build system with TI_WITH_FLAGOS option
  • Add example programs (fractal, matmul) and documentation

New files:

  • taichi/rhi/flagos/*: RHI device implementation
  • taichi/codegen/flagos/*: Code generation layer
  • taichi/runtime/program_impls/flagos/*: Program implementation
  • examples/flagos/*: Example programs
  • docs/flagos_integration_design.md: Design documentation

Usage:
ti.init(arch=ti.flagos, flagos_chip='mlu370')

See FLAGOS_CHANGES.md for detailed change log.
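
The usage line above only works on a Taichi build that includes this backend. A minimal sketch of a guard that a script might use is shown below. `pick_arch` is a hypothetical helper (not part of Taichi or this PR), and it assumes the backend is exposed as an attribute named `flagos` on the `taichi` module, as the usage line suggests. The module is passed in as a parameter so the helper can be exercised without Taichi installed.

```python
from types import SimpleNamespace


def pick_arch(ti_module, preferred="flagos", fallback="cpu"):
    """Return ti_module.<preferred> if present, else ti_module.<fallback>.

    Hypothetical helper: it only inspects attributes, so it works with any
    object that looks like the `taichi` module.
    """
    arch = getattr(ti_module, preferred, None)
    if arch is None:
        arch = getattr(ti_module, fallback)
    return arch


# Stand-ins for a build with and without the FlagOS backend (assumed names).
with_flagos = SimpleNamespace(flagos="flagos-arch", cpu="cpu-arch")
without_flagos = SimpleNamespace(cpu="cpu-arch")

print(pick_arch(with_flagos))     # flagos-arch
print(pick_arch(without_flagos))  # cpu-arch
```

With a real build this would presumably be called as `pick_arch(ti)` after `import taichi as ti`, and the `flagos_chip` option would be passed to `ti.init` only when the FlagOS arch was actually selected.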

Issue: #


Note

High Risk
This PR introduces a new architecture/backend path through core program initialization, codegen, and the device memory and kernel launch plumbing. Many operations are still stubbed (TI_NOT_IMPLEMENTED) and could fail at runtime if exercised.

Overview
Adds a new Arch::flagos backend end-to-end: arch registration and LLVM-usage wiring, a flagos_chip compile option exposed to Python, and Program selection of FlagosProgramImpl when built with TI_WITH_FLAGOS.

Introduces new FlagOS-specific modules for RHI/device, LLVM task/kernel codegen, and kernel compilation/launch scaffolding (with placeholders for FlagOS SDK/FlagTree integration), and wires them into the build via TI_WITH_FLAGOS CMake options and new subdirectories.

Adds FlagOS documentation and example scripts (examples/flagos) demonstrating ti.init(arch=ti.flagos, flagos_chip=...) and basic benchmarks.
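
The build-time gating described in the overview can be modeled in a few lines. The following is a Python sketch of the selection logic only, not the PR's C++ code: `select_program_impl`, the `built_with_flagos` flag (standing in for the `TI_WITH_FLAGOS` build option), and the `"OtherProgramImpl"` placeholder are assumptions made for illustration.

```python
def select_program_impl(arch, built_with_flagos):
    """Model of picking a program implementation name for an arch."""
    if arch == "flagos":
        if not built_with_flagos:
            # Models a build that was compiled without TI_WITH_FLAGOS.
            raise RuntimeError(
                "arch 'flagos' requested, but this build lacks TI_WITH_FLAGOS"
            )
        return "FlagosProgramImpl"
    # Placeholder for every other backend; not a real Taichi class name.
    return "OtherProgramImpl"


print(select_program_impl("flagos", built_with_flagos=True))  # FlagosProgramImpl
```

The point of the explicit error branch is that requesting the new arch on a build without the option should fail at initialization with a clear message, rather than later inside a stubbed operation.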

Written by Cursor Bugbot for commit 6cbc8b2.

@CLAassistant
CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

auto thread_idx =
    builder->CreateIntrinsic(Intrinsic::nvvm_read_ptx_sreg_tid_x, {}, {});
auto block_dim =
    builder->CreateIntrinsic(Intrinsic::nvvm_read_ptx_sreg_ntid_x, {}, {});
NVIDIA PTX intrinsics used for non-NVIDIA chips

High Severity

The get_spmd_info() function uses NVIDIA-specific intrinsics (nvvm_read_ptx_sreg_tid_x and nvvm_read_ptx_sreg_ntid_x) to get thread and block information. FlagOS targets non-NVIDIA AI chips (MLU, Ascend, DCU, GCU), where these NVIDIA PTX intrinsics are invalid and will produce incorrect code or compilation failures.
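
One way to address this is to choose the thread-index and block-size sources per target family and to fail loudly for families that have no mapping yet, instead of defaulting to NVVM. The sketch below models that dispatch in Python rather than in the PR's C++. The table name, the `SpmdIntrinsics` type, and the family keys are hypothetical. Only the two NVVM intrinsic names come from the snippet under review (written in their LLVM IR spelling); the entries for the FlagOS chips are deliberately left empty, because the right intrinsics depend on each vendor's LLVM target, which this PR does not show.

```python
from typing import Dict, NamedTuple, Optional


class SpmdIntrinsics(NamedTuple):
    """LLVM intrinsic names used to read SPMD launch information."""
    thread_idx: str
    block_dim: str


# Hypothetical lookup table keyed by target family.
SPMD_INTRINSICS: Dict[str, Optional[SpmdIntrinsics]] = {
    "nvptx": SpmdIntrinsics(
        thread_idx="llvm.nvvm.read.ptx.sreg.tid.x",
        block_dim="llvm.nvvm.read.ptx.sreg.ntid.x",
    ),
    # Unknown here: to be filled in from each vendor's toolchain.
    "mlu": None,
    "ascend": None,
    "dcu": None,
    "gcu": None,
}


def spmd_intrinsics_for(family: str) -> SpmdIntrinsics:
    """Return the intrinsics for a family, never falling back to NVVM."""
    if family not in SPMD_INTRINSICS:
        raise ValueError(f"unknown target family: {family!r}")
    entry = SPMD_INTRINSICS[family]
    if entry is None:
        raise NotImplementedError(
            f"no SPMD intrinsics registered for target family {family!r}"
        )
    return entry
```

Raising `NotImplementedError` for an unmapped chip turns a silent miscompile into an explicit failure at codegen time, which is consistent with how the rest of the backend marks unfinished paths.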
