Fix issue with missing upcast/downcast for bf16 libdevice calls. by zoranjovanovic-ns · Pull Request #661 · ROCm/xla

zoranjovanovic-ns · 2026-03-09T20:27:31Z

Motivation

Adds the missing upcast/downcast for the bf16 type in libdevice calls.

Technical Details

The upcast/downcast is necessary because libdevice has no native bf16 implementation.
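To illustrate the upcast/call/downcast pattern described above, here is a minimal host-side sketch in plain C++. It is not the code from the MLIR pass in this PR: the names `Bf16Bits`, `UpcastBf16ToF32`, `DowncastF32ToBf16`, and `CallViaF32` are hypothetical, and `std::sqrt` merely stands in for an f32-only libdevice function. It assumes the usual bf16 layout (the upper 16 bits of an IEEE-754 binary32 value) and round-to-nearest-even on the way down.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Assumption: bf16 is stored as the upper 16 bits of an IEEE-754 binary32.
using Bf16Bits = std::uint16_t;

// Upcast: widen bf16 to f32 by placing the 16 bits in the high half.
inline float UpcastBf16ToF32(Bf16Bits v) {
  std::uint32_t bits = static_cast<std::uint32_t>(v) << 16;
  float out;
  std::memcpy(&out, &bits, sizeof(out));
  return out;
}

// Downcast: narrow f32 to bf16 using round-to-nearest-even.
inline Bf16Bits DowncastF32ToBf16(float v) {
  std::uint32_t bits;
  std::memcpy(&bits, &v, sizeof(bits));
  if (std::isnan(v)) {
    // Keep NaNs as NaNs by forcing a mantissa bit that survives truncation.
    return static_cast<Bf16Bits>((bits >> 16) | 0x0040u);
  }
  std::uint32_t rounding_bias = 0x7FFFu + ((bits >> 16) & 1u);
  return static_cast<Bf16Bits>((bits + rounding_bias) >> 16);
}

// Hypothetical wrapper: run an f32-only math function on a bf16 value.
template <typename F32Fn>
Bf16Bits CallViaF32(F32Fn fn, Bf16Bits x) {
  return DowncastF32ToBf16(fn(UpcastBf16ToF32(x)));
}

// Usage check: sqrt(4.0) through the f32 path yields bf16 2.0.
// 0x4080 is bf16 4.0 and 0x4000 is bf16 2.0.
inline void CheckSqrtViaF32() {
  assert(CallViaF32([](float x) { return std::sqrt(x); }, 0x4080) == 0x4000);
}
```

Without the two casts, a bf16 operand would be handed to a function that only exists for wider types, which is the gap this PR closes for the libdevice lowering.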

Test Plan

triton_xla_math_to_libdevice.mlir

Test Result

Test passes.

Submission Checklist

Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.

claude · 2026-03-09T20:32:03Z

xla/backends/gpu/codegen/triton/transforms/triton_xla_math_to_libdevice.cc

    if (res.getType() != output_type ||
-        (output_type.isF16() &&
-         !HasF16Implementation(OpInfo<OpTy>::kFunctionID, triple_))) {
+        (output_type.isBF16() ||
+         (output_type.isF16() &&
+         !HasF16Implementation(OpInfo<OpTy>::kFunctionID, triple_)))) {


nit: fragile downcast condition — The output_type.isBF16() disjunction here can be true even when res.getType() == output_type. In practice this doesn't fire because bf16 always enters the upcast path above, so res will have type f32 and res.getType() != output_type is already sufficient. But the condition is fragile: if someone later changes the upcast logic without updating this downcast logic, the isBF16() arm could trigger a spurious bf16→bf16 cast.

Consider simplifying the downcast condition to just res.getType() != output_type, which is correct in all cases and doesn't duplicate the logic of the upcast block.

claude · 2026-03-09T20:32:52Z

Claude Code Review for PR 661: This PR correctly adds bf16 upcast/downcast handling alongside the existing f16 path for libdevice calls. The core logic is sound. One inline comment posted on the downcast condition (lines 216-219) suggesting simplification. Additionally, since no test file changes are included and this targets ROCm, consider adding AMDGCN bf16 test cases to triton_xla_math_to_libdevice.mlir for stronger coverage.

i-chaochen

Thanks! Don't we run this unit test, //xla/backends/gpu/codegen/triton/transforms/tests:triton_xla_math_to_libdevice.mlir.test, on our CI?

Fix issue with missing upcast/downcast for bf16 libdevice calls.

5556466

zoranjovanovic-ns requested review from i-chaochen and nurmukhametov March 9, 2026 20:27

zoranjovanovic-ns added labels Mar 9, 2026: cherry-pick-candidate (Mark a PR to be cherry-picked into the next ROCm JAX. Remove iff the latest upstream contains the PR.), Upstream, rocm-jaxlib-v0.8.2

claude bot reviewed Mar 9, 2026

i-chaochen approved these changes Mar 9, 2026

zoranjovanovic-ns wants to merge 1 commit into rocm-jaxlib-v0.8.2 from rocm-jaxlib-v0.8.2-math_to_libdevice
