@whitneywhtsang
Contributor

OpenCL supports the build option -cl-fp32-correctly-rounded-divide-sqrt, which changes the precision requirement for OpFDiv so that the result must be correctly rounded. The disadvantage is that the option is applied at the kernel level, so every divide and sqrt in the kernel becomes precise, i.e., we cannot have both precise and approximate division in one kernel.
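
To illustrate why the distinction matters, here is a minimal host-side sketch in plain Python, not code from this PR. It assumes that "approximate" division behaves like a reciprocal followed by a multiply, each rounded to fp32; an actual device may use a different approximation. The helper names `to_f32`, `precise_div`, and `approx_div` are hypothetical.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def precise_div(a: float, b: float) -> float:
    """Correctly rounded fp32 division: one rounding of the quotient.

    The binary64 quotient of two fp32 inputs, rounded to fp32, matches the
    correctly rounded fp32 result.
    """
    return to_f32(to_f32(a) / to_f32(b))


def approx_div(a: float, b: float) -> float:
    """Assumed stand-in for approximate division: a * (1 / b).

    The reciprocal and the product are each rounded to fp32, so the result
    can differ from the correctly rounded quotient.
    """
    recip = to_f32(1.0 / to_f32(b))
    return to_f32(to_f32(a) * recip)


if __name__ == "__main__":
    for a, b in [(3.0, 10.0), (9.0, 10.0)]:
        p, q = precise_div(a, b), approx_div(a, b)
        print(f"{a}/{b}: precise={p!r} approx={q!r} equal={p == q}")
```

For 9.0 / 10.0 the two results differ by one fp32 ulp, while for 3.0 / 10.0 they agree, which is why tests that compare against a correctly rounded reference need the precise path.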

@whitneywhtsang whitneywhtsang force-pushed the whitneywhtsang/cl-fp32-correctly-rounded-divide-sqrt branch from bc74db7 to 7f660f5 Compare November 1, 2025 04:14
@whitneywhtsang whitneywhtsang marked this pull request as ready for review November 1, 2025 14:00
@etiotto
Contributor

etiotto commented Nov 3, 2025

As a temporary workaround this is OK. Eventually we will want to set the flag only on a specific operation, to avoid changing precision for the entire function. @whitneywhtsang please add a TODO and reference a new issue number to track the final solution.

Contributor

@etiotto etiotto left a comment

Please add unit tests

@whitneywhtsang
Contributor Author

> As a temporary workaround this is OK. Eventually we will want to set the flag only on a specific operation, to avoid changing precision for the entire function. @whitneywhtsang please add a TODO and reference a new issue number to track the final solution.

Created #5419.

@whitneywhtsang
Contributor Author

> Please add unit tests

There is an existing unit test for precise divide and sqrt: https://github.com/intel/intel-xpu-backend-for-triton/blob/main/python/test/unit/language/test_core.py#L1015

Signed-off-by: Whitney Tsang <[email protected]>
@whitneywhtsang whitneywhtsang enabled auto-merge (squash) November 3, 2025 16:15
@whitneywhtsang whitneywhtsang merged commit f762b2b into main Nov 3, 2025
23 checks passed
@whitneywhtsang whitneywhtsang deleted the whitneywhtsang/cl-fp32-correctly-rounded-divide-sqrt branch November 3, 2025 18:39
4 participants