          Rely on -cl-fp32-correctly-rounded-divide-sqrt for precise divide and sqrt
          #5415
        
…nd sqrt Signed-off-by: Whitney Tsang <[email protected]>
bc74db7 to 7f660f5
  
As a temporary workaround this is OK. We will want to set the flag only on a specific operation to avoid changing precision for the entire function. @whitneywhtsang please add a TODO and reference a new issue number to track the final solution.
    
Please add unit tests
          
Created #5419.
    
          
There is an existing unit test for precise divide and sqrt: https://github.com/intel/intel-xpu-backend-for-triton/blob/main/python/test/unit/language/test_core.py#L1015
    
Signed-off-by: Whitney Tsang <[email protected]>
OpenCL supports the build option `-cl-fp32-correctly-rounded-divide-sqrt`, which changes the requirement for `OpFDiv` to be correctly rounded. The disadvantage is that the option is applied at kernel level, so every divide and sqrt in the kernel becomes precise, i.e., we cannot have both precise and approximate div in one kernel.
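To make the kernel-level limitation concrete, here is a minimal sketch of how such a flag ends up being all-or-nothing per kernel. It is not the code in this PR: `build_options`, `PRECISE_OPS`, and the op names are hypothetical and only stand in for "this kernel contains a precise divide or sqrt". The only name taken from the description is the build option itself.

```python
# Build option named in this PR; everything else below is a hypothetical illustration.
PRECISE_FLAG = "-cl-fp32-correctly-rounded-divide-sqrt"

# Hypothetical op names standing in for "precise" divide and sqrt requests.
PRECISE_OPS = frozenset({"div_rn", "sqrt_rn"})


def build_options(kernel_ops, base_options=()):
    """Return the build-option string for a single kernel.

    The flag cannot be attached to one operation. If any op in the kernel
    asks for correct rounding, the whole kernel is built with the flag, so
    its approximate divides become precise as well.
    """
    options = list(base_options)
    needs_precise = any(op in PRECISE_OPS for op in kernel_ops)
    if needs_precise and PRECISE_FLAG not in options:
        options.append(PRECISE_FLAG)
    return " ".join(options)


if __name__ == "__main__":
    # Kernel with only an approximate divide: no flag is added.
    print(repr(build_options(["load", "div", "store"])))
    # Kernel mixing approximate and precise divide: the flag covers both.
    print(repr(build_options(["load", "div", "div_rn", "store"])))
```

In the second call the approximate `div` is compiled with the same option as `div_rn`, which is the trade-off the review comment asks to track in a follow-up issue.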