v3.1.0 volk_32fc_s32f_atan2_32f.h avx2 and avx2_fma kernels return NaN for an input element 0+0j #730

Closed
@jj1bdx

Synopsis

VOLK v3.1.0 volk_32fc_s32f_atan2_32f() returns -nan for the following kernels when an input element is 0+0j:

  • avx2 (a_avx2, u_avx2)
  • avx2_fma (a_avx2_fma, u_avx2_fma)

VOLK v3.1.0 volk_32fc_s32f_atan2_32f() returns zero for the following kernels when an input element is 0+0j:

  • generic
  • polynomial

(where j^2 = -1)

Tested on Ubuntu 22.04.3 x86_64.

Test details are available at: https://gist.github.com/jj1bdx/62e27aac4b54a29dfafe210c73c49b0e

What should be done

The avx2 and avx2_fma kernels should behave the same as the generic and polynomial kernels, i.e., return zero when the input is 0+0j. At present I have to screen out this case in my own application, as in this example, which costs performance and adds unnecessary complexity.
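For readers hitting the same problem, the kind of screening described above could look roughly like the sketch below. It is only an illustration, not the code from my application: `zero_nan_phases` is a hypothetical helper name, not part of VOLK, and it assumes the output buffer has already been filled by `volk_32fc_s32f_atan2_32f()`.

```c
#include <math.h>
#include <stddef.h>

/* Hypothetical helper (not part of VOLK): walk the phase output buffer
 * and replace every NaN with 0.0f, which is the value the generic and
 * polynomial kernels return for a 0+0j input.
 * Returns the number of elements that were replaced. */
static size_t zero_nan_phases(float* phase, size_t num_points)
{
    size_t replaced = 0;
    for (size_t i = 0; i < num_points; i++) {
        if (isnan(phase[i])) {
            phase[i] = 0.0f;
            replaced++;
        }
    }
    return replaced;
}
```

This costs an extra pass over the output for every call, which is the overhead I would like to avoid. Note also that it cannot distinguish a NaN caused by 0+0j from a NaN caused by a NaN input.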

I would appreciate it if the original author of the kernel would kindly help fix this bug.

References

  • Issue new kernels for atan2 #636
  • Speeding up atan2f by 50x
    • In the Edge Cases section of this article, the author explicitly states: "All our functions do not handle inputs containing infinity, or when both x and y are 0, while conforming implementations must handle them specially". So this appears to be a known limitation of the approach.
