Better hmin/hmax algorithms for SSE/AVX2 #510

Merged (2 commits) on Nov 4, 2024

Commits on Nov 1, 2024

  1. Better hmin/hmax algorithms for SSE/AVX2

    Use a formulation that automatically produces the same result
    in all lanes, avoiding a separate broadcast step.
    
    The same approach would work with floats in principle, but it's
    not guaranteed to give the same result in all lanes when NaNs
    are involved (due to the way MINPS/MAXPS are defined), so leave
    the float versions alone for now.
    
    About a 1% reduction in encode time when encoding an 8192x8192
    test texture at 6x6 -thorough on a Ryzen 7950X3D.
    Fabian Giesen committed Nov 1, 2024
    b79248a

Commits on Nov 4, 2024

  1. 55138bf