popcount with clang #570

Open
michael-roe opened this issue May 15, 2022 · 2 comments
Labels: Enhancement (new kernel entirely or for some specific ARCH)

Comments

@michael-roe
Contributor

Modern compilers are clever; they can recognize some of the algorithms for popcount and replace them with a popcnt (Intel) or cpop (RISC-V) instruction.

int count(long x)
{
    int v = 0;
    while (x != 0)
    {
        x &= x - 1; /* clear the lowest set bit */
        v++;        /* one more bit counted */
    }
    return v;
}

Clang turns this into popcntq %rdi, %rax (with -O3 -march=x86-64-v2).

This would suggest that in order to get good performance, VOLK's popcount implementation ought to be one of the ones that is recognized by popular compilers.
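One way to gain confidence before swapping implementations is to check that the loop form agrees with the compiler builtin. The following is a minimal sketch, not VOLK code: the names count_loop, count_builtin and count_agree are hypothetical, the operand is made unsigned so that x - 1 is well defined for every input, and __builtin_popcountl is assumed to be available (it is a GCC/Clang builtin). The sketch only checks results; it does not verify which instruction the compiler actually emits.

```c
#include <limits.h>

/* Loop form from the post, on an unsigned operand (an assumption made
 * here so that x - 1 never overflows). Hypothetical name. */
static int count_loop(unsigned long x)
{
    int v = 0;
    while (x != 0) {
        x &= x - 1; /* clear the lowest set bit */
        v++;
    }
    return v;
}

/* Reference result from the GCC/Clang builtin. Hypothetical name. */
static int count_builtin(unsigned long x)
{
    return __builtin_popcountl(x);
}

/* Returns 1 if both versions agree on a few sample values, else 0. */
static int count_agree(void)
{
    const unsigned long samples[] = { 0UL, 1UL, 0xFFUL, 0x8000UL, ULONG_MAX };
    unsigned i;

    for (i = 0; i < sizeof samples / sizeof samples[0]; i++) {
        if (count_loop(samples[i]) != count_builtin(samples[i]))
            return 0;
    }
    return 1;
}
```

To see what the compiler does with count_loop, the generated assembly can be inspected with the flags quoted above, for example via clang -O3 -march=x86-64-v2 -S.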

@jdemel
Contributor

jdemel commented May 16, 2022

Thanks for the hint. Do you have a reference to read? And would you be willing to create a PR with this change?

@michael-roe
Contributor Author

I'm thinking about creating a pull request for this, but I'm trying to get cpu_features working on non-x86 architectures first, so I can test on a CPU that has popcount but isn't x86.

@jdemel added the Enhancement label (new kernel entirely or for some specific ARCH) on Jun 29, 2022