Modern compilers are clever; they can recognize some popcount algorithms and replace them with a single popcnt (x86) or cpop (RISC-V) instruction.
    int count(long x)
    {
        int v = 0;
        while (x != 0)
        {
            x &= x - 1;
            v++;
        }
        return v;
    }
clang (with -O3 -march=x86-64-v2) turns this loop into a single popcntq %rdi, %rax.
This would suggest that in order to get good performance, VOLK's popcount implementation ought to be one of the ones that is recognized by popular compilers.
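As a rough sketch of what such a compiler-friendly generic kernel could look like, here is the same loop written for a fixed-width unsigned type, next to a naive bit-by-bit reference for checking it. The names popcount_loop_u64 and popcount_naive_u64 are made up for illustration and are not existing VOLK functions; whether a given compiler version actually collapses the loop into one instruction is an assumption that should be confirmed by looking at the generated assembly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical name, not a VOLK kernel.
 * Clears the lowest set bit on each iteration, so the loop runs
 * once per set bit. This is the same idiom as count() above. */
static inline unsigned popcount_loop_u64(uint64_t x)
{
    unsigned v = 0;
    while (x != 0) {
        x &= x - 1; /* drop the lowest set bit */
        v++;
    }
    return v;
}

/* Reference implementation: test each of the 64 bits individually. */
static inline unsigned popcount_naive_u64(uint64_t x)
{
    unsigned v = 0;
    for (int i = 0; i < 64; i++) {
        v += (unsigned)((x >> i) & 1u);
    }
    return v;
}

/* Small self-check comparing the two on a few inputs. */
static inline void popcount_self_check(void)
{
    const uint64_t samples[] = { 0u, 1u, 0xF0F0u, 0x8000000000000000u, UINT64_MAX };
    for (unsigned i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
        assert(popcount_loop_u64(samples[i]) == popcount_naive_u64(samples[i]));
    }
}
```

Compiling a file containing popcount_loop_u64 with clang -O3 -march=x86-64-v2 -S and reading the output is one way to see whether the idiom was recognized on a given toolchain.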
I'm thinking about creating a pull request for this, but I'm trying to get cpu_features working on non-x86 architectures first, so I can test on a CPU that has popcount but isn't x86.