Commit 2479717

Nicoshev authored and facebook-github-bot committed
Vectorize f16 conversion (#4253)
Summary:
Pull Request resolved: #4253

X-link: facebookresearch/FBGEMM#1330

Use C++ language extension type '__fp16' to let clang vectorize float16 conversion, on aarch64 compilation.

'float16 to float' will now be vectorized
'float to float16' gets improved

before (higher is better):

     M   elements_per_ns_ref   elements_per_ns_simd
     1   0.0302                0.0242
    10   0.205                 0.231
    32   0.328                 0.579
    40   0.354                 0.697
   129   0.442                 1.87
   256   0.466                 1.92
  1024   0.488                 2.38
  8000   0.494                 2.55

after:

     M   elements_per_ns_ref   elements_per_ns_simd
     1   0.0307                0.0253
    10   0.301                 0.256
    32   0.898                 0.793
    40   1.12                  0.967
   129   2.76                  2.59
   256   4.09                  3.7
  1024   5.67                  5.47
  8000   6.3                   6.31

Reviewed By: MatzeB

Differential Revision: D75926678

fbshipit-source-id: 1f42c219c9ea67d854fb0296aa3ee07f1d174bc3
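The 'float16 to float' direction described in the summary can be sketched as a plain scalar loop. This is a minimal illustration, not FBGEMM's actual code: the names `float16_bits`, `half_bits_to_float`, and `half_to_float_loop` are hypothetical, and the assumption is that on aarch64 a cast through `__fp16` gives the compiler a native conversion it can vectorize, while other targets use a portable bit-manipulation fallback.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical alias: raw IEEE 754 binary16 storage (1 sign, 5 exponent, 10 mantissa bits).
using float16_bits = std::uint16_t;

// Convert one binary16 value to float.
inline float half_bits_to_float(float16_bits h) {
#if defined(__aarch64__)
  // On aarch64 the __fp16 extension type maps to a hardware conversion.
  __fp16 v;
  std::memcpy(&v, &h, sizeof(v));
  return static_cast<float>(v);
#else
  // Portable fallback: rebuild the binary32 bit pattern by hand.
  const std::uint32_t sign = static_cast<std::uint32_t>(h & 0x8000u) << 16;
  std::uint32_t exp = (h >> 10) & 0x1Fu;
  std::uint32_t man = h & 0x3FFu;
  std::uint32_t bits;
  if (exp == 0x1Fu) {  // Inf or NaN
    bits = sign | 0x7F800000u | (man << 13);
  } else if (exp != 0) {  // normal number: rebias exponent from 15 to 127
    bits = sign | ((exp + 112u) << 23) | (man << 13);
  } else if (man == 0) {  // signed zero
    bits = sign;
  } else {  // subnormal: shift until the implicit leading bit appears
    exp = 113;
    while ((man & 0x400u) == 0) {
      man <<= 1;
      --exp;
    }
    bits = sign | (exp << 23) | ((man & 0x3FFu) << 13);
  }
  float f;
  std::memcpy(&f, &bits, sizeof(f));
  return f;
#endif
}

// Simple element-wise loop; with a native scalar conversion the compiler
// is free to auto-vectorize it.
void half_to_float_loop(const float16_bits* src, float* dst, std::size_t size) {
  for (std::size_t i = 0; i < size; ++i) {
    dst[i] = half_bits_to_float(src[i]);
  }
}
```

Whether a given compiler actually vectorizes such a loop depends on the target and optimization flags, so the benchmark figures above should not be read as a property of this sketch.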
1 parent 792e68a commit 2479717

2 files changed: +2 -6 lines changed

src/FbgemmFloat16Convert.cc

Lines changed: 0 additions & 4 deletions
@@ -43,10 +43,6 @@ void FloatToFloat16_simd(
     FloatToFloat16_avx512(src, dst, size, do_clip);
   } else if (fbgemmHasAvx2Support()) {
     FloatToFloat16_avx2(src, dst, size, do_clip);
-#ifdef __aarch64__
-  } else if (fbgemmHasArmSve2Support()) {
-    FloatToFloat16_sve2(src, dst, size, do_clip);
-#endif
   } else {
     FloatToFloat16_ref(src, dst, size, do_clip);
     return;

src/RefImplementations.cc

Lines changed: 2 additions & 2 deletions
@@ -91,11 +91,11 @@ void FloatToFloat16_ref(
   if (do_clip) {
     for (size_t i = 0; i < size; i++) {
       float cur_src = std::max(-FP16_MAX, std::min(src[i], FP16_MAX));
-      dst[i] = cpu_float2half_rn(cur_src);
+      dst[i] = cpu_float2half(cur_src);
     }
   } else {
     for (size_t i = 0; i < size; i++) {
-      dst[i] = cpu_float2half_rn(src[i]);
+      dst[i] = cpu_float2half(src[i]);
     }
   }
 }
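The clip-then-convert pattern in the hunk above can be illustrated with a standalone sketch. The names `float_to_half_bits`, `float_to_half_sketch`, and `kFp16Max` are hypothetical stand-ins, not FBGEMM's `cpu_float2half` or `FP16_MAX`. The assumptions here are that the clip bound is 65504, the largest finite binary16 value, and that the scalar helper becomes a native `__fp16` cast on aarch64. The non-aarch64 fallback uses a well-known round-to-nearest-even bit trick and relies on the default floating-point rounding mode.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Largest finite IEEE 754 binary16 value (assumed to match the clip bound).
constexpr float kFp16Max = 65504.0f;

// Convert one float to raw binary16 bits, rounding to nearest even.
inline std::uint16_t float_to_half_bits(float f) {
#if defined(__aarch64__)
  // Native conversion through the __fp16 extension type.
  __fp16 h = static_cast<__fp16>(f);
  std::uint16_t out;
  std::memcpy(&out, &h, sizeof(out));
  return out;
#else
  std::uint32_t x;
  std::memcpy(&x, &f, sizeof(x));
  const std::uint32_t sign = (x >> 16) & 0x8000u;
  x &= 0x7FFFFFFFu;
  std::uint32_t half;
  if (x >= 0x47800000u) {
    // Magnitude >= 65536, Inf, or NaN: map to Inf or a quiet NaN.
    half = (x > 0x7F800000u) ? 0x7E00u : 0x7C00u;
  } else if (x < 0x38800000u) {
    // Below 2^-14: result is subnormal or zero. Adding 0.5f lets the
    // FPU perform the rounding in the low mantissa bits.
    float a;
    std::memcpy(&a, &x, sizeof(a));
    a += 0.5f;
    std::uint32_t r;
    std::memcpy(&r, &a, sizeof(r));
    half = r - 0x3F000000u;
  } else {
    // Normal range: rebias the exponent and round to nearest even.
    const std::uint32_t mant_odd = (x >> 13) & 1u;
    x += 0xC8000000u + 0xFFFu;  // (15 - 127) << 23 as unsigned, plus rounding bias
    x += mant_odd;
    half = x >> 13;
  }
  return static_cast<std::uint16_t>(sign | half);
#endif
}

// The do_clip branch stays outside the loops, as in the hunk above,
// so each loop body is branch-free.
void float_to_half_sketch(const float* src, std::uint16_t* dst,
                          std::size_t size, bool do_clip) {
  if (do_clip) {
    for (std::size_t i = 0; i < size; ++i) {
      const float cur = std::max(-kFp16Max, std::min(src[i], kFp16Max));
      dst[i] = float_to_half_bits(cur);
    }
  } else {
    for (std::size_t i = 0; i < size; ++i) {
      dst[i] = float_to_half_bits(src[i]);
    }
  }
}
```

With clipping enabled, out-of-range inputs saturate to the largest finite half value; without it, they overflow to infinity.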
