Commit 567dfb5

Fomenko, Evarist M authored and tprimak committed
cpu: reorder: tentatively turn ref direct copy code for gcc
Rationale: jitted code is typically faster than reference code compiled with an old GCC (4.8.3). However, jitted code takes significant time to create, so if a reorder is always created immediately before it is executed, the jitted code might end up slower overall than the simple reference code. This commit is tentative: the Intel MKL-DNN team needs to find a way to make jitting less expensive, especially for auxiliary yet widely used primitives such as direct copy and other reorders. (cherry picked from commit 44b09b8)
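The trade-off in the rationale can be sketched as a simple cost model: total time is one creation plus some number of executions. The sketch below is illustrative only. `ImplCost`, `total_cost_us`, and `break_even_runs` are hypothetical names that do not exist in MKL-DNN, and any timings passed in are placeholders, not measurements.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical cost description of one reorder implementation.
// Values are placeholders supplied by the caller, not measured data.
struct ImplCost {
    double create_us;   // one-time creation cost (includes jitting, if any)
    double execute_us;  // cost of a single execution
};

// Total time when the primitive is created once and executed `runs` times.
inline double total_cost_us(const ImplCost &c, long runs) {
    assert(runs >= 0);
    return c.create_us + c.execute_us * static_cast<double>(runs);
}

// Smallest number of executions per creation at which `jit` becomes
// strictly cheaper than `ref`; returns -1 if that never happens.
inline long break_even_runs(const ImplCost &jit, const ImplCost &ref) {
    const double extra_create = jit.create_us - ref.create_us;
    const double gain_per_run = ref.execute_us - jit.execute_us;
    if (gain_per_run <= 0.0) {
        // jit is not faster per run: it only wins if creation is cheaper.
        return extra_create < 0.0 ? 0 : -1;
    }
    if (extra_create < 0.0) return 0;  // cheaper to create and to run
    // Need runs * gain_per_run > extra_create.
    return static_cast<long>(std::floor(extra_create / gain_per_run)) + 1;
}
```

With placeholder costs such as `ImplCost{100.0, 1.0}` for jitted code and `ImplCost{0.0, 3.0}` for reference code, a single execution per creation favors the reference code, which is the create-then-run-once scenario the commit message describes.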
1 parent 830a100 commit 567dfb5

1 file changed: +9 -1 lines changed

src/cpu/cpu_reorder.cpp

Lines changed: 9 additions & 1 deletion
@@ -50,9 +50,17 @@ static const rpd_create_f cpu_reorder_impl_list[] = {
     wino_reorder_t<f32, f32>::pd_t::create,
     wino_reorder_t<f32, s8>::pd_t::create,
 
+#if defined(__INTEL_COMPILER) || (defined(__GNUC__) && !defined(__clang__))
+    /* Direct copy for icc which is faster than jitted code;
+     * Direct copy for gcc which might or might not be faster than jitted
+     * code, but still worth it because doesn't require jitting, i.e. much
+     * faster creation time. This is tentative solution and should be removed
+     * later (when we will cache jitted code?...). */
+    REG_SR_DIRECT_COPY(f32, f32),
+#endif
+
 #ifdef __INTEL_COMPILER
     /* direct copy for icc, which is faster than jitted code */
-    REG_SR_DIRECT_COPY(f32, f32),
     REG_SR_DIRECT_COPY(f32, s32),
     REG_SR_DIRECT_COPY(f32, s8),
     REG_SR_DIRECT_COPY(f32, u8),
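
The effect of the compiler guard can be illustrated with a minimal, self-contained sketch. It assumes the implementation list is scanned in order and the first entry that accepts the request wins; that dispatch rule is an assumption here, not something this diff shows. All names (`ReorderRequest`, `create_fn`, `impl_list`, `pick_impl`) are hypothetical stand-ins, not MKL-DNN API. The `!defined(__clang__)` check is needed because clang also defines `__GNUC__`.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical request: source and destination data types of a reorder.
struct ReorderRequest {
    std::string src_type;
    std::string dst_type;
};

// Hypothetical "create" hook: returns the implementation name if it
// supports the request, or nullptr if it does not.
using create_fn = const char *(*)(const ReorderRequest &);

inline const char *create_direct_copy_f32_f32(const ReorderRequest &r) {
    return (r.src_type == "f32" && r.dst_type == "f32") ? "ref_direct_copy"
                                                        : nullptr;
}

// Stand-in for a jitted implementation that accepts every request.
inline const char *create_jit_generic(const ReorderRequest &) {
    return "jit";
}

static const create_fn impl_list[] = {
#if defined(__INTEL_COMPILER) || (defined(__GNUC__) && !defined(__clang__))
    // Registered only for icc and genuine gcc, ahead of the jitted entry.
    create_direct_copy_f32_f32,
#endif
    create_jit_generic,
    nullptr,  // terminator
};

// Scan the list in order; the first entry that accepts the request wins.
inline const char *pick_impl(const ReorderRequest &r) {
    assert(!r.src_type.empty() && !r.dst_type.empty());
    for (std::size_t i = 0; impl_list[i] != nullptr; ++i) {
        if (const char *name = impl_list[i](r)) return name;
    }
    return nullptr;
}
```

Under these assumptions, an f32 to f32 request resolves to the reference direct copy when built with icc or gcc and to the jitted stand-in when built with clang, while other type pairs fall through to the later entries either way.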
