Commit 63a6adf

Updated README and removed NT flag from ICX config
1 parent e1b7225 commit 63a6adf

2 files changed: 11 additions, 1 deletion

README.md

Lines changed: 10 additions & 0 deletions
@@ -289,3 +289,13 @@ make plot_dataset
 
 The script also generates a combined plot with bandwidths from all the kernels
 into one plot.
+
+## Caveats
+A few known caveats for the user, based on experience with the compilers.
+
+- Intel oneAPI DPC++/C++ Compiler 2023.2.0 (icx/icpx compiler):
+  - NonTemporal Stores (aka Streaming Stores): We leave the choice to the user whether to use NT stores or not.
+  - If the user wants to use NT stores via the `-qopt-streaming-stores=always` compiler flag, then the user has to avoid using the `-ffreestanding` compiler flag. With `-ffreestanding` set, the compiler does not generate NT instructions; it generates calls to `__libirc_nontemporal_store@PLT` in the assembly instead.
+  - For the Throughput mode with OpenMP, the icx/icpx compiler does not respect the `nontemporal()` clause with the OpenMP `simd` directive.
+
+It's recommended not to use NT stores if the user wants to observe the cache hierarchy when using the Sequential or Throughput mode.
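
For readers unfamiliar with the `nontemporal()` clause mentioned in the README text above, the following is a minimal sketch of a loop that uses it. The kernel name `copy_nt` and its signature are hypothetical and are not taken from the repository. The clause is only a hint: as the caveat notes, a compiler may ignore it, and without OpenMP SIMD support enabled the pragma has no effect at all.

```c
#include <stddef.h>

/* Hypothetical copy kernel for illustration only; not the repository's code.
 * Copies n doubles from b to a. */
static void copy_nt(double *restrict a, const double *restrict b, size_t n)
{
    /* OpenMP 5.0 hint: stores to a are not expected to be reused soon,
     * so the compiler may emit non-temporal (streaming) stores.
     * The compiler is free to ignore this clause. */
    #pragma omp simd nontemporal(a)
    for (size_t i = 0; i < n; i++) {
        a[i] = b[i];
    }
}
```

Whether NT stores were actually generated can only be confirmed by inspecting the assembly (for example, compiling with `-S` and looking for x86 streaming-store instructions such as `vmovntpd`, as opposed to a call to `__libirc_nontemporal_store@PLT`).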

mk/include_ICX.mk

Lines changed: 1 addition & 1 deletion
@@ -12,7 +12,7 @@ FAST_WORKAROUND = -O3 -static -fp-model=fast
 endif
 
 VERSION = --version
-CFLAGS = $(FAST_WORKAROUND) -xHost -qopt-streaming-stores=always -std=c99 -Wno-unused-command-line-argument -ffreestanding $(OPENMP)
+CFLAGS = $(FAST_WORKAROUND) -xHost -std=c99 -Wno-unused-command-line-argument -ffreestanding $(OPENMP)
 LFLAGS = $(OPENMP)
 DEFINES = -D_GNU_SOURCE
 INCLUDES =
