Open
Description
I'm compiling a simple kernel with Peano. Manually software-pipelining the attached kernel (dut_pipelined.cc) yields a considerable speedup over relying on pipelining pragmas (dut_pragma.cc). Without manual pipelining, the generated assembly is not software-pipelined and the kernel runs in ~1800 cycles. With manual pipelining, the kernel runs in ~1000 cycles. The clang loop min_iteration_count and max_iteration_count pragmas have no effect on the generated assembly.
dut_pragma.cc
dut_pipelined.cc
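For readers without the attachments, here is a rough sketch of the two styles being compared. It is not the attached kernel: the function names, the scale-and-add body, and the `SKETCH_USE_PEANO_PRAGMAS` guard macro are all hypothetical, and the pragma spelling is assumed from the pragma names mentioned above. The first function leaves pipelining to the compiler; the second overlaps the load, compute, and store stages of neighbouring iterations by hand.

```cpp
#include <cstddef>
#include <cstdint>

// Pragma style: a plain loop, with pipelining left to the compiler.
// SKETCH_USE_PEANO_PRAGMAS is a hypothetical guard so the sketch also builds
// with compilers that do not know this loop hint; the pragma spelling itself
// is an assumption based on the names used in this issue.
void scale_add_pragma(const int32_t* in, int32_t* out, std::size_t n,
                      int32_t k, int32_t b) {
#ifdef SKETCH_USE_PEANO_PRAGMAS
#pragma clang loop min_iteration_count(8)
#endif
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = in[i] * k + b;
  }
}

// Manual style: three stages (load, compute, store) overlapped by hand.
// In the steady state, one trip through the loop stores iteration i-2,
// computes iteration i-1, and loads iteration i.
void scale_add_pipelined(const int32_t* in, int32_t* out, std::size_t n,
                         int32_t k, int32_t b) {
  if (n == 0) return;
  if (n == 1) {  // too short to overlap anything
    out[0] = in[0] * k + b;
    return;
  }

  // Prologue: fill the pipeline.
  int32_t loaded = in[0];             // load    iteration 0
  int32_t computed = loaded * k + b;  // compute iteration 0
  loaded = in[1];                     // load    iteration 1

  // Steady state: all three stages are active.
  for (std::size_t i = 2; i < n; ++i) {
    out[i - 2] = computed;      // store   iteration i-2
    computed = loaded * k + b;  // compute iteration i-1
    loaded = in[i];             // load    iteration i
  }

  // Epilogue: drain the pipeline.
  out[n - 2] = computed;      // store   iteration n-2
  computed = loaded * k + b;  // compute iteration n-1
  out[n - 1] = computed;      // store   iteration n-1
}
```

Both functions should produce identical output for any `n`; the manual version only changes the order in which the work is issued, which is the transformation one would hope the pragma version gets from the compiler.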
jamestcl-amd