Deadlock Error

A deadlock occurs when an L1I cache miss response arrives after all threads in a warp have already completed functionally. I saw this error when running multi-kernel workloads (more kernels = higher chance of hitting the race). The deadlock typically appears at kernel boundaries, when all warps need to exit.

Example discrepancy: all pipeline stages are empty, the L1I MSHR is empty, and the response FIFO is empty, yet an instruction is in the ibuffer and will never be issued.

Race between L1I miss latency and warp completion. Sequence:
Proposed Fix

In decode(), after filling the ibuffer, flush it if the warp is already functional_done(). ibuffer_flush() calls dec_inst_in_pipeline() for each valid entry, keeping the counter balanced. Instructions flushed here would never have executed anyway, since the scheduler skips completed warps.
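To make the counter bookkeeping concrete, here is a minimal toy model of the idea, not the simulator's actual code. The names functional_done(), ibuffer_flush() and dec_inst_in_pipeline() come from the post above. Everything else is an assumption made for illustration: the ToyWarp struct, the two-slot ibuffer, the can_exit() condition, and the choice to bump the counter inside ibuffer_fill().

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Toy stand-in for a warp's front-end state. Hypothetical, NOT the simulator's class.
struct ToyWarp {
    static constexpr std::size_t kIbufferSize = 2;  // assumed ibuffer depth

    std::array<bool, kIbufferSize> ibuffer_valid{};  // true = slot holds a decoded instruction
    int insts_in_pipeline = 0;                       // counter the exit check waits on
    bool all_threads_done = false;                   // set once every thread has exited

    bool functional_done() const { return all_threads_done; }

    void inc_inst_in_pipeline() { ++insts_in_pipeline; }
    void dec_inst_in_pipeline() {
        assert(insts_in_pipeline > 0);  // guard against unbalanced decrements
        --insts_in_pipeline;
    }

    // Assumption: filling a slot also counts the instruction as "in the pipeline".
    void ibuffer_fill(std::size_t slot) {
        ibuffer_valid[slot] = true;
        inc_inst_in_pipeline();
    }

    // Drop every valid entry and decrement the counter once per entry dropped.
    void ibuffer_flush() {
        for (bool& valid : ibuffer_valid) {
            if (valid) {
                dec_inst_in_pipeline();
                valid = false;
            }
        }
    }

    // Assumed exit condition: threads are done AND nothing is still counted in flight.
    bool can_exit() const { return functional_done() && insts_in_pipeline == 0; }
};

// Models decode() handling an L1I response that arrives after the warp finished.
// Returns whether the warp can exit afterwards.
bool late_fetch_lets_warp_exit(bool apply_fix) {
    ToyWarp warp;
    warp.all_threads_done = true;  // warp completed while the miss was outstanding
    warp.ibuffer_fill(0);          // late response is decoded into the ibuffer
    if (apply_fix && warp.functional_done()) {
        warp.ibuffer_flush();      // proposed fix: discard what can never issue
    }
    return warp.can_exit();
}
```

In this model, late_fetch_lets_warp_exit(false) leaves the counter at 1 with nothing left to drain it, which mirrors the hang described above, while late_fetch_lets_warp_exit(true) returns the counter to 0 so the warp can exit.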
Replies: 1 comment 1 reply
sounds like related to this one? #503
sounds like related to this one? #503
Do you think this can be it?