Open
Labels
examples (Examples showcasing Iris APIs and usage), iris (Iris project issue)
Description
Summary
Numerical validation fails in the one-shot all-reduce GEMM example when run with fp32 and num_stages=1.
Command to Reproduce
python3 examples/09_gemm_one_shot_all_reduce/benchmark.py --num_stages 1 --validate --datatype fp32
Observed Behavior
- Validation fails on rank 1 only (rank 0 passes)
- Large numerical discrepancies in output tensor C
- Max absolute difference: 328.2
- Example mismatch: C=-33.74 vs expected=-94.16 at index (99, 3394)
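For anyone triaging, the figures above (max absolute difference and a sample mismatching index) can be reproduced from any pair of output and reference arrays with a small NumPy helper. This is a minimal sketch, not the validation code used by benchmark.py. The function name `report_mismatch` and the tolerance values are assumptions chosen for illustration.

```python
import numpy as np

def report_mismatch(actual, expected, atol=1e-2, rtol=1e-2):
    """Summarize element-wise differences between two same-shaped arrays.

    Hypothetical helper: the tolerances are illustrative and are not
    taken from the Iris benchmark.
    """
    # Promote to float64 so the comparison itself adds no rounding error.
    diff = np.abs(actual.astype(np.float64) - expected.astype(np.float64))
    worst = np.unravel_index(np.argmax(diff), diff.shape)
    bad = diff > (atol + rtol * np.abs(expected))
    return {
        "max_abs_diff": float(diff[worst]),
        "worst_index": tuple(int(i) for i in worst),
        "num_mismatched": int(bad.sum()),
        "actual_at_worst": float(actual[worst]),
        "expected_at_worst": float(expected[worst]),
    }

# Tiny synthetic example: a single corrupted element.
expected = np.zeros((4, 8), dtype=np.float32)
actual = expected.copy()
actual[1, 5] = 3.0
report = report_mismatch(actual, expected)
print(report)
```

Running the same helper on each rank's C tensor would show whether the mismatches on rank 1 are isolated elements or cover whole regions of the output.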
Configuration
- world_size=2, M=8192, N=4608, K=36864
- BLK_M=256, BLK_N=64, BLK_K=64
- datatype=fp32, num_stages=1
- Registers: 168, Spills: 0
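To relate the reported mismatch to the tiling, the arithmetic below derives the tile grid from the configuration and locates the output tile containing index (99, 3394). It is a sketch that assumes C is partitioned into BLK_M x BLK_N tiles starting at the origin; the helper `tile_of` is hypothetical and not part of Iris.

```python
def tile_of(row, col, blk_m, blk_n):
    """Return (tile_row, tile_col, offset_in_tile_m, offset_in_tile_n).

    Assumes tiles of size blk_m x blk_n aligned to index (0, 0).
    """
    return row // blk_m, col // blk_n, row % blk_m, col % blk_n

# Values taken from the configuration listed in this issue.
M, N, K = 8192, 4608, 36864
BLK_M, BLK_N, BLK_K = 256, 64, 64

tiles_m = M // BLK_M          # tiles along M
tiles_n = N // BLK_N          # tiles along N
total_tiles = tiles_m * tiles_n
k_iters = K // BLK_K          # K-loop iterations per tile

tile_row, tile_col, off_m, off_n = tile_of(99, 3394, BLK_M, BLK_N)
print(tiles_m, tiles_n, total_tiles, k_iters)
print(tile_row, tile_col, off_m, off_n)
```

Under that assumption the grid is 32 x 72 tiles (2304 in total) with 576 K-iterations each, and the sample mismatch falls in tile row 0, tile column 53, at offset (99, 2) within the tile.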