1 parent 3678702 commit 2f21f02
_posts/2025-02-24-ptpc-fp8-rocm.md
@@ -64,7 +64,7 @@ The illustration shows two quantization approaches:
**Scaling Factors:**

- **Top (Per-Tensor)**: Single scalars ΔX[1] and ΔW[1] for entire tensors
-- **Bottom (PTPC)**: Vector ΔX[T×1] with one scale per token and ΔW[1×Co] with one scale per output channel
+- **Bottom (PTPC)**: Vector ΔX[T×1] with one scale per token and ΔW[1×Co] with one scale per input channel

This granular scaling approach allows PTPC-FP8 to achieve accuracy close to BF16 while maintaining the speed and memory benefits of 8-bit computation.
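To make the difference in scale granularity concrete, here is a minimal PyTorch sketch contrasting the two schemes. It is an illustration only, not code from the post: the names `per_tensor_scales` and `ptpc_scales`, the `FP8_MAX` constant (the float8_e4m3 maximum), and the example tensor shapes are all assumptions made for this sketch.

```python
import torch

FP8_MAX = 448.0  # assumed: max representable value of the float8_e4m3 format

def per_tensor_scales(x: torch.Tensor, w: torch.Tensor):
    """Per-tensor: one scalar scale for the whole activation and weight tensors."""
    dx = x.abs().max() / FP8_MAX                        # ΔX[1]
    dw = w.abs().max() / FP8_MAX                        # ΔW[1]
    return dx, dw

def ptpc_scales(x: torch.Tensor, w: torch.Tensor):
    """PTPC: one scale per activation token (row) and one per weight channel (column)."""
    dx = x.abs().amax(dim=1, keepdim=True) / FP8_MAX    # ΔX[T×1]
    dw = w.abs().amax(dim=0, keepdim=True) / FP8_MAX    # ΔW[1×Co]
    return dx, dw

# Hypothetical shapes: T tokens, Ci input features, Co output features
x = torch.randn(16, 4096)       # activations [T, Ci]
w = torch.randn(4096, 11008)    # weights [Ci, Co]
dx, dw = ptpc_scales(x, w)
print(dx.shape, dw.shape)       # torch.Size([16, 1]) torch.Size([1, 11008])
```

The sketch only computes the scales; in an actual FP8 path each row of `x` and each column of `w` would be divided by its scale before casting to FP8, with the scales reapplied after the matmul.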