
[WIP] Add Support for msamp FP4 Quantization and Unit Test #203


Open · wants to merge 9 commits into base: main

Conversation


@Mr-Philo Mr-Philo commented May 22, 2025

Work in progress, please do not merge.

This PR adds new features for FP4 quantization to the MS-AMP library.

Working items:

  • Custom FP4 quant function in CUDA
  • Custom Differentiable Gradient Estimation for weight update
  • Forward and Backward pass for simulated FP4 quantization
  • Unit Test for FP4 quantization
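To make the "simulated FP4 quantization" item concrete, here is a hypothetical plain-Python sketch of the core idea: round each value to the nearest representable FP4 (E2M1) number. The actual PR implements this as a CUDA kernel operating on bf16 tensors; the function name and tie-breaking behavior below are assumptions, not the PR's code.

```python
# Non-negative values representable in FP4 E2M1 (no inf/nan encodings);
# the format saturates at a maximum magnitude of 6.
FP4_E2M1_VALUES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_simulated_fp4(x: float) -> float:
    """Round x to the nearest FP4 (E2M1) value, saturating at +/-6."""
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), 6.0)  # saturate at the FP4 max
    # Nearest representable magnitude (ties resolve to the smaller value here;
    # a real kernel might use round-to-nearest-even instead).
    nearest = min(FP4_E2M1_VALUES, key=lambda v: abs(v - mag))
    return sign * nearest
```

In training, such a function is typically applied elementwise to a scaled copy of the weights in the forward pass, while the backward pass treats it as (approximately) the identity or applies a gradient estimator, as the later work items describe.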

)
],
cmdclass={'build_ext': cpp_extension.BuildExtension}
)
Collaborator:

Please add a newline at the end of the file.

Collaborator:

The same for other files

Author:

Added

__nv_bfloat16* output_data = reinterpret_cast<__nv_bfloat16*>(output.data_ptr<at::BFloat16>());

const int threadsPerBlock = HIP_GET_NUM_THREADS(size); // 512
// const int blocks = HIP_GET_BLOCKS(size, threadsPerBlock); // max grid num: HIP_MAX_GRID_NUM = 65535
Collaborator:

Please remove all unused code like this.

Author:

Removed

@@ -15,24 +15,29 @@ class Floating:
qfp_max: dict = {}

@staticmethod
def _get_fp_max(exp, man, inf_existed=True):
def _get_fp_max(exp, man, inf_existed=True, nan_existed=True):
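For context on what `_get_fp_max` computes, the largest finite value of a small floating-point format follows from its exponent/mantissa bit counts and which special encodings (inf, nan) are reserved. The sketch below is illustrative and mirrors only the signature in the diff; it is not the MS-AMP implementation.

```python
def get_fp_max(exp: int, man: int, inf_existed: bool = True,
               nan_existed: bool = True) -> float:
    """Largest finite value of a format with `exp` exponent and `man` mantissa bits."""
    bias = 2 ** (exp - 1) - 1
    if inf_existed:
        # All-ones exponent is reserved for inf/nan: max usable exponent
        # field is 2^exp - 2, with an all-ones mantissa.
        return (2 - 2 ** -man) * 2.0 ** (2 ** exp - 2 - bias)
    if nan_existed:
        # No inf, single nan (all-ones everywhere): the top mantissa code
        # in the top exponent is lost to nan.
        return (2 - 2 * 2 ** -man) * 2.0 ** (2 ** exp - 1 - bias)
    # Neither inf nor nan: every encoding is a finite number.
    return (2 - 2 ** -man) * 2.0 ** (2 ** exp - 1 - bias)
```

This reproduces the familiar values 57344 for FP8 E5M2 (inf reserved), 448 for FP8 E4M3 (no inf, one nan), and 6 for FP4 E2M1 (no inf or nan).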
Collaborator:

It seems we don't need to revise this file, because the parameter nan_existed is never set to False in the new code.

Author:

Reverted

@@ -13,6 +13,14 @@
from msamp.common.tensor import ScalingTensor
from msamp.operators.gemm import Gemm

import os

USE_W_SIMU_FP4 = bool(int(os.getenv('USE_W_SIMU_FP4', 0)))
Collaborator (@yzygitzh, Jul 10, 2025):

Shall we make the naming clearer? For example, use WEIGHT and ACTIVATION instead of W and A, and SIMULATION/SIMULATE instead of SIMU. Also, we need to add the MSAMP_ prefix; otherwise the variable isn't scoped to the MS-AMP project.

Author:

Renamed
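Following the reviewer's naming suggestion, the environment flag might look like the sketch below. The exact renamed variable the PR settled on is not shown in this thread, so the name here is an assumption; the parsing idiom matches the original `bool(int(os.getenv(...)))` line.

```python
import os

def env_flag(name: str) -> bool:
    """Parse an on/off environment variable ('0'/'1') into a bool."""
    return bool(int(os.getenv(name, '0')))

# Hypothetical renamed flag (full words plus MSAMP_ prefix, per review
# feedback); the actual name chosen in the PR may differ.
MSAMP_USE_WEIGHT_SIMULATE_FP4 = env_flag('MSAMP_USE_WEIGHT_SIMULATE_FP4')
```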

Makefile (outdated diff)
@@ -24,4 +24,5 @@ lint: cpplint mdlint
postinstall:
cd msamp/operators/dist_op && bash build.sh && cd -
cd msamp/operators/arithmetic && pip install -v . && cd -
cd msamp/operators/fp4_quant && pip install -v . && cd -
Collaborator:

Please just use quantize or quantization for naming.

Author:

Done

@@ -175,6 +196,9 @@ def backward(ctx, grad_output):
wgrad_qtype,
use_split_accumulator=True,
)
if USE_W_DIFFERENTIABLE_GRADIENT_ESTIMATOR:
scaled_w = ctx.saved_tensors[0]
grad_weight.mul_(FP4_QUANT.apply_DGE_item(scaled_w, k=5.0, power_clamp_max=3.0))
Collaborator:

Let's eliminate the explicit argument assignments for k and power_clamp_max here, since their default values are already provided.

Author:

Eliminated

class FP4_QUANT:
"""FP4 Quantization operator."""
@staticmethod
def apply_DGE_item(
Collaborator:

What does DGE mean here?

Collaborator:

Please add full term in function comment

Author:

Added
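DGE here expands to the "Differentiable Gradient Estimation" item from the PR's work list: the hard quantizer's derivative is zero almost everywhere, so the backward pass scales weight gradients by the derivative of a smooth surrogate instead. The surrogate below (a clamped power-law soft step) is a hypothetical sketch, not the MS-AMP kernel; only the default values k=5.0 and power_clamp_max=3.0 echo the earlier diff.

```python
# Hypothetical sketch of a Differentiable Gradient Estimator (DGE) factor.
# Idea: stand in for the hard quantizer with a smooth surrogate
# f(x) = sign(x) * |x|**(1/k) and use its derivative as the gradient
# correction, clamped because f'(x) -> infinity as x -> 0.

def dge_factor(x: float, k: float = 5.0, power_clamp_max: float = 3.0) -> float:
    """Derivative of the surrogate quantizer at x, clamped to power_clamp_max."""
    if x == 0.0:
        return power_clamp_max
    d = (1.0 / k) * abs(x) ** (1.0 / k - 1.0)
    return min(d, power_clamp_max)
```

In the backward pass shown earlier, such a factor would be applied elementwise, i.e. `grad_weight` is multiplied by `dge_factor(scaled_w)` for each weight element.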



@staticmethod
def quantize_simu_fp4_in_bf16(
Collaborator:

simulate or simulation instead of simu?

Author:

Renamed

Collaborator:

Please add the MIT license header.

Author:

Added
