[BUG] Issue processing NF4 double quantization #183

ishan-modi · 2025-04-22T17:08:29Z

Describe the bug

I am trying to use NF4 real quantization and hit an error because the number of scales is not divisible by the block size. To mitigate this, we should pad the scales so that they can be quantized with block quantization. This can be done by calling the reduce_block_padding function inside the double_quantization function.
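To illustrate the failure mode and the padding idea, here is a minimal pure-Python sketch. It is not ModelOpt's implementation of reduce_block_padding: the helpers pad_to_block_multiple and split_into_blocks, the zero pad value, the scale count of 300, and the block size of 256 are all hypothetical, chosen only to show the arithmetic.

```python
def pad_to_block_multiple(values, block_size, pad_value=0.0):
    """Hypothetical helper: pad a flat list so its length is a multiple of block_size.

    Returns the padded copy and the number of elements appended.
    """
    remainder = len(values) % block_size
    if remainder == 0:
        return list(values), 0
    pad = block_size - remainder
    return list(values) + [pad_value] * pad, pad


def split_into_blocks(values, block_size):
    """Hypothetical helper: split a flat list into equal blocks.

    Raises ValueError when the length is not divisible by block_size,
    which mirrors the kind of error described in this issue.
    """
    if len(values) % block_size != 0:
        raise ValueError(
            f"{len(values)} values are not divisible by block size {block_size}"
        )
    return [values[i:i + block_size] for i in range(0, len(values), block_size)]


# Assumed example values: 300 scales and a scale block size of 256.
scales = [0.5] * 300
scale_block_size = 256

# Without padding, 300 % 256 != 0, so block-wise processing fails.
try:
    split_into_blocks(scales, scale_block_size)
    unpadded_ok = True
except ValueError:
    unpadded_ok = False

# With padding, the scales split cleanly into whole blocks.
padded, pad = pad_to_block_multiple(scales, scale_block_size)
blocks = split_into_blocks(padded, scale_block_size)
```

Any real fix would also need to drop the padded entries again after dequantization so that the scales return to their original length.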

Suggested change

scales = reduce_block_padding(
    scales.view(-1), block_sizes={-1: scale_block_size}
)

in the following line of code

Version

nvidia_modelopt == 0.27.1

meenchen · 2025-04-29T18:21:06Z

Thanks for the suggestion, @ishan-modi! I will make this update.

ishan-modi · 2025-05-20T04:12:47Z

@meenchen, any updates on this?

ishan-modi · 2025-06-08T11:30:44Z

@kevalmorabia97 @meenchen, when do we plan to add this?

ishan-modi added the bug Something isn't working label Apr 22, 2025

kevalmorabia97 assigned meenchen Apr 22, 2025

ishan-modi mentioned this issue Apr 25, 2025

[Quantization] Add TRT-ModelOpt as a Backend huggingface/diffusers#11173

Open

ishan-modi linked a pull request Jun 8, 2025 that will close this issue

[BUG] Issue processing NF4 double quantization #209

Open
