Changing __repr__ in torchao to show quantized Linear #34202

Merged (8 commits) on Nov 5, 2024

Conversation

@MekkCyber (Contributor):

What does this PR do?

When a model is quantized with TorchAO and then loaded, its Linear layers are expected to have a different representation than the standard one. This pull request (PR) changes the representation of these Linear layers to match the format used in TorchAO's implementation: https://github.com/pytorch/ao/blob/main/torchao/quantization/quant_api.py

Before:
Linear(in_features=4096, out_features=4096, bias=False)
After:

Linear(in_features=4096, out_features=4096, weight=AffineQuantizedTensor(shape=torch.Size([4096, 4096]), block_size=(1, 128), device=cuda:0, layout_type=TensorCoreTiledLayoutType(inner_k_tiles=8), layout_tensor_dtype=torch.int32, quant_min=0, quant_max=15))
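To illustrate the mechanism (a minimal sketch, not the exact code in this PR): torchao swaps the Linear module's weight for one of its quantized tensor subclasses, so the module's extra_repr can be rebound to describe that weight instead of the default bias flag. The helper names below follow the pattern in torchao's quant_api.py linked above, but the detection is simplified to a generic type check and a repr() call; the real helpers special-case torchao's tensor subclasses (AffineQuantizedTensor, and the LinearActivationQuantizedTensor mentioned in the commit list further down).

    import types

    import torch
    from torch import nn


    def _quantization_type(weight):
        # Plain (unquantized) weights are torch.Tensor / nn.Parameter instances;
        # torchao-quantized weights are tensor subclasses whose repr carries the
        # quantization metadata shown in the "After" example above.
        if type(weight) in (torch.Tensor, nn.Parameter):
            return None
        return repr(weight)


    def _linear_extra_repr(self):
        # extra_repr override: show the quantized-weight description when present,
        # otherwise fall back to the stock Linear representation.
        quant = _quantization_type(self.weight)
        if quant is None:
            return f"in_features={self.in_features}, out_features={self.out_features}, bias={self.bias is not None}"
        return f"in_features={self.in_features}, out_features={self.out_features}, weight={quant}"


    # Rebinding extra_repr on a module makes print(model) pick it up. The layer
    # here is an unquantized stand-in, so the fallback branch is what prints.
    layer = nn.Linear(16, 16, bias=False)
    layer.extra_repr = types.MethodType(_linear_extra_repr, layer)
    print(layer)  # Linear(in_features=16, out_features=16, bias=False)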

Who can review?

cc @SunMarc

@SunMarc (Member) left a comment:

Thanks for figuring out the issue @MekkCyber! Left a few comments.

src/transformers/quantizers/quantizer_torchao.py: two review comment threads (outdated, resolved)
@MekkCyber (Contributor, Author):

cc @SunMarc for review! Thank you!

@SunMarc (Member) left a comment:

LGTM! Thanks for fixing this! Just a nit. Also rebase the PR to fix the CI.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@ArthurZucker (Collaborator) left a comment:

LGTM, quick question on perf!

@@ -46,6 +45,25 @@ def find_parent(model, name):
    return parent


def _quantization_type(weight):
A Collaborator commented:

Do we want to put this on lru_cache? Or is it smart enough to be fast?

@MekkCyber (Contributor, Author) replied:

I would think it's smart enough to be fast, but I will try to do a benchmark to test that.
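For reference, a micro-benchmark along those lines could be as simple as timing repr() with timeit. The model below is an unquantized placeholder; in practice it would be a TorchAO-quantized checkpoint loaded through transformers, so that repr() exercises the new extra_repr path.

    import timeit

    from torch import nn

    # Placeholder model; swap in a TorchAO-quantized model to measure the real cost.
    model = nn.Sequential(*[nn.Linear(256, 256, bias=False) for _ in range(64)])

    runs = 100
    elapsed = timeit.timeit(lambda: repr(model), number=runs)
    print(f"repr(model): {elapsed / runs * 1e3:.3f} ms per call")

One caveat with functools.lru_cache here: it would hold strong references to the weight tensors used as cache keys, so timing the uncached helper first is the simpler check.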

@ArthurZucker (Collaborator):

We can merge in the meantime 🤗

@MekkCyber merged commit d2bae7e into huggingface:main on Nov 5, 2024; 22 of 23 checks passed.
2015aroras pushed a commit to 2015aroras/transformers that referenced this pull request on Nov 15, 2024:

* Changing __repr__ in torchao

* small update

* make style

* small update

* add LinearActivationQuantizedTensor

* remove some cases

* update imports & handle return None

* update
BernardZach pushed a commit to BernardZach/transformers that referenced this pull request on Dec 5, 2024 (same commit message as above).
BernardZach pushed a commit to innovationcore/transformers that referenced this pull request on Dec 6, 2024 (same commit message as above).