Alternative approach to support torch.compile #1006
Conversation
This needs tests!
self.max_clamp = max_int(module.is_signed, module.is_narrow_range, self.bit_width)

def quantize(self, x):
    # Clamp the scaled, rounded input to the cached integer range
    # (the clamp arguments are completed from context; the exact expression is assumed).
    return torch.clamp(torch.round(x / self.scale + self.zero_point), self.min_clamp, self.max_clamp)
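For context, here is a minimal standalone sketch of the clamp-based integer quantization the handler above caches. The `max_int`/`min_int` helpers are simplified stand-ins for Brevitas' own functions, and the scale/zero-point parameter names are illustrative assumptions rather than the handler's exact attributes; the point is that everything operates on plain tensors, which is what lets torch.compile trace it.

import torch

def max_int(signed: bool, narrow_range: bool, bit_width: int) -> int:
    # Simplified stand-in for Brevitas' max_int helper (assumption).
    if signed:
        return 2 ** (bit_width - 1) - 1
    return 2 ** bit_width - 2 if narrow_range else 2 ** bit_width - 1

def min_int(signed: bool, narrow_range: bool, bit_width: int) -> int:
    # Simplified stand-in for Brevitas' min_int helper (assumption).
    if not signed:
        return 0
    return -(2 ** (bit_width - 1)) + 1 if narrow_range else -(2 ** (bit_width - 1))

def quantize(x: torch.Tensor, scale: torch.Tensor, zero_point: torch.Tensor,
             signed: bool = True, narrow_range: bool = False, bit_width: int = 8) -> torch.Tensor:
    # Affine quantization: scale, shift, round, then clamp to the integer range.
    lo = min_int(signed, narrow_range, bit_width)
    hi = max_int(signed, narrow_range, bit_width)
    return torch.clamp(torch.round(x / scale + zero_point), lo, hi)

def dequantize(q: torch.Tensor, scale: torch.Tensor, zero_point: torch.Tensor) -> torch.Tensor:
    # Map the clamped integers back to the floating-point domain.
    return (q - zero_point) * scale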
It looks like these won't work with Groupwise quantization, correct? So inference_mode + MX won't work?
You're right, I forgot to add the export handler for MX INT and MX Float
Postponed to another update
LGTM!
This works by assuming that most of the quantization process has already taken place, so QuantTensors no longer need to be propagated.
Typical usage:
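A sketch of what that usage might look like, assuming the context manager introduced by this PR is exposed as quant_inference_mode under brevitas.export.inference (the name and import path are assumptions, and QuantLinear is used only as a minimal example model):

import torch
from brevitas.nn import QuantLinear
from brevitas.export.inference import quant_inference_mode  # assumed import path

# Minimal quantized module standing in for a trained/calibrated Brevitas model.
model = QuantLinear(16, 8, bias=True)
inp = torch.randn(4, 16)

model.eval()
with torch.no_grad(), quant_inference_mode(model):
    # The first call caches scales, zero-points and clamp bounds as plain tensors,
    # so QuantTensors no longer need to be propagated through the graph.
    model(inp)
    # The model can then be compiled and run like a regular float model.
    compiled_model = torch.compile(model)
    out = compiled_model(inp)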