[quantize_pt2e] Support batched statistics #3576

@daniil-lyakhov

Description

🚀 Feature request

External quantizers can mark a multiply node as quantizable (see #3558). NNCF currently hardcodes the weight inputs of such nodes via the weight_port_id parameter in the metatypes, so quantization logic leaks into the metatypes.

The task is to find a way to correctly identify the weight of a node without a hardcoded parameter. For example, NNCF should identify which kind of multiply is present: activations x activations, weights x activations, or activations x weights.
Batched statistics should be enabled and tested for custom quantizers (such as x86).
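
One possible way to tell the three multiply cases apart is to look at where each input comes from instead of reading a hardcoded port. The sketch below is only an illustration under stated assumptions: `Node`, `is_constant_subgraph`, `weight_port_ids` and `classify_multiply` are hypothetical names, not part of NNCF or torch.fx, and the rule "an input is a weight if its whole producer subgraph consists of constants" is an assumption, not the agreed design.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical stand-in for a graph node; not an NNCF or torch.fx class.
    name: str
    op: str  # "input", "constant", or an operation name such as "mul"
    inputs: list[Node] = field(default_factory=list)


def is_constant_subgraph(node: Node) -> bool:
    """Return True if the node's value depends only on constants."""
    if node.op == "constant":
        return True
    if node.op == "input":
        return False
    # An operation is constant only if it has inputs and all of them are constant.
    return bool(node.inputs) and all(is_constant_subgraph(i) for i in node.inputs)


def weight_port_ids(node: Node) -> list[int]:
    """Infer weight ports from the graph instead of a hardcoded parameter."""
    return [port for port, inp in enumerate(node.inputs) if is_constant_subgraph(inp)]


def classify_multiply(node: Node) -> str:
    """Describe a multiply as e.g. 'weights x activations'."""
    kinds = ["weights" if is_constant_subgraph(i) else "activations" for i in node.inputs]
    return " x ".join(kinds)


# Usage: a weight that passes through a transpose is still treated as a weight.
x = Node("x", "input")
w = Node("w", "constant")
w_t = Node("w_t", "transpose", [w])
mul = Node("mul", "mul", [w_t, x])
print(classify_multiply(mul), weight_port_ids(mul))
```

With this kind of inference the metatype would not need to carry the port information; whether the same rule is sufficient for real exported graphs (for example, parameters lifted to graph inputs) is left open here.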

Feature Use Case

Quantization with quantize_pt2e + custom quantizers with batch size > 1

Are you going to submit a PR?

  • Yes I'd like to help by submitting a PR!

Metadata

Labels

enhancement (New feature or request)
