Currently this is handled at the lowest level of integer quantization, but I would argue it's not very intuitive.
When a quantizer is called, it should always quantize. The proxy then decides whether to return the original float value (having still called the quantizer, so statistics are accumulated and gradients computed) or the quantized value.
Pro:
* Clearer behaviour of the quantizer (this will come in handy later when dealing with real quantization, as opposed to fake quantization)
* Support for `DelayWrapper` for all `QuantDtype` in one go
* The proxy remains the ultimate controller over QuantTensor vs Tensor (rather than the quantizer)

Cons:
* The proxy might get messy in the future; something to keep track of
* Performs a small, useless computation for a few iterations (while the delay is active)
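To make the proposal concrete, here is a minimal, self-contained sketch in plain Python (no Brevitas or PyTorch imports; all class and attribute names are illustrative, not the actual Brevitas API). The quantizer always quantizes and always updates its statistics; a proxy-owned delay decides whether the float or the quantized value is returned.

```python
class TrackingQuantizer:
    """Always quantizes and always updates its statistics,
    regardless of what the caller decides to return."""
    def __init__(self, scale=0.25):
        self.scale = scale
        self.num_calls = 0  # stands in for accumulated statistics

    def __call__(self, x):
        self.num_calls += 1  # stats accumulate even during the delay
        return round(x / self.scale) * self.scale


class DelayWrapperSketch:
    """Return the float input for the first `quant_delay_steps` calls,
    then the quantized value thereafter."""
    def __init__(self, quant_delay_steps):
        self.quant_delay_steps = quant_delay_steps
        self.step = 0

    def __call__(self, x, x_quant):
        out = x if self.step < self.quant_delay_steps else x_quant
        self.step += 1
        return out


class ProxySketch:
    """Proxy-level control: the quantizer is always called,
    the delay only picks which value flows onward."""
    def __init__(self, quant_delay_steps):
        self.quantizer = TrackingQuantizer()
        self.delay = DelayWrapperSketch(quant_delay_steps)

    def forward(self, x):
        x_quant = self.quantizer(x)  # always quantize
        return self.delay(x, x_quant)
```

With `quant_delay_steps=2`, the first two calls return the float input unchanged while the quantizer still runs underneath; from the third call on, the quantized value is returned.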
Hello,
Thanks for offering!
Although the general idea is relatively clear in my mind, there are some technical details about the implementation that I still need to figure out and discuss with @nickfraser. We'll keep this thread up-to-date when there is news.
Related to Xilinx#1023
Move `DelayWrapper` logic to Proxy classes.
* Add `DelayWrapper` instantiation in the `WeightQuantProxyFromInjectorBase` class in `src/brevitas/proxy/parameter_quant.py`.
* Modify the `forward` method in `WeightQuantProxyFromInjectorBase` to use `DelayWrapper` to decide the return value.
* Remove `DelayWrapper` instantiation and usage from the `IntQuant` and `DecoupledIntQuant` classes in `src/brevitas/core/quant/int_base.py`.
* Add tests in `tests/brevitas/proxy/test_proxy.py` to ensure the new behavior of `DelayWrapper` in the proxy classes.
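The second bullet above is the core of the change. As a hedged sketch of the shape the modified `forward` might take (the real class lives in `src/brevitas/proxy/parameter_quant.py`; everything here other than the `DelayWrapper` concept is a placeholder, not the actual signature):

```python
class _Delay:
    """Placeholder for DelayWrapper: float output for `steps` calls,
    quantized output afterwards."""
    def __init__(self, steps):
        self.steps = steps
        self.count = 0

    def __call__(self, x, x_quant):
        out = x if self.count < self.steps else x_quant
        self.count += 1
        return out


class WeightQuantProxySketch:
    """Stand-in for WeightQuantProxyFromInjectorBase after the move:
    the delay is instantiated here, not inside IntQuant."""
    def __init__(self, tensor_quant, quant_delay_steps=0):
        self.tensor_quant = tensor_quant        # now always quantizes
        self.delay_wrapper = _Delay(quant_delay_steps)

    def forward(self, x):
        x_quant = self.tensor_quant(x)
        # The proxy, not IntQuant/DecoupledIntQuant, decides the return value.
        return self.delay_wrapper(x, x_quant)
```

A test along the lines of the last bullet would then assert that the proxy returns the float input for exactly `quant_delay_steps` calls and the quantized value afterwards.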