Releases: Xilinx/brevitas
Release v0.12.0
Breaking Changes
- TruncIntQuant, TruncAvgPool, Trunc QONNX Op changes #1042
Highlights
- New PTQ algorithms (see the full list below)
- New datatype support
  - Hierarchical scales #1038
- Initial `torch.compile` support #1206 - user guide here (see also the sketch after this list)
- YAML-based experiments #1116
- Benchmarking scripts for LLM example #1166
- New operator support
  - Better SDPA quantization support #1090
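As a quick illustration of the `torch.compile` path, here is a minimal sketch (hypothetical usage; the layer shapes and bit-widths are arbitrary placeholders, and the officially supported flow is the one described in the user guide):

```python
import torch
from brevitas.nn import QuantLinear, QuantReLU

# Toy quantized model; 4-bit weights and 8-bit activations are arbitrary choices.
model = torch.nn.Sequential(
    QuantLinear(64, 64, bias=True, weight_bit_width=4),
    QuantReLU(bit_width=8),
).eval()

# Hypothetical: compile the quantized model like any other torch.nn.Module.
compiled = torch.compile(model)
with torch.no_grad():
    out = compiled(torch.randn(1, 64))
```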
What's Changed
- Feat (examples/generative): block-based optimization for GPTQ by @Giuseppe5 in #1046
- Fix (learned_round): disable return QuantTensor during float inference by @pablomlago in #1059
- Bump onnx from 1.15 to 1.17.0 in /requirements by @dependabot in #1069
- Fix (minifloat): correct minifloat computation and tests by @Giuseppe5 in #1067
- Feat (ptq): adding accumulator-aware extensions to GPxQ by @i-colbert in #1060
- Feat: add contributing guidelines by @Giuseppe5 in #1075
- Feat (float): adding new attributes to proxy and quant tensor by @i-colbert in #1072
- Feat (accelerate): improved accelerate compatibility by @Giuseppe5 in #1065
- Fix Transformers tests by @Giuseppe5 in #1081
- Fix (data): updating wikitext2 data utility by @i-colbert in #1080
- Fix (groupwise): correct log, groupdim, and scale computation by @Giuseppe5 in #1071
- Test (mx): add reference impl for MXFloat by @Giuseppe5 in #1068
- Fix (examples/generative): Fixed argument order for `quantize_model` by @nickfraser in #1084
- Feat (export): qonnx minifloat export by @Giuseppe5 in #1070
- Feat (core): use runtime parameter for scale by @Giuseppe5 in #1037
- Fix (per_group): fixing the per_group sym quantizer by @i-colbert in #1089
- Rotation based equalization by @Giuseppe5 in #1061
- Fix (examples/llm): fix for main and README by @Giuseppe5 in #1092
- Fix: correct output scale compute by @Giuseppe5 in #1077
- Fix (ptq/rotation): fix for rotation implementation (#1095) by @Giuseppe5 in #1095
- Fix (scaling)!: clamp to avoid inf/nan in forward/backward by @Giuseppe5 in #1097
- Setup: bump python & torch version by @Giuseppe5 in #1098
- Feat: Per-Row po2 float ocp by @Giuseppe5 in #1102
- Fix LLM tests by @pablomlago in #1088
- Feat (brevitas_examples/llm): remove dependencies from optimum-amd by @Giuseppe5 in #1094
- Feat auto round by @pablomlago in #1064
- Fix (hadamard): remove hadamard loading warning by @Giuseppe5 in #1108
- Hierarchical scales by @Giuseppe5 in #1038
- Improvements to learned round by @Giuseppe5 in #1107
- Feat (brevitas_examples/llm): update README by @Giuseppe5 in #1109
- Fix (gpxq): tensor unpacking and Cholesky stabilization by @i-colbert in #1111
- Feat (llm): adding more quantizers by @i-colbert in #1113
- Feat (llm/learned_round): fast block update by @Giuseppe5 in #1110
- Fix SignSGD docstring by @pablomlago in #1115
- Feat (nn/sdpa): quantization of scaled dot-product attention by @nickfraser in #1090
- Fix (brevitas_examples/llm): scaling_min_val for fp32 by @Giuseppe5 in #1117
- Feat (scaling): no tracked_parameter_list with individual quantizer by @Giuseppe5 in #1112
- Feat (brevitas_examples/llm): select act_eq alpha by @Giuseppe5 in #1121
- Fix llm tests transformers by @pablomlago in #1118
- Fix (float/clamp): Bugfix when unsigned by @nickfraser in #1132
- Feat (brevitas_examples/llm): inference_mode support by @Giuseppe5 in #1129
- Feat (brevitas_examples/llm): correct scale init with CPU offloading by @Giuseppe5 in #1124
- Feat (brevitas_examples/sdxl): inference_mode + compile by @Giuseppe5 in #1133
- Feat (proxy): flag to enable/disable QT return by @Giuseppe5 in #1083
- Feat (examples/llm): Specify experiments via YAML files by @nickfraser in #1116
- test (core/float): Enhanced testing of minifloat formats by @nickfraser in #1136
- Eval harness by @Giuseppe5 in #1131
- Fix: pytree warning by @i-colbert in #1144
- Fix LLM entry point by @i-colbert in #1145
- Fix (scaling/standalone): better switch from runtime stats to param by @Giuseppe5 in #1099
- Fix (proxy): fix groupwise scale/zp caching by @Giuseppe5 in #1137
- Fix (export/inference_mode): correct rounding function by @Giuseppe5 in #1146
- Setup: pin transformers version by @Giuseppe5 in #1150
- Feat (mx): unpadding during dequantization by @Giuseppe5 in #1134
- Feat (brevitas_examples/llm): load from checkpoint by @Giuseppe5 in #1151
- Feat (rotation): equalize across SDPA by @Giuseppe5 in #1149
- Feat (quantization): torch_function based quantization by @Giuseppe5 in #1147
- Setup: bump torch version for LLM tests by @Giuseppe5 in #1154
- Feat (equalize): enable parametrized rotations by @pablomlago in #1148
- Feat (optim): add Cayley SGD optimizer by @pablomlago in #1153
- Setup: update pre-commit python version by @Giuseppe5 in #1158
- Fix (brevitas_examples/llm): remove unnecessary checkpointing by @Giuseppe5 in #1161
- Feat (zero_point): dynamic groupwise zero point by @Giuseppe5 in #1160
- New rotation by @Giuseppe5 in #1159
- Fix (brevitas_examples/llm): equalized module + fx compatibility by @Giuseppe5 in #1164
- Fix (runtime_act): fix negative group_dim handling by @Giuseppe5 in #1157
- Fix (a2q): missing restrict_pre_scaling_impl definition by @Giuseppe5 in #1167
- Feat (equalize): enable rotation matrix optimization by @pablomlago in #1155
- Add FP16 support to ptq_evaluate.py and update README argument list by @hkayann in #1174
- Feat (brevitas_examples/llm): separate KV Cache quantization by @Giuseppe5 in #1165
- Feat (hadamard): support region expansion by @Giuseppe5 in #1178
- Feat (llm): benchmark for llm entrypoint by @pablomlago in #1166
- fix (docs/faq): remove reference to gitter, switch affine quantization to be an example by @nickfraser in #1183
- Fix (brevitas_examples/sdxl): correct import for inference_mode by @Giuseppe5 in #1185
- Feat (gpfq): optimizing with lower diagonal matrix formulation by @i-colbert in #1172
- Feat (brevitas_examples/llm): better dtype selection by @Giuseppe5 in #1186
- Fix (brevitas_examples/sdxl): faster sdxl inference by @Giuseppe5 in #1188
- fix (examples/benchmark): Fix when `run_results.yaml` does not exist. by @nickfraser in #1189
- Feat (example/common): Added groupwise, float scaled OCP option by @nickfraser in #1190
- Fix (examples/llm): default dtype from None to float16 by @pablomlago in #1191
- Fix (utils/torch_utils): ensure gradient propagation through pad_to_dim by @pablomlago in #1194
- Fix (examples/llm): prevent layernorm_to_rmsnorm option when fused_no_fx by @pablomlago in #1192
- Feat (brevitas_examples/sdxl): update mlperf by @Giuseppe5 in https://github.com/Xilinx/bre...
Release v0.11.0
Breaking Changes
- Remove ONNX QOp export (#917)
- QuantTensor cannot have empty metadata fields (e.g., scale, bitwidth, etc.) (#819)
- Bias quantization now requires the specification of bit-width (#839)
- QuantLayers do not expose quant_metadata directly. This is delegated to the proxies (#883)
- QuantDropout has been removed (#861)
- QuantMaxPool has been removed (#858)
Highlights
- Support for OCP/FNUZ FP8 quantization
  - Compatibility with QAT/PTQ, including all current PTQ algorithms implemented (GPTQ, LearnedRound, GPFQ, etc.)
  - Possibility to fully customize the minifloat configuration (i.e., select mantissa/exponent bit-width, exponent bias, etc.)
  - Support for ONNX QDQ export
- Support for OCP MX quantization
  - Compatibility with QAT/PTQ, including all current PTQ algorithms implemented (GPTQ, LearnedRound, GPFQ, etc.)
  - Possibility to fully customize the minifloat configuration (i.e., select mantissa/exponent bit-width, exponent bias, group size, etc.)
- New QuantTensor supports (see the sketch after this list):
  - FloatQuantTensor: supports OCP FP formats and general minifloat quantization
  - GroupwiseQuantTensor: supports OCP MX formats and general groupwise int/minifloat quantization
- Support for channel splitting
- Support for HQO optimization for zero point
- Support for HQO optimization for scale (prototype)
- Improved SDXL entrypoint under brevitas_examples
- Improved LLM entrypoint under brevitas_examples
  - Compatibility with accelerate
- Prototype support for `torch.compile`:
  - Check PR #1006 for an example on how to use it
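For orientation, a minimal sketch of working with a QuantTensor (standard integer case shown; `FloatQuantTensor` and `GroupwiseQuantTensor` expose analogous metadata):

```python
import torch
from brevitas.nn import QuantIdentity

# 8-bit activation quantizer configured to return a QuantTensor.
quant_act = QuantIdentity(bit_width=8, return_quant_tensor=True).eval()

qt = quant_act(torch.randn(2, 16))
# The QuantTensor carries quantization metadata alongside the values;
# per the breaking change above, these fields can no longer be empty.
print(qt.scale, qt.zero_point, qt.bit_width)
print(qt.value.shape)  # dequantized values as a plain torch.Tensor
```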
What's Changed
For a more comprehensive list of changes and fixes, check the list below:
- Enhance: Importing quantized models after bias correction by @costigt-dev in #868
- Fix QCDQDecoupledWeightQuantProxyHandlerMixin return args by @costigt-dev in #870
- Fix - Speech to text: Create an empty json file by @costigt-dev in #871
- Feat (scaling/standalone): flag to retrieve full state dict by @Giuseppe5 in #874
- Notebooks: makes notebooks deterministic and prints output of asserts by @fabianandresgrob in #847
- Fix (proxy): revert value tracer change by @Giuseppe5 in #888
- Fix (proxy): fix for attributes retrieval by @Giuseppe5 in #880
- Feat (notebook): add example for dynamic quantization to ONNX export by @fabianandresgrob in #877
- Fix (gpxq): handling empty tensors with GPxQ and adding unit tests by @i-colbert in #892
- Fix (ptq): expose uint_sym_act flag and fix issue with minifloat sign by @fabianandresgrob in #898
- Feat (minifloat): add support for user specified minifloat format by @fabianandresgrob in #821
- Feat: Add QuantConv3d and QuantConv3dTranspose by @costigt-dev in #805
- Add tutorial examples of per-channel quantization by @OscarSavolainenDR in #867
- Fix (tests): revert pytest pin by @Giuseppe5 in #903
- Remove: Remove original_cat workaround by @costigt-dev in #902
- Infra: Update issue template by @nickfraser in #893
- Pull Request Template by @capnramses in #885
- Fix (core): add return in state_dict by @Giuseppe5 in #910
- Fix (quant_tensor): fix typing and remove unused checks by @Giuseppe5 in #913
- Fix (nn): removed unused caching in adaptive avgpool2d by @Giuseppe5 in #911
- Fix (quant_tensor): remove unused checks by @Giuseppe5 in #918
- Setup: pin ONNX to 1.15 due to ORT incompatibility by @Giuseppe5 in #924
- Feat (examples): add support for Stable Diffusion XL by @Giuseppe5 in #909
- Assert all ptq-common bit widths are positive integers by @OscarSavolainenDR in #931
- Enhance: Quant Tensor Test by @costigt-dev in #894
- Fix (examples/stable_diffusion): README formatting and clarification by @Giuseppe5 in #932
- Fix (examples/ptq): fix for bitwidth check by @Giuseppe5 in #934
- Feat: functionalize QuantTensor by @Giuseppe5 in #878
- Feat (minifloat): cleanup minifloat impl by @Giuseppe5 in #922
- Fix tests in dev by @Giuseppe5 in #939
- Feat (proxy): scale computation delegated to bias proxy by @Giuseppe5 in #938
- Fix (gpxq): adding input quant to process input by @i-colbert in #943
- Fix (quant): propagate device and dtype in subinjector by @Giuseppe5 in #942
- Fix (gpxq): correct variable name by @Giuseppe5 in #944
- Fix (quant_tensor): fix AvgPool functional implementation by @Giuseppe5 in #945
- Feat (quant_tensor): support for dim() and ndim by @Giuseppe5 in #947
- Fix (graph/standardize): correct check for Mean to AvgPool by @Giuseppe5 in #948
- Feat (graph/standardize): default keepdim value by @Giuseppe5 in #950
- Fix bullet formatting in getting started guide by @timkpaine in #952
- Fix (quant/float): correct scaling_impl and float_scaling_impl by @Giuseppe5 in #953
- Fix/remove-numel - Remove numel is zero check from context manager exit method by @costigt-dev in #920
- Feat (examples/ptq): support for dynamic act quant by @Giuseppe5 in #935
- Feat (quant_tensor): support for FloatQuantTensor by @Giuseppe5 in #919
- Fix (examples/llm): Add all rewriters to the list by @nickfraser in #956
- Fix (core/quant/float): use eps to avoid log(0) by @Giuseppe5 in #957
- Fix (test/actions): Excluded `torch==1.9.1`, `platform=macos-latest` tests by @nickfraser in #960
- Adding FP8 weight export by @costigt-dev in #907
- Fix (llm): fix device issue for eval when not using default device by @fabianandresgrob in #949
- Fix (GPFQ): using random projection for speed up/less memory usage by @fabianandresgrob in #964
- Fix (calibrate/minifloat): fix for act calibration by @Giuseppe5 in #966
- Fix (quant/float): restore fix for log(0) by @Giuseppe5 in #968
- Setup: pin numpy version by @Giuseppe5 in #974
- Feat (minifloat): support for FNUZ variants by @Giuseppe5 in #973
- Fix (core/float): add default for float_scaling_impl by @Giuseppe5 in #972
- Feat (graph/equalize): upcast during equalization computation by @Giuseppe5 in #970
- Generative improv by @Giuseppe5 in #965
- Fix (requirements/setuptools): Set maximum requirement for `setuptools` by @nickfraser in #963
- Fix: Typo fix on SDXL command line args by @nickfraser in #976
- Fix (graph/bias_correction): Fix when layer parameters are offloaded to `accelerate` by @nickfraser in #962
- Fix (ptq/bias_correction): remove unnecessary forward pass by @Giuseppe5 in #980
- Fix (export/qonnx): Fixed symbolic kwargs order. by @nickfraser in #988
- Various SDXL quantization fixes by @nickfraser in #977
- Fix (brevitas_examples/sdxl): Various fixes by @Giuseppe5 in #991
- Feat (proxy/parameter_quant): cache quant weights by @Giuseppe5 in #990
- Docs: Added 0.10.3 release note to README. by @nickfraser in #993
- Added some preliminary unit tests to the CNNs 'quantize_model' by @OscarSavolainenDR in #927
- Feat (tests): extended minifloat unit tests by @alexredd99 in #979
- Fix (proxy/runtime_quant): correct handling of mixed type quantization by @Giuseppe5 in #985
- docs (readme): Fixed GH actions badges by @nickfraser in #996
- Feat: Update LLM entry-point ...
Release v0.10.3
What's Changed
- Backport: Fix (export/qonnx): Fixed symbolic kwargs order. (#988) by @nickfraser in #992
- `numpy` version, `onnx` version and maximum `setuptools` version set
Full Changelog: v0.10.2...v0.10.3
Release v0.10.2
What's Changed
- Fix (QuantLayer): make bias for QuantLayer optional by @fabianandresgrob in #846
- Fix (examples/llm): set `group_size` only for groupwise quantization by @nickfraser in #853
- Fix (gpfq): updating input processing and L1-norm constraints for GPFA2Q by @i-colbert in #852
- ImageNet PTQ example fix by @Giuseppe5 in #863
- feat (gen/quantize): Added device flag to `quantize_model` by @nickfraser in #860
- Docs: update README for 0.10.2 release by @Giuseppe5 in #865
Full Changelog: v0.10.1...v0.10.2
Release v0.10.1
Highlights
- A2Q+ support (paper)
- A2Q+ examples with CIFAR10 and Super Resolution
- Support for concatenation equalization for weights and activations
- Support for GPFQ + A2Q L1 Norm bound
- Possibility to explicitly export Q node for weights in QCDQ export
- Support for float16 and bfloat16 for QCDQ export (see the export sketch after this list)
- Support for Dynamic Activation Quantization for ONNX QDQ export
- Support for channel-splitting (paper)
- (Beta) Better compatibility with Huggingface accelerate and optimum
- (Beta) Improved support and testing for minifloat quantization
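As an illustration of the QCDQ export path, a minimal sketch (assuming the `export_onnx_qcdq` helper from `brevitas.export`; the layer configuration and file name are placeholders):

```python
import torch
from brevitas.nn import QuantConv2d
from brevitas.export import export_onnx_qcdq

# Toy 8-bit weight-quantized convolution.
model = QuantConv2d(3, 8, kernel_size=3, weight_bit_width=8).eval()

# Export to ONNX in QCDQ style (QuantizeLinear/Clip/DequantizeLinear nodes).
export_onnx_qcdq(model, torch.randn(1, 3, 32, 32), export_path='quant_conv.onnx')
```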
What's Changed
- Fix (examples/generative): set weight_bit_width in weight_quant by @Giuseppe5 in #783
- Feat (graph/equalize): improvements for llm equalization by @Giuseppe5 in #784
- [graph] Fix typo in class name by @nickfraser in #765
- Fix (graph/equalize): refactor for act equalization by @Giuseppe5 in #787
- [quant_tensor] Updates
__truediv__
behaviour to match "standard fixed point rules" by @nickfraser in #769 - Feat (export): (b)float16 support for qcdq export by @Giuseppe5 in #776
- Feat (ptq): Adding A2Q Upper Bound clipping to GPFQ by @fabianandresgrob in #734
- Extended equalization by @Giuseppe5 in #778
- Better Bfloat16 support by @Giuseppe5 in #777
- Fix (stats): add return statement in state_dict by @Giuseppe5 in #792
- Fix (equalize): improved cat eq checks by @Giuseppe5 in #793
- Fix (export): add CastMixin by @Giuseppe5 in #794
- Dynamic Act Quant support by @Giuseppe5 in #796
- Fix (examples/quantizers): correct dynamic zero point handling by @Giuseppe5 in #806
- Feat (a2q+): improving accumulator-aware weight quantization by @i-colbert in #797
- Feat (a2q+): adding new super resolution models to brevitas_examples by @i-colbert in #811
- Feat (Channel-Splitting): sets up first skeleton for channel-splitting by @fabianandresgrob in #772
- Feat: support for optimum by @Giuseppe5 in #826
- Fix (tests): adding tests for FloatQuant by @fabianandresgrob in #815
- Fix (export): correct q node export by @Giuseppe5 in #829
- Fix (examples/llm): correct groupwise export by @Giuseppe5 in #832
- Fix (examples/super_res): updating README by @i-colbert in #828
- Fix (examples/export): improved export by @Giuseppe5 in #838
- Fix (graph/equalize): cleanup and device management by @Giuseppe5 in #840
- Feat (examples/a2q): adding CIFAR10 example by @i-colbert in #813
- Fix (export): check for Per Group quantization by @Giuseppe5 in #848
Full Changelog: v0.10.0...v0.10.1
A2Q+ CIFAR10 model release
This release contains training code and pre-trained weights to demonstrate accumulator-aware quantization (A2Q) on an image classification task. Code is also provided to demonstrate Euclidean projection-based weight initialization (EP-init) as proposed in our paper "A2Q+: Improving Accumulator-Aware Weight Quantization".
Find the associated docs at https://github.com/Xilinx/brevitas/tree/a2q_cifar10_r1/src/brevitas_examples/imagenet_classification/a2q.
A2Q+ model release
A2Q+ Super Resolution Experiments with Brevitas
This release contains training code and pre-trained weights to demonstrate accumulator-aware quantization (A2Q+) as proposed in our paper "A2Q+: Improving Accumulator-Aware Weight Quantization" on a super resolution task.
Find the associated docs at https://github.com/Xilinx/brevitas/tree/super_res_r2/src/brevitas_examples/super_resolution.
Release v0.10.0
Highlights
- Support for PyTorch up to version 2.1.
- Support for GPTQ PTQ algorithm (see the sketch after this list).
- Support for GPFQ PTQ algorithm.
- Support for SmoothQuant / activation equalization PTQ algorithm.
- Support for MSE based scale and zero-point for weights and activations.
- Support for row-wise scaling at the input of QuantLinear.
- Support for quantization of a slice of a weight tensor.
- End-to-end support for learned rounding in ImageNet PTQ.
- End-to-end example training scripts for A2Q (low precision accumulation) on super resolution.
- Experimental support for minifloats (eXmY quantization).
- Experimental LLM PTQ flow with support for weight-only and weight+activation quantization, together with GPTQ, AWQ and SmoothQuant.
- Experimental Stable Diffusion PTQ flow with support for weight-only quantization.
- Deprecated FINN ONNX export flow.
- Update custom value_trace FX tracer to latest FX.
- New custom variant of make_fx tracer with support for custom torch.library ops through @Wrap annotation.
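A rough sketch of the GPTQ flow, following the context-manager pattern used in the PTQ examples (the module path and attribute names may differ between versions; `calib_loader` is assumed to yield calibration batches):

```python
import torch
from brevitas.graph.gptq import gptq_mode

def apply_gptq(model, calib_loader):
    model.eval()
    with torch.no_grad():
        with gptq_mode(model) as gptq:
            for _ in range(gptq.num_layers):
                for images, _ in calib_loader:
                    gptq.model(images)  # collect statistics layer by layer
                gptq.update()           # apply the GPTQ weight update
    return model
```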
What's Changed
- Feat (nn): cache modules that require subtensor slicing by @volcacius in #628
- Feat: support slicing for gptq by @Giuseppe5 in #626
- Feat: add support to row wise input quantization to QuantLinear by @volcacius in #625
- Fix (nn): disable weight tensor slicing syntax by @volcacius in #633
- Feat (core): add SliceTensor util for sub-weight quant by @volcacius in #634
- Fix (core): add missing dtype and device by @Giuseppe5 in #635
- Feat (ptq): activation equalization support by @Giuseppe5 in #541
- Feat (fx): value_trace improvements by @volcacius in #636
- Fix (core/utils): jit ignore eager mode tensor slicing impl by @volcacius in #637
- Fix (weight_eq): fix for llm equalization by @Giuseppe5 in #638
- Add missing license by @Giuseppe5 in #640
- Feat (ptq): act equalization support for vision by @Giuseppe5 in #643
- Fix (tracer): support for index and no-tracer ops by @Giuseppe5 in #644
- Setup: pin version of inflect for compatibility by @Giuseppe5 in #647
- Activation eq extension by @Giuseppe5 in #642
- Fix (core): correct forward in ParameterFromStatsFromParameter by @Giuseppe5 in #650
- Feat (zero_point): grid search for mse zp by @Giuseppe5 in #651
- Fix (weight_eq): correct handling of layernorm/batchnorm as sink by @Giuseppe5 in #646
- Feat (nn): set dim names in QuantMHA Linear by @volcacius in #629
- Fix (act_quant): flag to enable/disable stats collection by @Giuseppe5 in #641
- Feat (core): add keepdim to min/max/percentile stats by @volcacius in #657
- Fix (ptq): conflicts between gptq and equalization by @volcacius in #656
- Fix (nn): state_dict load for unpacked in_proj in MHA by @volcacius in #654
- Feat (ptq): learned round support in evaluate/benchmark by @Giuseppe5 in #639
- Feat (nn): avoid computing output scale/zp when not needed by @volcacius in #655
- Fix (QuantTensor): pixel_shuffle and unshuffle handler by @volcacius in #663
- Setup: fix installation of libgomp1 by @Giuseppe5 in #662
- Fix (quantize): fix and improvements for fx quantize by @Giuseppe5 in #661
- Fix (resnet18): fixing default weight quantizer for linear layer by @i-colbert in #660
- Fix(gptq): fix for quant convtranspose1d/2d and conv1d by @Giuseppe5 in #665
- Refactor of ptq_common by @Giuseppe5 in #649
- Examples: initial support for LLMs PTQ by @volcacius in #658
- Fix (weight_eq): maintain order of regions by @Giuseppe5 in #667
- Feat (core): simplify binary_sign impl by @volcacius in #672
- Feat (core): add permute_dims to all reshape fns by @volcacius in #671
- Feat (graph/equalize): clean up scale invariant ops by @volcacius in #669
- Misc: fix pre-commit by @volcacius in #676
- Misc: fix another pre-commit by @volcacius in #677
- Feat (examples/llm): initial support for loading AWQ results by @volcacius in #673
- Fix (espcn): updating links to use new tags by @i-colbert in #678
- Fix (ptq): fix for act quantizers by @Giuseppe5 in #675
- Fix (ptq): fix for residual with mha by @Giuseppe5 in #681
- Fix (fx): fix fx quantize for conv->bn by @Giuseppe5 in #680
- Feat (gptq): add option to return output from forward by @Giuseppe5 in #684
- Fix (a2q): correcting post-rounding scaling initialization by @i-colbert in #659
- Feat (quant): initial support for fp8 variants by @volcacius in #686
- Fix (gptq): fix for depthwise act_order by @Giuseppe5 in #688
- Feat (core): support for stochastic round by @volcacius in #689
- Fix (gptq): Caching quant_inp values for quant_weight by @i-colbert in #653
- Feat (gptq): support for groupwise conv by @Giuseppe5 in #690
- Fix (gptq): typo in variable name by @Giuseppe5 in #691
- Rename brevitas quant custom op by @jinchen62 in #693
- Change tolerance for fp16 by @jinchen62 in #694
- Fix (docs): Updating references to A2Q paper by @i-colbert in #698
- Feat (examples/llm): add first/last layer support by @volcacius in #699
- Feat (examples/llm): add packed 3/5/6b export by @volcacius in #700
- Fix (examples/llm): padding for packed 3/5/6b by @volcacius in #701
- Fix (gptq): linalg import fix by @Giuseppe5 in #705
- Examples (a2q): updating and extending ESPCN demo by @i-colbert in #706
- Examples (a2q): adding links for pretrained models by @i-colbert in #707
- Fix (nn): add missing support for padding_mode by @volcacius in #709
- Feat (examples/llm): add custom float support by @volcacius in #708
- GPFQ by @Giuseppe5 in #666
- Feat (ptq): support for float bias by @Giuseppe5 in #713
- Feat (ptq): flag to disable/enable signed activations by @Giuseppe5 in #714
- Support for minifloat benchmark by @Giuseppe5 in #712
- adding quant_format, mantissa, and exponent options to evaluate script by @fabianandresgrob in #717
- Fix (fx): import backport on 2.1 by @volcacius in #732
- Fix (ptq): correct bitwidth for layerwise int benchmark by @Giuseppe5 in #737
- Fix (ptq): fix for ptq_common by @Giuseppe5 in #739
- Fix (examples): adding bias_quant to final linear layer in resnet18 by @i-colbert in #720
- Fix (base): Updating A2Q defaults by @i-colbert in #718
- Fix (core): arithmetic of zero-point with positive only values by @volcacius in #670
- Fix (nn): QuantConv group calculation by @i-colbert in #703
- Feat (QuantTensor): QuantTensor x Tensor elementary ops dequantize to Tensor by @volcacius in #668
- Feat (examples): initial Stable Diffusion support by @volcacius in #715
- changes class_implementation to init_class in gpxq_mode by @fabianandresgrob in #754
- Fix errors in test by @Giuseppe5 in #716
- Fix (notebook): increase atol for asserts by @Giuseppe5 in #759
- Gpfq/act order by @fabianandresgrob in #729
- Fix (backport): op decomp in make_fx backport by @volcacius in #763
- Feat (export): deprecate FINN ONNX export by @Giuseppe5 in https://github.com/Xilinx/brevitas/p...
A2Q model release
Integer-Quantized Super Resolution Experiments with Brevitas
This release contains scripts demonstrating how to train integer-quantized super resolution models using Brevitas.
Code is also provided to demonstrate accumulator-aware quantization (A2Q) as proposed in our ICCV 2023 paper "A2Q: Accumulator-Aware Quantization with Guaranteed Overflow Avoidance".
Find the associated docs at https://github.com/Xilinx/brevitas/tree/super_res_r1/src/brevitas_examples/super_resolution.
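For intuition, here is a back-of-envelope version of the overflow-avoidance constraint behind A2Q, stated as a sufficient condition from worst-case magnitudes rather than the exact bound derived in the paper. With $N$-bit signed inputs, a dot product against integer weights $q$ has worst-case accumulator magnitude $\lVert q \rVert_1 \cdot 2^{N-1}$, so a $P$-bit signed accumulator cannot overflow whenever

$$\lVert q \rVert_1 \cdot 2^{N-1} \le 2^{P-1} - 1 \quad\Longleftrightarrow\quad \lVert q \rVert_1 \le \frac{2^{P-1} - 1}{2^{N-1}}.$$

A2Q enforces an $\ell_1$-norm constraint of this form on the quantized weights during training.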
Release v0.9.1
What's Changed
- Setup: add requirements-dev with pre-commit by @Giuseppe5 in #581
- CI update by @Giuseppe5 in #570
- Fix (brevitas_examples/bnn_pynq): missing 4b resnet18 link and hash fn by @volcacius in #583
- Docs: update READMEs by @Giuseppe5 in #584
Full Changelog: v0.9.0...v0.9.1