This script converts an OPT or Bloom 🤗 Transformers model to a "smoothed" version, as described in *SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models*.
```bash
$ python smoothquant.py --model facebook/opt-1.3b --save-path smoothed-models/facebook/opt-1.3b
```
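The conversion follows the SmoothQuant recipe: per-channel activation outliers are migrated into the weights of the following linear layer, with the inverse scale folded into the preceding layer norm so the network's outputs are unchanged. Below is a minimal sketch of that step, assuming pre-collected per-channel activation maxima; the function and argument names (`smooth_linear`, `act_scales`, `alpha`) are illustrative and not the script's actual API.

```python
import torch

def smooth_linear(ln_weight, ln_bias, fc_weight, act_scales, alpha=0.5):
    """Sketch of one SmoothQuant smoothing step for a LayerNorm -> Linear pair.

    act_scales: per-input-channel max absolute activation values (shape [in_features]),
    collected on a small calibration set beforehand.
    """
    # Per-input-channel weight magnitudes of the following linear layer
    # (fc_weight has shape [out_features, in_features]).
    weight_scales = fc_weight.abs().max(dim=0).values.clamp(min=1e-5)
    # Migration factor s_j = max|X_j|^alpha / max|W_j|^(1 - alpha).
    scales = (act_scales.pow(alpha) / weight_scales.pow(1 - alpha)).clamp(min=1e-5)
    # Divide the activation side, folded into the preceding LayerNorm...
    ln_weight.div_(scales)
    ln_bias.div_(scales)
    # ...and multiply the weight side, so the layer output is unchanged.
    fc_weight.mul_(scales.view(1, -1))
```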
Note: due to hard-coded assumptions about the model architecture in the script, this only works for OPT models that apply the layer norm before the attention (`do_layer_norm_before=true` in `config.json`). In practice this covers every OPT model except `facebook/opt-350m`.
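If in doubt, the flag can be read from the model config before running the conversion. This is an illustrative check, not part of `smoothquant.py`:

```python
from transformers import AutoConfig

# Illustrative check: confirm the model applies LayerNorm before attention,
# which the smoothing conversion assumes.
config = AutoConfig.from_pretrained("facebook/opt-1.3b")
assert getattr(config, "do_layer_norm_before", False), \
    "smoothquant.py assumes do_layer_norm_before=true"
```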