Would it make sense to suppress this warning? If we run the quantization script below (saving the model via the recommended pathway, model.save_pretrained(...)):
from llmcompressor.modifiers.quantization import QuantizationModifier
from llmcompressor import oneshot
from transformers import AutoModelForCausalLM

# First we load the model
model_stub = "mistralai/Mistral-Small-24B-Instruct-2501"
model = AutoModelForCausalLM.from_pretrained(model_stub)

# Then we specify the quantization recipe
recipe = QuantizationModifier(
    targets="Linear",
    scheme="FP8_dynamic",
    ignore=["lm_head"],
)

# Then we apply quantization
oneshot(model=model, recipe=recipe)

# Finally, we save the quantized model to disk
save_path = model_stub + "-FP8-dynamic"
model.save_pretrained(save_path, skip_compression_stats=True, disable_sparse_compression=True)
we would be greeted with stdout that contains the warning "Optimized model is not saved. To save, please provide", which could be a bit confusing:
Loading checkpoint shards: 100%|███████████████████████████████████| 10/10 [00:05<00:00, 1.81it/s]
2025-04-18T10:00:46.447596+0000 | reset | INFO - Compression lifecycle reset
2025-04-18T10:00:46.448282+0000 | from_modifiers | INFO - Creating recipe from modifiers
manager stage: Modifiers initialized
2025-04-18T10:00:47.144978+0000 | initialize | INFO - Compression lifecycle initialized for 1 modifiers
manager stage: Modifiers finalized
2025-04-18T10:00:47.145196+0000 | finalize | INFO - Compression lifecycle finalized for 1 modifiers
2025-04-18T10:00:47.145245+0000 | post_process | WARNING - Optimized model is not saved. To save, please provide
Checking whether model follows 2:4 sparsity structure: 100%|█████| 281/281 [00:24<00:00, 11.26it/s]
Calculating quantization compression ratio: 404it [00:00, 671.63it/s]
Quantized Compression: 100%|█████████████████████████████████████| 923/923 [00:29<00:00, 31.23it/s]
Running oneshot(model=model, recipe=recipe) always prints this warning, which I believe comes from the old, deprecated pathway of saving compressed models with oneshot(model=model, recipe=recipe, save_dir=<path>). Perhaps we should remove the warning, as it is misleading given that we now save models with model.save_pretrained(...).
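Until the warning is removed, a possible user-side workaround is to filter the message out of the log stream. The sketch below only illustrates the idea with Python's standard `logging` module. It is an assumption that the message can be intercepted this way: the library may use a different logging backend, in which case the same substring test would belong in that backend's filter hook. The logger name `demo.oneshot` and the class `DropMessageFilter` are hypothetical stand-ins, not part of llmcompressor.

```python
import io
import logging

# Prefix of the warning text as it appears in the log output above.
MISLEADING = "Optimized model is not saved"


class DropMessageFilter(logging.Filter):
    """Drop any record whose formatted message contains a given substring."""

    def __init__(self, needle: str) -> None:
        super().__init__()
        self.needle = needle

    def filter(self, record: logging.LogRecord) -> bool:
        # Returning False discards the record; True lets it through.
        return self.needle not in record.getMessage()


# Hypothetical logger standing in for the library's own logger.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(DropMessageFilter(MISLEADING))

demo_logger = logging.getLogger("demo.oneshot")
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False  # keep the demo output confined to our handler
demo_logger.addHandler(handler)

demo_logger.info("Compression lifecycle finalized for 1 modifiers")
demo_logger.warning("Optimized model is not saved. To save, please provide")

captured = stream.getvalue()
print(captured, end="")
```

Filtering on a substring keeps every other INFO and WARNING line intact, so genuine problems during quantization would still be visible.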