Olive-ai 0.5.1
Examples
New examples have been added in this release.
Passes (optimization techniques)
- QNNPreprocess: Add the configs that were added in the onnxruntime nightly package.
- GptqQuantizer: PTQ quantization using Hugging Face Optimum, exporting the model with onnxruntime optimized kernels.
- OnnxMatMul4Quantizer: Add MatMul RTN/HQQ/GPTQ quantization configs (see the sketch after this list).
- Move all passes that need to create an inference session so that they run on the target:
  - IncQuantization
  - OptimumMerging
  - OrtTransformersOptimization
  - VitisAIQuantization
  - OrtPerfTuning
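
As a minimal sketch of how such passes appear in the "passes" section of a workflow config: the pass type names below come from this release, but the option names (fuse_layernorm, bits, group_size, algorithm) are illustrative assumptions rather than the exact schema.

```json
"passes": {
    "qnn_preprocess": {
        "type": "QNNPreprocess",
        "config": {
            "fuse_layernorm": true
        }
    },
    "gptq": {
        "type": "GptqQuantizer",
        "config": {
            "bits": 4,
            "group_size": 128
        }
    },
    "matmul4": {
        "type": "OnnxMatMul4Quantizer",
        "config": {
            "algorithm": "RTN"
        }
    }
}
```

The passes listed above that create an inference session now run on the engine's target system rather than the host.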
Engine
- Support packaging AzureML output.
- Remove execution_providers from the engine config; execution providers are now specified per accelerator in the system config. A typical config now looks like:

  ```json
  "systems": {
      "local_system": {
          "type": "LocalSystem",
          "config": {
              "accelerators": [
                  {
                      "device": "gpu",
                      "execution_providers": [
                          "CUDAExecutionProvider"
                      ]
                  }
              ]
          }
      }
  },
  "engine": {
      "host": "local_system",
      "target": "local_system"
  }
  ```
Workflows
- Delay Python pass module loading and provide the --package-config option so that advanced users can write their own pass modules and corresponding dependencies. An illustrative sketch follows.
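
A rough sketch of a custom package config, assuming a schema that maps pass names to a module path and named pip dependency groups (the MyCustomPass, my_package, and my-extra names here are hypothetical):

```json
{
    "passes": {
        "MyCustomPass": {
            "module_path": "my_package.my_pass.MyCustomPass",
            "extra_dependencies": ["my-extra"]
        }
    },
    "extra_dependencies": {
        "my-extra": ["scipy"]
    }
}
```

It would then be supplied at run time, e.g. python -m olive.workflows.run --config workflow.json --package-config package_config.json.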
Fix
- Cannot load MLflow model because from_pretrained_args is missing.
- LoRA: Pass save_embedding_layers=False when saving the PEFT model; otherwise it defaults to "auto", which checks whether the vocab size changed.
- Update the model_rank file for the zipfile packaging type. The model path is now relative to the output zip file.
- Fix shutil.which returning None on Windows when a full Python path is passed.