Olive-ai 0.5.1

@trajepl released this 07 Apr 08:18

Examples

The following examples were added:

  • Mistral FP16. #980
  • Phi2 fine-tuning example. #1030

Passes (optimization techniques)

  • QNNPreprocess: Add the configs that were added in the onnxruntime nightly package.
  • GptqQuantizer: PTQ quantization using Hugging Face Optimum, exporting the model with onnxruntime optimized kernels.
  • OnnxMatMul4Quantizer: Add MatMul RTN/HQQ/GPTQ quantization configs (see the sketch after this list).
  • Move all passes that need to create an inference session to run on the target:
    • IncQuantization
    • OptimumMerging
    • OrtTransformersOptimization
    • VitisAIQuantization
    • OrtPerfTuning
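
As a reference point, here is a minimal sketch of a "passes" entry enabling 4-bit MatMul quantization, written as a Python dict mirroring the JSON config. The "matmul4_quantizer" name is a placeholder, and the "algorithm" key is an assumption about how the RTN/HQQ/GPTQ choice is exposed:

# A minimal sketch of a "passes" entry for 4-bit MatMul quantization.
# "matmul4_quantizer" is a placeholder name; the "algorithm" key is an
# assumption about how the RTN/HQQ/GPTQ choice is selected.
passes = {
    "matmul4_quantizer": {
        "type": "OnnxMatMul4Quantizer",
        "config": {
            "algorithm": "GPTQ"  # or "RTN" / "HQQ"
        }
    }
}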

Engine

  • Support packing AzureML output.
  • Remove execution_providers from the engine config; execution providers are now specified per accelerator under the system config. A typical config looks like:
"systems": {
    "local_system": {
        "type": "LocalSystem",
        "config": {
            "accelerators": [
                {
                    "device": "gpu",
                    "execution_providers": [
                        "CUDAExecutionProvider"
                    ]
                }
            ]
        }
    }
},
"engine": {
      "host": "local_system",
      "target": "local_system",
}

Workflows

  • Delay Python pass module loading and provide the --package-config option so advanced users can supply their own pass modules and corresponding dependencies (see the sketch below).
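
For example, a run with a custom package config might look like the following sketch. The file names are placeholders, and the package_config keyword is assumed to mirror the --package-config CLI option:

# Sketch of running a workflow with a custom package config. File names are
# placeholders; the package_config keyword is assumed to mirror the
# --package-config CLI option.
from olive.workflows import run as olive_run

olive_run("config.json", package_config="package_config.json")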

Fix

  • Fix loading MLflow models, which failed because from_pretrained_args was missing.
  • LoRA: Pass save_embedding_layers=False when saving the PEFT model; otherwise it defaults to "auto", which checks whether the vocab size changed (see the sketch after this list).
  • Update the model_rank file for the zipfile packaging type. The model path is now relative to the output zip file.
  • Fix shutil.which returning None on Windows when given a full Python path.
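
A minimal sketch of the corrected save call, assuming a peft model fine-tuned with LoRA; the base model and output directory are placeholders:

# Sketch of saving a LoRA adapter with save_embedding_layers=False.
# "gpt2" and "adapter_out" are placeholders. The default "auto" saves the
# embedding layers whenever a vocab-size change is detected, which the fix
# avoids by passing False explicitly.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("gpt2")
peft_model = get_peft_model(base, LoraConfig(task_type="CAUSAL_LM"))
peft_model.save_pretrained("adapter_out", save_embedding_layers=False)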