Olive is an easy-to-use, hardware-aware model optimization tool that composes industry-leading techniques across model compression, optimization, and compilation. Given a model and target hardware, Olive composes the most suitable optimization techniques to output the most efficient model(s) for inference on cloud or edge, while taking constraints such as accuracy and latency into consideration.
Since every ML accelerator vendor implements its own acceleration toolchain to make the most of its hardware, hardware-aware optimizations are fragmented. With Olive, we can:
Reduce engineering effort for optimizing models for cloud and edge: Developers would otherwise need to learn and use multiple hardware-vendor-specific toolchains to prepare and optimize their trained models for deployment. Olive simplifies this experience by aggregating and automating optimization techniques for the desired hardware targets.
Build a unified optimization framework: Since no single optimization technique serves all scenarios well, Olive provides an extensible framework that allows industry to easily plug in their optimization innovations. Olive can efficiently compose and tune the integrated techniques to offer a ready-to-use end-to-end (E2E) optimization solution.
- [ Mar 2024 ] Fine-tune SLM with Microsoft Olive
- [ Jan 2024 ] Accelerating SD Turbo and SDXL Turbo Inference with ONNX Runtime and Olive
- [ Dec 2023 ] Windows AI Studio - VS Code Extension that uses Olive to fine tune models
- [ Nov 2023 ] Elevating the developer experience on Windows with new AI tools and productivity tools
- [ Nov 2023 ] Accelerating LLaMA-2 Inference with ONNX Runtime using Olive
- [ Nov 2023 ] Olive 0.4.0 released with support for LoRA fine-tuning and Llama2 optimizations
- [ Nov 2023 ] Intel and Microsoft Collaborate to Optimize DirectML for Intel® Arc™ Graphics Solutions using Olive
- [ Nov 2023 ] Running Olive Optimized Llama2 with Microsoft DirectML on AMD Radeon Graphics
- [ Oct 2023 ] AMD Microsoft Olive Optimizations for Stable Diffusion Performance Analysis
- [ Sep 2023 ] Running Optimized Automatic1111 Stable Diffusion WebUI on AMD GPUs
- [ Jul 2023 ] Build accelerated AI apps for NPUs with Olive
- [ Jun 2023 ] Olive: A user-friendly toolchain for hardware-aware model optimization
- [ May 2023 ] Optimize DirectML performance with Olive
- [ May 2023 ] Optimize Stable Diffusion Using Olive
- Documentation: https://microsoft.github.io/Olive
- Examples: examples
We recommend installing Olive in a virtual environment or a conda environment. Olive is installed using pip.
Create a virtual/conda environment with the desired version of Python and activate it.
You will need a build of onnxruntime. You can install the desired build separately, or install a public version of onnxruntime as an extra dependency during Olive installation.
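For example, a typical setup with Python's built-in venv module might look like this (the environment name `olive-env` is illustrative):

```shell
# Create a virtual environment (the name "olive-env" is illustrative)
python3 -m venv olive-env

# Activate it (on Windows use: olive-env\Scripts\activate)
source olive-env/bin/activate
```

With the environment active, install Olive with one of the pip commands below.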
Olive is available for installation from PyPI.
pip install olive-ai
With onnxruntime (Default CPU):
pip install olive-ai[cpu]
With onnxruntime-gpu:
pip install olive-ai[gpu]
With onnxruntime-directml:
pip install olive-ai[directml]
Olive has optional dependencies that can be installed to enable additional features. Please refer to extra dependencies for the list of extras and their dependencies.
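As a rough sketch, an Olive run is driven by a JSON workflow configuration that describes the input model, the optimization passes to apply, and the engine settings. The model path, pass choices, and field values below are illustrative; consult the documentation linked above for the exact schema:

```json
{
  "input_model": {
    "type": "PyTorchModel",
    "config": { "model_path": "path/to/model.pt" }
  },
  "passes": {
    "conversion": { "type": "OnnxConversion" },
    "quantization": { "type": "OnnxDynamicQuantization" }
  },
  "engine": { "output_dir": "optimized_models" }
}
```

A workflow described this way can then be executed with `python -m olive.workflows.run --config config.json`.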
We’d love for you to contribute to Olive. Please refer to CONTRIBUTING.md.
Copyright (c) Microsoft Corporation. All rights reserved.
Licensed under the MIT License.