usls is a cross-platform Rust library powered by ONNX Runtime for efficient inference of SOTA vision and vision-language models (typically under 1B parameters).
Run the YOLO demo to explore various YOLO-series models with different tasks, versions, scales, precisions, and execution providers:

- Tasks: `detect`, `segment`, `pose`, `classify`, `obb`
- Versions: YOLOv5, YOLOv6, YOLOv7, YOLOv8, YOLOv9, YOLOv10, YOLO11, YOLOv12, YOLOv13
- Scales: `n`, `s`, `m`, `l`, `x`
- Precision: `fp32`, `fp16`, `q8`, `q4`, `q4f16`, `bnb4`
- Execution Providers: CPU, CUDA, TensorRT, CoreML, OpenVINO, and more
```bash
# CPU: Object detection, YOLOv8n, FP16
cargo run -r --example yolo -- --task detect --ver 8 --scale n --dtype fp16

# NVIDIA CUDA: Instance segmentation, YOLO11m
cargo run -r -F cuda --example yolo -- --task segment --ver 11 --scale m --device cuda:0

# NVIDIA TensorRT
cargo run -r -F tensorrt --example yolo -- --device tensorrt:0

# Apple Silicon CoreML
cargo run -r -F coreml --example yolo -- --device coreml

# Intel OpenVINO: CPU/GPU/VPU acceleration
cargo run -r -F openvino -F ort-load-dynamic --example yolo -- --device openvino:CPU

# Show all available options
cargo run -r --example yolo -- --help
```

See YOLO Examples for more details and use cases.
Add the following to your Cargo.toml:

```toml
[dependencies]
# Use GitHub version
usls = { git = "https://github.com/jamjamjon/usls", features = ["cuda"] }

# Alternative: Use crates.io version
usls = { version = "latest-version", features = ["cuda"] }
```
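Once the dependency is in place (with only the features you need enabled, e.g. a model feature plus an execution provider), inference in your own code follows the same load, read, forward, annotate flow as the CLI demos above. The sketch below is illustrative only: the item names (`Options`, `models::YOLO`, `DataLoader`, `Annotator`), their method signatures, and the use of `anyhow` for error handling are assumptions modeled on the repository's YOLO example and may differ between usls versions; treat the linked YOLO Examples as the authoritative, version-matched reference.

```rust
// Illustrative sketch, not verbatim from the crate: the imported items and
// method signatures below are assumptions based on the repository's YOLO
// example and may differ between usls versions.
use usls::{models::YOLO, Annotator, DataLoader, Options};

fn main() -> anyhow::Result<()> {
    // Configure and build a detector; the model path is a placeholder.
    let options = Options::default().with_model("yolov8n.onnx")?;
    let mut model = YOLO::new(options)?;

    // Read an input image and run inference.
    let images = [DataLoader::try_read("./assets/bus.jpg")?];
    let results = model.forward(&images)?;

    // Draw the predictions and save the annotated output.
    let annotator = Annotator::default();
    annotator.annotate(&images, &results);

    Ok(())
}
```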
Features in italics are enabled by default.

- **Core features**:
  - *`ort-download-binaries`*: Auto-download ONNX Runtime binaries from pyke.
  - `ort-load-dynamic`: Link ONNX Runtime yourself. Use this if pyke doesn't provide prebuilt binaries for your platform or you want to link your local ONNX Runtime library. See the Linking Guide for more details.
  - `viewer`: Image/video visualization (minifb), similar to OpenCV's `imshow()`. See example.
  - `video`: Video I/O support (video-rs). Enable this to read/write video streams. See example.
  - `hf-hub`: Hugging Face Hub support for downloading models from Hugging Face repositories.
  - `tokenizers`: Tokenizer support for vision-language models. Automatically enabled when using vision-language model features (`blip`, `clip`, `florence2`, `grounding-dino`, `fastvlm`, `moondream2`, `owl`, `smolvlm`, `trocr`, `yoloe`).
  - `slsl`: SLSL tensor library support. Automatically enabled when using the `yolo` or `clip` features.
- **Execution providers**: Hardware acceleration for inference.
  - `cuda`, `tensorrt`: NVIDIA GPU acceleration
  - `coreml`: Apple Silicon acceleration
  - `openvino`: Intel CPU/GPU/VPU acceleration
  - `onednn`, `directml`, `xnnpack`, `rocm`, `cann`, `rknpu`, `acl`, `nnapi`, `armnn`, `tvm`, `qnn`, `migraphx`, `vitis`, `azure`: various hardware/platform support

  See the ONNX Runtime docs and the ORT performance guide for details.
- **Model features**: Almost every model is a separate feature. Enable only what you need to reduce compile time and binary size.
  - `yolo`, `sam`, `clip`, `image-classifier`, `dino`, `rtmpose`, `rtdetr`, `db`, ...
  - All models: `all-models` (enables all model features)

  See Supported Models for the complete list with feature names.
View all models (Click to expand)
| Model | Task / Description | Feature | Example |
|---|---|---|---|
| BEiT | Image Classification | `image-classifier` | demo |
| ConvNeXt | Image Classification | `image-classifier` | demo |
| FastViT | Image Classification | `image-classifier` | demo |
| MobileOne | Image Classification | `image-classifier` | demo |
| DeiT | Image Classification | `image-classifier` | demo |
| DINOv2 | Vision Embedding | `dino` | demo |
| DINOv3 | Vision Embedding | `dino` | demo |
| YOLOv5 | Image Classification<br>Object Detection<br>Instance Segmentation | `yolo` | demo |
| YOLOv6 | Object Detection | `yolo` | demo |
| YOLOv7 | Object Detection | `yolo` | demo |
| YOLOv8<br>YOLO11 | Object Detection<br>Instance Segmentation<br>Image Classification<br>Oriented Object Detection<br>Keypoint Detection | `yolo` | demo |
| YOLOv9 | Object Detection | `yolo` | demo |
| YOLOv10 | Object Detection | `yolo` | demo |
| YOLOv12 | Image Classification<br>Object Detection<br>Instance Segmentation | `yolo` | demo |
| YOLOv13 | Object Detection | `yolo` | demo |
| RT-DETR | Object Detection | `rtdetr` | demo |
| RF-DETR | Object Detection | `rfdetr` | demo |
| PP-PicoDet | Object Detection | `picodet` | demo |
| DocLayout-YOLO | Object Detection | `picodet` | demo |
| D-FINE | Object Detection | `rtdetr` | demo |
| DEIM | Object Detection | `rtdetr` | demo |
| DEIMv2 | Object Detection | `rtdetr` | demo |
| RTMPose | Keypoint Detection | `rtmpose` | demo |
| DWPose | Keypoint Detection | `rtmpose` | demo |
| RTMW | Keypoint Detection | `rtmpose` | demo |
| RTMO | Keypoint Detection | `rtmo` | demo |
| SAM | Segment Anything | `sam` | demo |
| SAM2 | Segment Anything | `sam` | demo |
| MobileSAM | Segment Anything | `sam` | demo |
| EdgeSAM | Segment Anything | `sam` | demo |
| SAM-HQ | Segment Anything | `sam` | demo |
| FastSAM | Instance Segmentation | `yolo` | demo |
| YOLO-World | Open-Set Detection With Language | `yolo` | demo |
| YOLOE | Open-Set Detection And Segmentation | `yoloe` | demo-prompt-free<br>demo-prompt (visual & textual) |
| GroundingDINO | Open-Set Detection With Language | `grounding-dino` | demo |
| CLIP | Vision-Language Embedding | `clip` | demo |
| jina-clip-v1 | Vision-Language Embedding | `clip` | demo |
| jina-clip-v2 | Vision-Language Embedding | `clip` | demo |
| mobileclip & mobileclip2 | Vision-Language Embedding | `clip` | demo |
| BLIP | Image Captioning | `blip` | demo |
| DB (PaddleOCR-Det) | Text Detection | `db` | demo |
| FAST | Text Detection | `db` | demo |
| LinkNet | Text Detection | `db` | demo |
| SVTR (PaddleOCR-Rec) | Text Recognition | `svtr` | demo |
| SLANet | Table Recognition | `slanet` | demo |
| TrOCR | Text Recognition | `trocr` | demo |
| YOLOPv2 | Panoptic Driving Perception | `yolop` | demo |
| DepthAnything v1<br>DepthAnything v2 | Monocular Depth Estimation | `depth-anything` | demo |
| DepthPro | Monocular Depth Estimation | `depth-pro` | demo |
| MODNet | Image Matting | `modnet` | demo |
| Sapiens | Foundation for Human Vision Models | `sapiens` | demo |
| Florence2 | A Variety of Vision Tasks | `florence2` | demo |
| Moondream2 | Open-Set Object Detection<br>Open-Set Keypoint Detection<br>Image Captioning<br>Visual Question Answering | `moondream2` | demo |
| OWLv2 | Open-Set Object Detection | `owl` | demo |
| SmolVLM (256M, 500M) | Visual Question Answering | `smolvlm` | demo |
| FastVLM (0.5B) | Vision-Language Model | `fastvlm` | demo |
| RMBG (1.4, 2.0) | Image Segmentation<br>Background Removal | `rmbg` | demo |
| BEN2 | Image Segmentation<br>Background Removal | `ben2` | demo |
| MediaPipe: Selfie-segmentation | Image Segmentation | `mediapipe-segmenter` | demo |
| Swin2SR | Image Super-Resolution and Restoration | `swin2sr` | demo |
| APISR | Real-World Anime Super-Resolution | `apisr` | demo |
| RAM & RAM++ | Image Tagging | `ram` | demo |
Have a question? Check the existing issues or open a new discussion.
Contributions are welcome! If you have suggestions, bug reports, or want to add new features or models, feel free to open an issue or submit a pull request.
This project is built on top of ort (ONNX Runtime for Rust), which provides seamless Rust bindings for ONNX Runtime. Special thanks to the ort maintainers.
Thanks to all the open-source libraries and their maintainers that make this project possible. See Cargo.toml for a complete list of dependencies.
This project is licensed under the terms described in LICENSE.