# v0.1.0a3 (Pre-release)

## Summary
This pre-release adds powerful new capabilities for high-performance inference and post-processing:
- ONNX/TensorRT Export: Export trained models to optimized formats for 3-6x faster inference
- Post-Inference Filtering: Remove overlapping/duplicate predictions using IOU or OKS similarity
- Improved WandB Logging: Better metrics organization and run naming
For the full list of major features, breaking changes, and improvements introduced in the v0.1.0 series, see the v0.1.0a0 release notes.
## What's New in v0.1.0a3

### Features

#### ONNX/TensorRT Export Module (#418)
A complete model export system for high-performance inference:
```bash
# Export to ONNX
sleap-nn export /path/to/model -o exports/my_model --format onnx

# Export to both ONNX and TensorRT FP16
sleap-nn export /path/to/model -o exports/my_model --format both

# Run inference on exported model
sleap-nn predict exports/my_model video.mp4 -o predictions.slp
```

Performance Benchmarks (NVIDIA RTX A6000):
Batch size 1 (latency-optimized):
| Model | Resolution | PyTorch | ONNX-GPU | TensorRT FP16 | Speedup |
|---|---|---|---|---|---|
| single_instance | 192×192 | 1.8 ms | 1.3 ms | 0.31 ms | 5.9x |
| centroid | 1024×1024 | 2.5 ms | 2.7 ms | 0.77 ms | 3.2x |
| topdown | 1024×1024 | 11.4 ms | 9.7 ms | 2.31 ms | 4.9x |
| bottomup | 1024×1280 | 12.3 ms | 9.6 ms | 2.52 ms | 4.9x |
| multiclass_topdown | 1024×1024 | 8.3 ms | 9.1 ms | 1.84 ms | 4.5x |
| multiclass_bottomup | 1024×1024 | 9.4 ms | 9.4 ms | 2.64 ms | 3.6x |
Batch size 8 (throughput-optimized):
| Model | Resolution | PyTorch | ONNX-GPU | TensorRT FP16 | Speedup |
|---|---|---|---|---|---|
| single_instance | 192×192 | 3,111 FPS | 3,165 FPS | 11,039 FPS | 3.5x |
| centroid | 1024×1024 | 453 FPS | 474 FPS | 1,829 FPS | 4.0x |
| topdown | 1024×1024 | 94 FPS | 122 FPS | 525 FPS | 5.6x |
| bottomup | 1024×1280 | 113 FPS | 121 FPS | 524 FPS | 4.6x |
| multiclass_topdown | 1024×1024 | 127 FPS | 145 FPS | 735 FPS | 5.8x |
| multiclass_bottomup | 1024×1024 | 116 FPS | 120 FPS | 470 FPS | 4.1x |
Speedup is relative to PyTorch baseline.
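As a sanity check, the Speedup column in the batch-size-8 table can be reproduced directly from the FPS numbers (TensorRT FP16 divided by the PyTorch baseline):

```python
# Reproduce the batch-size-8 Speedup column from the table above:
# speedup = TensorRT FP16 FPS / PyTorch FPS, rounded to one decimal.
bench = {
    "single_instance": (3111, 11039),
    "centroid": (453, 1829),
    "topdown": (94, 525),
    "bottomup": (113, 524),
    "multiclass_topdown": (127, 735),
    "multiclass_bottomup": (116, 470),
}

for model, (pytorch_fps, trt_fps) in bench.items():
    speedup = round(trt_fps / pytorch_fps, 1)
    print(f"{model}: {speedup}x")
```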
Supported model types:
- Single Instance, Centroid, Centered Instance
- Top-Down (combined centroid + instance)
- Bottom-Up (multi-instance with PAF grouping)
- Multi-class Top-Down and Bottom-Up (with identity classification)
New CLI commands:
- `sleap-nn export`: Export models to ONNX/TensorRT
- `sleap-nn predict`: Run inference on exported models
New optional dependencies:

```bash
uv pip install "sleap-nn[export]"      # ONNX CPU inference
uv pip install "sleap-nn[export-gpu]"  # ONNX GPU inference
uv pip install "sleap-nn[tensorrt]"    # TensorRT support
```

See the Export Guide for full documentation.
#### Post-Inference Filtering for Overlapping Instances (#420)
New capability to remove duplicate/overlapping pose predictions after model inference:
```bash
# Filter with IOU method (default)
sleap-nn track -i video.mp4 -m model/ --filter_overlapping

# Use OKS method with a custom threshold
sleap-nn track -i video.mp4 -m model/ \
    --filter_overlapping \
    --filter_overlapping_method oks \
    --filter_overlapping_threshold 0.5
```

New CLI options for `sleap-nn track`:
| Option | Default | Description |
|---|---|---|
| `--filter_overlapping` | `False` | Enable filtering using greedy NMS |
| `--filter_overlapping_method` | `iou` | Similarity method: `iou` (bounding box) or `oks` (keypoints) |
| `--filter_overlapping_threshold` | `0.8` | Similarity threshold (lower = more aggressive) |
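For intuition on the `oks` method: OKS (Object Keypoint Similarity) compares poses keypoint-by-keypoint rather than by bounding box. A minimal sketch following the standard COCO-style definition (the `kappa` constant and normalization here are illustrative assumptions, not sleap-nn's implementation):

```python
import math

def oks(pred, gt, scale, kappa=0.1):
    """Object Keypoint Similarity: exp(-d^2 / (2 * s^2 * k^2)) averaged over
    keypoints, where d is the per-keypoint distance, s the object scale, and
    k a per-keypoint falloff constant. Identical poses score 1.0."""
    sims = [
        math.exp(-math.dist(p, g) ** 2 / (2 * scale**2 * kappa**2))
        for p, g in zip(pred, gt)
    ]
    return sum(sims) / len(sims)

pose = [(0.0, 0.0), (10.0, 0.0)]
print(oks(pose, pose, scale=100.0))  # identical poses -> 1.0
```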
Programmatic API:
```python
from sleap_nn.inference.postprocessing import filter_overlapping_instances

labels = filter_overlapping_instances(labels, threshold=0.5, method="oks")
```

Why use this? Previously, IOU-based filtering was only available inside the tracking pipeline. This feature allows filtering overlapping predictions without requiring `--tracking`.
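Conceptually, the filter runs greedy non-maximum suppression over instances: keep higher-scoring instances first, and drop any instance whose similarity to an already-kept one exceeds the threshold. A standalone sketch with IOU over bounding boxes (illustrative only, not the sleap-nn implementation):

```python
def iou(a, b):
    """IOU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def greedy_nms(instances, threshold=0.8):
    """instances: list of (score, box); returns kept instances, best first.
    A lower threshold drops more instances (more aggressive filtering)."""
    kept = []
    for score, box in sorted(instances, reverse=True):
        if all(iou(box, kept_box) < threshold for _, kept_box in kept):
            kept.append((score, box))
    return kept

# Two near-duplicate detections plus one distinct instance: the duplicate
# (IOU ~0.68 with the best box) is dropped at threshold 0.5.
dets = [(0.9, (0, 0, 10, 10)), (0.8, (1, 1, 11, 11)), (0.7, (50, 50, 60, 60))]
print(greedy_nms(dets, threshold=0.5))
```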
### Improvements

#### WandB Run Naming and Metrics Logging (#417)
- Fixed run naming: WandB runs now correctly use auto-generated run names
- Improved metrics organization: all metrics use a `/` separator for automatic panel grouping in the WandB UI:
  - `train/loss`, `train/lr`: training metrics (epoch x-axis)
  - `val/loss`: validation metrics (epoch x-axis)
  - `eval/val/`: epoch-end evaluation metrics
  - `eval/test.X/`: post-training test set metrics
- New metrics logged:
  - `train/lr`: learning rate (useful for monitoring LR schedulers)
  - `PCK@5`, `PCK@10`: PCK at 5 px and 10 px thresholds
  - `distance/p95`, `distance/p99`: additional distance percentiles
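For reference, PCK@k is conventionally the fraction of predicted keypoints that land within k pixels of their ground-truth locations. A minimal sketch of that standard definition (not sleap-nn's implementation):

```python
import math

def pck(pred, gt, k):
    """Fraction of predicted keypoints within k pixels of ground truth."""
    dists = [math.dist(p, g) for p, g in zip(pred, gt)]
    return sum(d <= k for d in dists) / len(dists)

pred = [(0, 0), (10, 10), (20, 27)]
gt = [(0, 3), (10, 18), (20, 20)]
print(pck(pred, gt, 5))   # distances 3, 8, 7 -> only one within 5 px
print(pck(pred, gt, 10))  # all three within 10 px
```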
### Documentation
- Exporting Guide (#419): Added comprehensive export documentation to How-to guides navigation
## Installation
This is an alpha pre-release. Per PEP 440, pre-releases are excluded by default, so you must explicitly opt in.
### Install with uv (Recommended)
```bash
# With --prerelease flag (requires uv 0.9.20+)
uv tool install "sleap-nn[torch]" --torch-backend auto --prerelease=allow

# Or pin to the exact version
uv tool install "sleap-nn[torch]==0.1.0a3" --torch-backend auto
```

### Run with uvx (One-off execution)

```bash
uvx --from "sleap-nn[torch]" --prerelease=allow --torch-backend auto sleap-nn system
```

### Verify Installation
```bash
sleap-nn --version
# Expected output: 0.1.0a3

sleap-nn system
# Shows full system diagnostics including GPU info
```

### Upgrading from v0.1.0a2

If you already have v0.1.0a2 installed with `--prerelease=allow`:

```bash
# Simple upgrade (retains original settings like --prerelease=allow)
uv tool upgrade sleap-nn
```

To force a complete reinstall:

```bash
uv tool install "sleap-nn[torch]" --torch-backend auto --prerelease=allow --force
```

## Changelog
| PR | Category | Title |
|---|---|---|
| #417 | Improvement | Fix wandb run naming and improve metrics logging |
| #418 | Feature | Add ONNX/TensorRT export module |
| #419 | Documentation | Add Exporting guide to How-to guides section |
| #420 | Feature | Add post-inference filtering for overlapping instances |
Full Changelog: v0.1.0a2...v0.1.0a3