
v0.1.0a3

Pre-release


@talmo released this 19 Jan 05:24 (commit df72b51)

Summary

This pre-release adds powerful new capabilities for high-performance inference and post-processing:

  • ONNX/TensorRT Export: Export trained models to optimized formats for 3-6x faster inference
  • Post-Inference Filtering: Remove overlapping/duplicate predictions using IOU or OKS similarity
  • Improved WandB Logging: Better metrics organization and run naming

For the full list of major features, breaking changes, and improvements introduced in the v0.1.0 series, see the v0.1.0a0 release notes.


What's New in v0.1.0a3

Features

ONNX/TensorRT Export Module (#418)

A complete model export system for high-performance inference:

```bash
# Export to ONNX
sleap-nn export /path/to/model -o exports/my_model --format onnx

# Export to both ONNX and TensorRT FP16
sleap-nn export /path/to/model -o exports/my_model --format both

# Run inference on the exported model
sleap-nn predict exports/my_model video.mp4 -o predictions.slp
```

Performance Benchmarks (NVIDIA RTX A6000):

Batch size 1 (latency-optimized):

| Model | Resolution | PyTorch | ONNX-GPU | TensorRT FP16 | Speedup |
|---|---|---|---|---|---|
| single_instance | 192×192 | 1.8 ms | 1.3 ms | 0.31 ms | 5.9x |
| centroid | 1024×1024 | 2.5 ms | 2.7 ms | 0.77 ms | 3.2x |
| topdown | 1024×1024 | 11.4 ms | 9.7 ms | 2.31 ms | 4.9x |
| bottomup | 1024×1280 | 12.3 ms | 9.6 ms | 2.52 ms | 4.9x |
| multiclass_topdown | 1024×1024 | 8.3 ms | 9.1 ms | 1.84 ms | 4.5x |
| multiclass_bottomup | 1024×1024 | 9.4 ms | 9.4 ms | 2.64 ms | 3.6x |

Batch size 8 (throughput-optimized):

| Model | Resolution | PyTorch | ONNX-GPU | TensorRT FP16 | Speedup |
|---|---|---|---|---|---|
| single_instance | 192×192 | 3,111 FPS | 3,165 FPS | 11,039 FPS | 3.5x |
| centroid | 1024×1024 | 453 FPS | 474 FPS | 1,829 FPS | 4.0x |
| topdown | 1024×1024 | 94 FPS | 122 FPS | 525 FPS | 5.6x |
| bottomup | 1024×1280 | 113 FPS | 121 FPS | 524 FPS | 4.6x |
| multiclass_topdown | 1024×1024 | 127 FPS | 145 FPS | 735 FPS | 5.8x |
| multiclass_bottomup | 1024×1024 | 116 FPS | 120 FPS | 470 FPS | 4.1x |

Speedup is relative to PyTorch baseline.
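As a sanity check on what the Speedup column means, it is simply the TensorRT FP16 number divided by the PyTorch baseline (throughput ratio for FPS, or equivalently baseline latency over optimized latency). Using the centroid row of the batch-size-8 table:

```python
# Speedup = optimized throughput / baseline throughput.
# Values from the batch-size-8 table, centroid model row.
pytorch_fps = 453
tensorrt_fps = 1829
speedup = tensorrt_fps / pytorch_fps
print(f"{speedup:.1f}x")  # → 4.0x
```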

Supported model types:

  • Single Instance, Centroid, Centered Instance
  • Top-Down (combined centroid + instance)
  • Bottom-Up (multi-instance with PAF grouping)
  • Multi-class Top-Down and Bottom-Up (with identity classification)

New CLI commands:

  • sleap-nn export - Export models to ONNX/TensorRT
  • sleap-nn predict - Run inference on exported models

New optional dependencies:

```bash
uv pip install "sleap-nn[export]"      # ONNX CPU inference
uv pip install "sleap-nn[export-gpu]"  # ONNX GPU inference
uv pip install "sleap-nn[tensorrt]"    # TensorRT support
```

See the Export Guide for full documentation.

Post-Inference Filtering for Overlapping Instances (#420)

New capability to remove duplicate/overlapping pose predictions after model inference:

```bash
# Filter with IOU method (default)
sleap-nn track -i video.mp4 -m model/ --filter_overlapping

# Use OKS method with custom threshold
sleap-nn track -i video.mp4 -m model/ \
    --filter_overlapping \
    --filter_overlapping_method oks \
    --filter_overlapping_threshold 0.5
```

New CLI options for sleap-nn track:

| Option | Default | Description |
|---|---|---|
| `--filter_overlapping` | `False` | Enable filtering using greedy NMS |
| `--filter_overlapping_method` | `iou` | Similarity method: `iou` (bounding box) or `oks` (keypoints) |
| `--filter_overlapping_threshold` | `0.8` | Similarity threshold (lower = more aggressive) |
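For reference, the `oks` method scores keypoint proximity rather than box overlap. Below is a minimal COCO-style OKS sketch, not the sleap-nn implementation: COCO uses per-keypoint falloff constants, while this uses a single scalar `k` for brevity, and the helper name is illustrative.

```python
import numpy as np

def oks(pred, gt, scale, k=0.1):
    """COCO-style Object Keypoint Similarity between two poses.

    pred, gt: (n_keypoints, 2) coordinate arrays.
    scale:    object scale, e.g. sqrt of the instance bounding-box area.
    k:        falloff constant (per-keypoint in COCO; scalar here).
    """
    d2 = np.sum((np.asarray(pred) - np.asarray(gt)) ** 2, axis=1)
    return float(np.mean(np.exp(-d2 / (2 * scale**2 * k**2))))

pose = np.array([[10.0, 10.0], [20.0, 15.0]])
print(oks(pose, pose, scale=30.0))  # identical poses → 1.0
```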

Programmatic API:

```python
from sleap_nn.inference.postprocessing import filter_overlapping_instances

labels = filter_overlapping_instances(labels, threshold=0.5, method="oks")
```

Why use this? Previously, IOU-based filtering only existed in the tracking pipeline. This feature allows filtering overlapping predictions without requiring --tracking.
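Conceptually, the greedy NMS pass visits instances in descending score order and drops any instance whose similarity to an already-kept instance exceeds the threshold. A standalone sketch using bounding-box IOU (hypothetical helper names, not the sleap-nn internals):

```python
def bbox_iou(a, b):
    """IOU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0

def greedy_nms(instances, threshold=0.8):
    """instances: list of (score, box); returns the kept subset.

    Lower thresholds drop more instances (more aggressive filtering).
    """
    kept = []
    for score, box in sorted(instances, key=lambda t: -t[0]):
        if all(bbox_iou(box, kept_box) <= threshold for _, kept_box in kept):
            kept.append((score, box))
    return kept

preds = [(0.9, (0, 0, 10, 10)),    # kept (highest score)
         (0.5, (0, 0, 10, 9)),     # IOU 0.9 with the first → dropped
         (0.7, (20, 20, 30, 30))]  # disjoint → kept
print(len(greedy_nms(preds)))  # → 2
```

Note how raising the threshold makes filtering more permissive: with `threshold=0.95`, all three instances above survive.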

Improvements

WandB Run Naming and Metrics Logging (#417)

  • Fixed run naming: WandB runs now correctly use auto-generated run names
  • Improved metrics organization: All metrics use a / separator for automatic panel grouping in the WandB UI:
    • train/loss, train/lr - Training metrics (epoch x-axis)
    • val/loss - Validation metrics (epoch x-axis)
    • eval/val/ - Epoch-end evaluation metrics
    • eval/test.X/ - Post-training test set metrics
  • New metrics logged:
    • train/lr - Learning rate (useful for monitoring LR schedulers)
    • PCK@5, PCK@10 - PCK at 5px and 10px thresholds
    • distance/p95, distance/p99 - Additional distance percentiles
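For reference, PCK@N is the fraction of predicted keypoints within N pixels of their ground-truth locations, and the distance percentiles summarize the tail of the localization-error distribution. A minimal sketch with illustrative values (not the sleap-nn implementation):

```python
import numpy as np

def pck(distances, threshold_px):
    """Fraction of keypoints with localization error <= threshold_px."""
    d = np.asarray(distances, dtype=float)
    d = d[~np.isnan(d)]  # skip keypoints with no ground truth
    return float((d <= threshold_px).mean())

errors = [1.2, 3.5, 7.9, 12.0, 4.4]  # per-keypoint distances in pixels
print(pck(errors, 5))             # → 0.6  (3 of 5 within 5 px)
print(pck(errors, 10))            # → 0.8  (4 of 5 within 10 px)
print(np.percentile(errors, 95))  # analogous to distance/p95
```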

Documentation

  • Exporting Guide (#419): Added comprehensive export documentation to How-to guides navigation

Installation

This is an alpha pre-release. Per PEP 440, pre-releases are excluded from installation by default; you must explicitly opt in.

Install with uv (Recommended)

```bash
# With --prerelease flag (requires uv 0.9.20+)
uv tool install sleap-nn[torch] --torch-backend auto --prerelease=allow

# Or pin to the exact version
uv tool install "sleap-nn[torch]==0.1.0a3" --torch-backend auto
```

Run with uvx (One-off execution)

```bash
uvx --from "sleap-nn[torch]" --prerelease=allow --torch-backend auto sleap-nn system
```

Verify Installation

```bash
sleap-nn --version
# Expected output: 0.1.0a3

sleap-nn system
# Shows full system diagnostics including GPU info
```

Upgrading from v0.1.0a2

If you already have v0.1.0a2 installed with --prerelease=allow:

```bash
# Simple upgrade (retains original settings like --prerelease=allow)
uv tool upgrade sleap-nn
```

To force a complete reinstall:

```bash
uv tool install sleap-nn[torch] --torch-backend auto --prerelease=allow --force
```

Changelog

| PR | Category | Title |
|---|---|---|
| #417 | Improvement | Fix wandb run naming and improve metrics logging |
| #418 | Feature | Add ONNX/TensorRT export module |
| #419 | Documentation | Add Exporting guide to How-to guides section |
| #420 | Feature | Add post-inference filtering for overlapping instances |

Full Changelog: v0.1.0a2...v0.1.0a3