The SuperSONIC project implements server infrastructure for inference-as-a-service applications in large high-energy physics (HEP) and multi-messenger astrophysics (MMA) experiments. The server infrastructure is designed for deployment on Kubernetes clusters equipped with GPUs.
Currently, SuperSONIC supports the following functionality:
- GPU inference-as-a-service via NVIDIA Triton Inference Server
- Load balancing across many GPUs via Envoy Proxy
- Load-based autoscaling via KEDA
- Monitoring via Prometheus, Grafana, and OpenTelemetry
- Rate limiting
- Token-based authentication
Prerequisites:
- a Kubernetes cluster with access to GPUs
- a Prometheus instance installed on the cluster, or Prometheus CRDs to deploy your own instance
- KEDA CRDs installed on the cluster (only if using autoscaling)
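The prerequisites above can be checked before installing. The following is a minimal sketch, assuming `kubectl` is already configured for the target cluster, that GPUs are exposed under the `nvidia.com/gpu` resource name, and that Prometheus and KEDA use the CRD names of a standard Prometheus Operator and KEDA installation. Which CRDs SuperSONIC actually requires is an assumption here, and the `check_crd` helper is hypothetical, not part of SuperSONIC.

```shell
#!/bin/sh
# Sketch of a prerequisite check; check_crd is an illustrative helper,
# not a SuperSONIC command.

# Report whether a CustomResourceDefinition exists on the cluster.
check_crd() {
  if kubectl get crd "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

missing=0

# Assumed CRD names: Prometheus Operator CRDs (for deploying your own
# Prometheus instance) and the KEDA ScaledObject CRD (autoscaling only).
for crd in prometheuses.monitoring.coreos.com \
           servicemonitors.monitoring.coreos.com \
           scaledobjects.keda.sh; do
  check_crd "$crd" || missing=$((missing + 1))
done

# List allocatable GPUs per node; assumes the nvidia.com/gpu resource name.
kubectl get nodes \
  -o custom-columns='NODE:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu' \
  || echo "could not list nodes"

echo "missing CRDs: $missing"
```

A missing KEDA CRD matters only if autoscaling is enabled, so a nonzero count is a prompt to look closer rather than a hard failure.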
Install the latest released version from the Helm repository:

    helm repo add fastml https://fastmachinelearning.org/SuperSONIC
    helm repo update
    helm install <release-name> fastml/supersonic -n <namespace> -f <your-values.yaml>

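For repeated deployments, the same steps can be scripted. The sketch below is one possible approach, not the project's documented procedure: it swaps `helm install` for `helm upgrade --install` so that rerunning it updates an existing release instead of failing, and it prints the commands by default rather than executing them. The release name, namespace, values file name, and the `run` helper are all placeholders chosen for illustration.

```shell
#!/bin/sh
# Sketch: idempotent install from the Helm repository.
# RELEASE, NAMESPACE, and VALUES are placeholder values, and run() is an
# illustrative helper; none of them come from SuperSONIC itself.
RELEASE="${RELEASE:-my-sonic}"
NAMESPACE="${NAMESPACE:-sonic}"
VALUES="${VALUES:-values.yaml}"
DRY_RUN="${DRY_RUN:-1}"   # set DRY_RUN=0 to execute instead of print

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run helm repo add fastml https://fastmachinelearning.org/SuperSONIC
run helm repo update
# List published chart versions so one can be pinned with --version.
run helm search repo fastml/supersonic --versions
# upgrade --install creates the release if absent and upgrades it otherwise.
run helm upgrade --install "$RELEASE" fastml/supersonic \
  -n "$NAMESPACE" --create-namespace -f "$VALUES"
# Confirm the release state and that the pods are starting.
run helm status "$RELEASE" -n "$NAMESPACE"
run kubectl get pods -n "$NAMESPACE"
```

Running the script as-is only echoes the commands, which is a cheap way to review them; setting `DRY_RUN=0` executes them against the current cluster context.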
Install directly from a GitHub branch/tag/commit:

    git clone https://github.com/fastmachinelearning/SuperSONIC.git
    cd SuperSONIC
    git checkout <branch-or-commit>
    helm dependency build helm/supersonic
    helm install <release-name> helm/supersonic -n <namespace> -f <your-values.yaml>

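When installing from a branch or commit, it can help to lint and render the chart locally before touching the cluster. The sketch below assumes it is run from the root of the cloned repository and that a `values.yaml` file exists there. The `preview_chart` helper, the release name `preview`, and the output file `rendered.yaml` are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Sketch: lint and render the local chart before installing it.
# preview_chart is an illustrative helper, not part of SuperSONIC.
preview_chart() {
  chart_dir="$1"   # e.g. helm/supersonic inside the cloned repository
  values="$2"      # your values file
  if [ ! -d "$chart_dir" ]; then
    echo "chart directory not found: $chart_dir"
    return 1
  fi
  # Fetch subcharts, check the chart for problems, then render manifests
  # locally without contacting the cluster.
  helm dependency build "$chart_dir" &&
    helm lint "$chart_dir" -f "$values" &&
    helm template preview "$chart_dir" -f "$values" > rendered.yaml &&
    echo "rendered manifests written to rendered.yaml"
}

# Run from the repository root after `git checkout`.
preview_chart helm/supersonic values.yaml || echo "preview failed"
```

Inspecting `rendered.yaml` shows exactly which Kubernetes objects the chosen commit and values would create.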
To construct the `values.yaml` file for your application, follow the Configuration guide.
The full list of configuration parameters is available in the Configuration reference.
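One possible starting point for a values file, alongside the guide, is to dump the chart's defaults with `helm show values` and trim the result down to the settings you want to override. This is a sketch under the assumption that the `fastml` repository has already been added. The `dump_defaults` helper and the output filename are hypothetical.

```shell
#!/bin/sh
# Sketch: save the chart's default values as a starting point.
# dump_defaults is an illustrative helper; the output filename is arbitrary.
dump_defaults() {
  chart="$1"     # fastml/supersonic (repo install) or helm/supersonic (local clone)
  outfile="$2"
  if ! command -v helm >/dev/null 2>&1; then
    echo "helm not found on PATH"
    return 1
  fi
  # Write to a temporary file first so a failed call leaves no empty output.
  if helm show values "$chart" > "$outfile.tmp"; then
    mv "$outfile.tmp" "$outfile"
    echo "defaults written to $outfile"
  else
    rm -f "$outfile.tmp"
    return 1
  fi
}

dump_defaults fastml/supersonic default-values.yaml || echo "could not dump defaults"
```

The trimmed file can then be passed to `helm install` with `-f`, as in the installation commands above.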
| | CMS | ATLAS | IceCube |
|---|---|---|---|
| Purdue Geddes | ✅ | - | - |
| Purdue Anvil | ✅ | - | - |
| NRP Nautilus | ✅ | ✅ | ✅ |
| UChicago | - | ✅ | - |
Dmitry Kondratyev, Benedikt Riedel, Yuan-Tang Chou, Miles Cochran-Branson, Noah Paladino, David Schultz, Mia Liu, Javier Duarte, Philip Harris, and Shih-Chieh Hsu
SuperSONIC: Cloud-Native Infrastructure for ML Inferencing
In Practice and Experience in Advanced Research Computing 2025: The Power of Collaboration (PEARC '25)
Association for Computing Machinery, New York, NY, USA. Article 29, 1–5. 2025.
https://doi.org/10.1145/3708035.3736049