Repositories list

    • server (Public)
      The Triton Inference Server provides an optimized cloud and edge inferencing solution.
      Python. Updated Jul 30, 2025.
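      The server loads models from a model repository on disk. A minimal sketch of that layout, with an illustrative `config.pbtxt` (the model name, backend choice, and tensor shapes here are assumptions for the example, not taken from this listing):

      ```
      model_repository/
      └── add_sub/              # hypothetical model name
          ├── config.pbtxt      # model configuration
          └── 1/                # numeric version directory
              └── model.py      # e.g. a Python-backend model

      # config.pbtxt (illustrative)
      name: "add_sub"
      backend: "python"
      max_batch_size: 8
      input [
        { name: "INPUT0", data_type: TYPE_FP32, dims: [ 4 ] }
      ]
      output [
        { name: "OUTPUT0", data_type: TYPE_FP32, dims: [ 4 ] }
      ]
      ```

      Pointing the server at this directory (e.g. `tritonserver --model-repository=/path/to/model_repository`) is the usual entry point for the backends listed below.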
    • core (Public)
      The core library and APIs implementing the Triton Inference Server.
      C++. Updated Jul 30, 2025.
    • The Triton backend for the ONNX Runtime.
      C++. Updated Jul 30, 2025.
    • common (Public)
      Common source, scripts and utilities shared across all Triton repositories.
      C++. Updated Jul 28, 2025.
    • The Triton backend for TensorRT.
      C++. Updated Jul 28, 2025.
    • Triton backend that enables pre-processing, post-processing, and other logic to be implemented in Python.
      C++. Updated Jul 28, 2025.
    • The Triton backend that allows running GPU-accelerated data pre-processing pipelines implemented with DALI's Python API.
      C++. Updated Jul 28, 2025.
    • The Triton TensorRT-LLM backend.
      Shell. Updated Jul 25, 2025.
    • tutorials (Public)
      This repository contains tutorials and examples for Triton Inference Server.
      Python. Updated Jul 21, 2025.
    • Triton CLI is an open-source command-line interface that enables users to create, deploy, and profile models served by the Triton Inference Server.
      Python. Updated Jul 21, 2025.
    • Third-party source packages that are modified for use in Triton.
      C. Updated Jul 21, 2025.
    • Simple Triton backend used for testing.
      C++. Updated Jul 21, 2025.
    • An example Triton backend that demonstrates sending zero, one, or multiple responses for each request.
      C++. Updated Jul 21, 2025.
    • TRITONCACHE implementation of a Redis cache.
      C++. Updated Jul 21, 2025.
    • The Triton backend for PyTorch TorchScript models.
      C++. Updated Jul 21, 2025.
    • OpenVINO backend for Triton.
      C++. Updated Jul 21, 2025.
    • Triton Model Analyzer is a CLI tool that helps users understand the compute and memory requirements of models served by the Triton Inference Server.
      Python. Updated Jul 21, 2025.
    • Implementation of a local in-memory cache for Triton Inference Server's TRITONCACHE API.
      C++. Updated Jul 21, 2025.
    • Example Triton backend that demonstrates most of the Triton Backend API.
      C++. Updated Jul 21, 2025.
    • client (Public)
      Triton Python, C++, and Java client libraries, and gRPC-generated client examples for Go, Java, and Scala.
      Python. Updated Jul 21, 2025.
    • The Triton repository agent that verifies model checksums.
      C++. Updated Jul 21, 2025.
    • backend (Public)
      Common source, scripts and utilities for creating Triton backends.
      C++. Updated Jul 21, 2025.
    • FIL backend for the Triton Inference Server.
      Jupyter Notebook. Updated Jul 18, 2025.
    • pytriton (Public)
      PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments.
      Python. Updated Jul 2, 2025.
    • The Triton backend for TensorFlow.
      C++. Updated Jun 18, 2025.
    • Triton Model Navigator is an inference toolkit designed for optimizing and deploying deep learning models, with a focus on NVIDIA GPUs.
      Python. Updated Apr 22, 2025.
    • .github (Public)
      Community health files for NVIDIA Triton.
      Updated Mar 27, 2025.