
    Repositories list

    • Variant optimization autoscaler for distributed inference workloads
      Go
      1821636 Updated Nov 15, 2025
    • llm-d-fast-model-actuation

      Public
      Go
      77386 Updated Nov 14, 2025
    • helm charts for deploying models with llm-d
      Smarty
      352386 Updated Nov 12, 2025
    • hermes

      Public
      Hermes is a cluster configuration scanning and self-test generation tool for llm-d inference workloads
      Rust
      0000 Updated Nov 12, 2025
    • llm-d helm charts and deployment examples
      Shell
      45461519 Updated Oct 2, 2025
    • llm-d-ci

      Public
      Shell
      2200 Updated Aug 6, 2025
    • ig-wva

      Public
      Workload Variant Autoscaler is a service to compute the cost-optimal provisioning of heterogeneous accelerators for inference workloads with varying request latency objectives
      Jupyter Notebook
      1101 Updated Jul 11, 2025