NVIDIA Deep Research Agent Blueprint on Amazon EKS #185

@olaoyea4

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions; they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

What is the outcome that you are trying to reach?

Currently, there is no standardized, end-to-end guide for deploying the NVIDIA AI-Q Research Assistant Blueprint on Amazon EKS. Users who want to run this advanced RAG architecture on AWS must manually configure a complex environment: provisioning multiple specialized GPU node groups, setting up the NVIDIA GPU Operator, integrating with AWS services such as OpenSearch Serverless and load balancers, and correctly configuring IAM Roles for Service Accounts (IRSA) for secure access. This complexity creates a high barrier to adoption for teams wanting to build powerful research assistants on EKS.

The goal is to publish a repeatable, end-to-end solution for deploying the NVIDIA Deep Research Agent Blueprint on Amazon EKS. This will provide the community with a powerful, enterprise-grade deep research agent solution using NVIDIA NIMs on AWS infrastructure.

Describe the solution you would like

I propose adding a new solution that provides a comprehensive guide and assets for deploying the NVIDIA Deep Research Agent Blueprint. This solution will include:

  • Infrastructure as Code to provision an EKS cluster with multiple specialized GPU node groups for all the NVIDIA NIMs in the solution
  • Kubernetes manifests/Helm charts to deploy the solution, including the 49B Llama Nemotron model, NeMo Retriever NIMs, data ingestion services, and other components
  • Integration with key AWS services, such as Amazon OpenSearch Serverless for vector storage, IAM Roles for Service Accounts (IRSA) for secure access, and the AWS Load Balancer Controller for exposing services
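To make the IRSA piece concrete, here is a minimal sketch of the IAM trust policy that such a setup typically relies on: the role trusts the cluster's OIDC provider and is restricted to a single Kubernetes service account. The account ID, OIDC provider path, namespace, and service account name below are hypothetical placeholders, not values from the blueprint, and the actual solution may generate this policy through its Infrastructure as Code rather than by hand.

```python
import json


def irsa_trust_policy(account_id, oidc_provider, namespace, service_account):
    """Build an IAM trust policy that lets one Kubernetes service account assume a role.

    oidc_provider is the cluster's OIDC issuer without the "https://" prefix.
    """
    provider_arn = f"arn:aws:iam::{account_id}:oidc-provider/{oidc_provider}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Federated": provider_arn},
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Token must be issued for AWS STS
                        f"{oidc_provider}:aud": "sts.amazonaws.com",
                        # Only this namespace/service account may assume the role
                        f"{oidc_provider}:sub": (
                            f"system:serviceaccount:{namespace}:{service_account}"
                        ),
                    }
                },
            }
        ],
    }


# All values below are hypothetical placeholders for illustration only.
policy = irsa_trust_policy(
    account_id="111122223333",
    oidc_provider="oidc.eks.us-west-2.amazonaws.com/id/EXAMPLE0123456789",
    namespace="rag",
    service_account="opensearch-client",
)
print(json.dumps(policy, indent=2))
```

Scoping the `sub` condition to one namespace and service account keeps access to services such as OpenSearch Serverless limited to the pods that need it, instead of granting it to every workload on the GPU nodes.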

Describe alternatives you have considered

The alternative is for individual users to manually adapt NVIDIA's general deployment guides for EKS. This is a complex and time-consuming process that involves significant manual configuration of networking, IAM policies, OIDC providers, GPU drivers, and Kubernetes storage. This manual approach is error-prone and presents a high barrier to entry. A dedicated ai-on-eks solution would drastically simplify and standardize the process.

Additional context

Architecture diagram provided:

Image
