Skip to content

Conversation

atchernych
Copy link
Contributor

@atchernych atchernych commented Oct 14, 2025

Overview:

DEP-508

Details:

Where should the reviewer start?

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)

  • closes GitHub issue: #xxx

Summary by CodeRabbit

  • New Features
    • Added GAIE deployment path with a new deploy script, Kubernetes manifests, and a run flag to enable GAIE.
    • Introduced two routing modes: Dynamo-backed (default) and black-box EPP, with easy switching via configuration.
  • Documentation
    • Expanded setup and usage guides with step-by-step instructions, examples, and notes for both routing modes.
    • Added verification steps and updated commands for installation and deployment.
  • Refactor
    • Simplified Helm values and templates; consolidated configuration, new defaults, and clearer environment settings.

@atchernych atchernych requested review from a team as code owners October 14, 2025 02:25
@github-actions github-actions bot added the feat label Oct 14, 2025
Copy link
Contributor

coderabbitai bot commented Oct 14, 2025

Walkthrough

Refactors Inference Gateway to default to Dynamo EPP routing with a configurable fallback to black-box EPP. Updates Helm chart templates and values to use epp.useDynamo and new epp.dynamo.* keys. Adds GAIE deployment scripts, GAIE-specific Kubernetes manifests for a vLLM aggregation recipe, and documentation updates, plus a GAIE flag path in recipes/run.sh.

Changes

Cohort / File(s) Summary
Inference Gateway: Docs & Helm
deploy/inference-gateway/README.md, deploy/inference-gateway/helm/dynamo-gaie/templates/dynamo-epp.yaml, deploy/inference-gateway/helm/dynamo-gaie/values.yaml, deploy/inference-gateway/values-blackbox-epp.yaml
Default to Dynamo EPP; introduce epp.useDynamo and epp.dynamo.* config; simplify image selection; guard mounts/env by useDynamo; add black-box values file; expand README with two-mode setup and config/env details; update install steps and examples.
Recipes: Docs
recipes/README.md
Adds GAIE usage notes, env guidance, and --gaie example under running examples.
Recipes: vLLM Agg tweak
recipes/llama-3-70b/vllm/agg/deploy.yaml
Removes dynamoNamespace from Frontend service spec; no other changes.
GAIE Deploy Script
recipes/llama-3-70b/vllm/agg/gaie/deploy.sh
New Helm deployment script for GAIE with pre-flight checks, namespace ensure, dependency build, values aggregation, and passthrough overrides.
GAIE Manifests (vLLM agg)
recipes/llama-3-70b/vllm/agg/gaie/k8s-manifests/*
Adds kustomize-based manifests: RBAC, EPP ConfigMap/Deployment/Service, InferencePool/Model, HTTPRoute, kustomization with namespace replacements; README with apply steps and prerequisites.
GAIE Model Config
recipes/llama-3-70b/vllm/agg/gaie/model-gaie.yaml
New GAIE defaults for dynamoNamespace and model identifiers.
Run Orchestration
recipes/run.sh
Adds --gaie flag; branches to GAIE deploy script and exits early when enabled; preserves existing non-GAIE flow.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor User
  participant run.sh as recipes/run.sh
  participant GAIE Deploy as gaie/deploy.sh
  participant Helm
  participant K8s

  User->>run.sh: ./run.sh [--gaie] [args]
  alt GAIE enabled
    run.sh->>GAIE Deploy: exec deploy.sh (passthrough args)
    GAIE Deploy->>GAIE Deploy: Pre-flight checks (helm/kubectl)
    GAIE Deploy->>K8s: Ensure namespace
    GAIE Deploy->>Helm: helm dependency build
    GAIE Deploy->>Helm: helm upgrade --install (GAIE chart, values)
    Helm->>K8s: Apply/Update resources
    GAIE Deploy-->>run.sh: Success/Failure
    run.sh-->>User: Exit after GAIE deployment
  else GAIE disabled
    run.sh->>K8s: Standard deploy
    run.sh->>User: Continue to benchmark
  end
Loading
sequenceDiagram
  autonumber
  participant Values as values.yaml
  participant Template as dynamo-epp.yaml
  participant Pod as EPP Pod

  Values-->>Template: epp.useDynamo = true<br/>epp.dynamo.* set
  alt useDynamo = true
    Template-->>Pod: Mount configMap, set env (ETCD, NATS, DYN_*, KV)
    Template-->>Pod: Use extension.image, epp.configFile
  else useDynamo = false
    Template-->>Pod: Omit Dynamo-specific mounts/env
    Template-->>Pod: Use black-box EPP image
  end
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

A rabbit taps helm charts with glee,
Flips a flag: “GAIE or not to be?”
Dynamo routes hop true and fine,
Black-box paths wait on the line.
Kustomize fields, namespaces align—
Ship it swift; carrots at nine! 🥕🐇

Pre-merge checks

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Description Check ⚠️ Warning The pull request description only contains a placeholder reference (“DEP-508”) and leaves the Details and Where should the reviewer start sections unfilled, so it does not adhere to the repository’s description template or provide sufficient context for reviewers. Please populate the Overview with a high-level summary of the changes, fill out the Details section with key modifications, specify files reviewers should inspect, and confirm or update the Related Issues section to fully align with the template.
✅ Passed checks (2 passed)
Check name Status Explanation
Title Check ✅ Passed The title succinctly captures the primary enhancement of streamlining the GAIE deployment recipe and aligns with the extensive refactoring and documentation updates introduced in this pull request.
Docstring Coverage ✅ Passed No functions found in the changes. Docstring coverage check skipped.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
recipes/run.sh (1)

31-56: Update usage/help to include --gaie and GAIE/EPP_IMAGE envs

Add --gaie to Optional, and document EPP_IMAGE (required by GAIE deploy script). Example:

   echo "Optional:"
   echo "  --namespace <ns>   Kubernetes namespace (default: dynamo)"
+  echo "  --gaie             Use GAIE integration path"
+  echo "  --epp-image <img>  EPP image (required when --gaie)"
   echo "  --skip-model-cache Skip model downloading (assumes model cache already exists)"
...
   echo "Environment Variables:"
   echo "  NAMESPACE          Kubernetes namespace (default: dynamo)"
+  echo "  EPP_IMAGE          EPP image (alternative to --epp-image) used by GAIE deploy"

Add CLI parsing:

         --namespace)
             ...
             ;;
+        --epp-image)
+            if [ "$2" ]; then
+                EPP_IMAGE=$2
+                shift 2
+            else
+                missing_requirement "$1"
+            fi
+            ;;

Also applies to: 68-76, 123-145

deploy/inference-gateway/helm/dynamo-gaie/templates/dynamo-epp.yaml (1)

70-95: Restore scoped Dynamo variables before using them

$ns, $comp, and $kv are declared inside the first if block but then referenced in a later if. In Go/Helm templates, variables declared inside a block are scoped to that block, so the later references decay to <no value>. When epp.useDynamo is true we end up projecting empty env vars (ETCD is fine, but DYN_NAMESPACE, DYNAMO_COMPONENT, DYNAMO_KV_BLOCK_SIZE all render blank), so the EPP container never receives the required Dynamo configuration. Please lift the assignments out of the inner block (or keep everything inside a single guard) so the values survive until the env section. Suggested fix:

-        {{- if .Values.epp.useDynamo }}
-        {{- $ns := required "set epp.dynamo.namespace via values" .Values.epp.dynamo.namespace -}}
-        {{- $comp := default "backend" .Values.epp.dynamo.component -}}
-        {{- $kv := default "16" .Values.epp.dynamo.kvBlockSize -}}
-
-        {{- end }}
+        {{- $ns := "" -}}
+        {{- $comp := "" -}}
+        {{- $kv := "" -}}
+        {{- if .Values.epp.useDynamo }}
+        {{- $ns = required "set epp.dynamo.namespace via values" .Values.epp.dynamo.namespace -}}
+        {{- $comp = default "backend" .Values.epp.dynamo.component -}}
+        {{- $kv = default "16" .Values.epp.dynamo.kvBlockSize -}}
+        {{- end }}
deploy/inference-gateway/README.md (1)

270-274: Fix the kubectl command with duplicate output flags

kubectl get svc … -o yaml -o jsonpath=… won’t run because kubectl only accepts a single -o. The second -o triggers “flag provided but not defined: -o”, so readers can’t actually retrieve the ClusterIP. Please drop the extra -o yaml (or split into two commands). For example:

-kubectl get svc inference-gateway -n my-model -o yaml -o jsonpath='{.spec.clusterIP}'
+kubectl get svc inference-gateway -n my-model -o jsonpath='{.spec.clusterIP}'
🧹 Nitpick comments (2)
recipes/llama-3-70b/vllm/agg/gaie/k8s-manifests/README.md (1)

65-75: Fix kustomize command syntax

Command should be:

  • kustomize edit set namespace "${NAMESPACE}"
    (no -n flag)
recipes/llama-3-70b/vllm/agg/gaie/k8s-manifests/epp-configmap.yaml (1)

32-49: Unused dyn-pre plugin; add to profile to take effect

You define dynamo-inject-workerid as dyn-pre but never reference it in schedulingProfiles. Add it before dyn-kv if required.

Example:

     schedulingProfiles:
       - name: default
         plugins:
+          - pluginRef: dyn-pre
+            weight: 1
           - pluginRef: dyn-kv
             weight: 1
           - pluginRef: picker
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cd2389b and 6e1684c.

📒 Files selected for processing (19)
  • deploy/inference-gateway/README.md (11 hunks)
  • deploy/inference-gateway/helm/dynamo-gaie/templates/dynamo-epp.yaml (3 hunks)
  • deploy/inference-gateway/helm/dynamo-gaie/values.yaml (1 hunks)
  • deploy/inference-gateway/values-blackbox-epp.yaml (1 hunks)
  • recipes/README.md (1 hunks)
  • recipes/llama-3-70b/vllm/agg/deploy.yaml (0 hunks)
  • recipes/llama-3-70b/vllm/agg/gaie/deploy.sh (1 hunks)
  • recipes/llama-3-70b/vllm/agg/gaie/k8s-manifests/README.md (1 hunks)
  • recipes/llama-3-70b/vllm/agg/gaie/k8s-manifests/cluster-role-binding.yaml (1 hunks)
  • recipes/llama-3-70b/vllm/agg/gaie/k8s-manifests/cluster-role.yaml (1 hunks)
  • recipes/llama-3-70b/vllm/agg/gaie/k8s-manifests/epp-configmap.yaml (1 hunks)
  • recipes/llama-3-70b/vllm/agg/gaie/k8s-manifests/epp-deployment.yaml (1 hunks)
  • recipes/llama-3-70b/vllm/agg/gaie/k8s-manifests/epp-service.yaml (1 hunks)
  • recipes/llama-3-70b/vllm/agg/gaie/k8s-manifests/http-route.yaml (1 hunks)
  • recipes/llama-3-70b/vllm/agg/gaie/k8s-manifests/inference-model.yaml (1 hunks)
  • recipes/llama-3-70b/vllm/agg/gaie/k8s-manifests/inference-pool.yaml (1 hunks)
  • recipes/llama-3-70b/vllm/agg/gaie/k8s-manifests/kustomization.yaml (1 hunks)
  • recipes/llama-3-70b/vllm/agg/gaie/model-gaie.yaml (1 hunks)
  • recipes/run.sh (4 hunks)
💤 Files with no reviewable changes (1)
  • recipes/llama-3-70b/vllm/agg/deploy.yaml
🧰 Additional context used
🪛 Checkov (3.2.334)
recipes/llama-3-70b/vllm/agg/gaie/k8s-manifests/epp-deployment.yaml

[medium] 17-102: Containers should not run with allowPrivilegeEscalation

(CKV_K8S_20)


[medium] 17-102: Minimize the admission of root containers

(CKV_K8S_23)

🪛 markdownlint-cli2 (0.18.1)
deploy/inference-gateway/README.md

169-169: Heading levels should only increment by one level at a time
Expected: h4; Actual: h5

(MD001, heading-increment)


169-169: Heading style
Expected: atx; Actual: atx_closed

(MD003, heading-style)

🪛 Shellcheck (0.11.0)
recipes/run.sh

[error] 102-102: Couldn't parse this case item. Fix to allow more checks.

(SC1073)


[error] 102-102: Expected ) to open a new case item. Fix any mentioned problems and try again.

(SC1072)


[error] 102-102: Did you forget to move the ;; after extending this case item?

(SC1085)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (6)
  • GitHub Check: trtllm (arm64)
  • GitHub Check: vllm (arm64)
  • GitHub Check: vllm (amd64)
  • GitHub Check: sglang
  • GitHub Check: trtllm (amd64)
  • GitHub Check: Build and Test - dynamo
🔇 Additional comments (4)
deploy/inference-gateway/helm/dynamo-gaie/values.yaml (1)

51-58: Values align with Dynamo-first mode; minor note on kvBlockSize

Looks good and consistent with the new templates (extension.image, epp.useDynamo, epp.dynamo.*). Note: kvBlockSize here is "16" while the k8s manifest sets DYNAMO_KV_BLOCK_SIZE "128". Please reconcile to avoid confusion.

Also applies to: 59-78, 79-81

recipes/llama-3-70b/vllm/agg/gaie/k8s-manifests/epp-configmap.yaml (1)

39-43: Verify frontendURL reachability

frontendURL is http://127.0.0.1:8000/v1/chat/completions. Ensure the target server runs in the same pod/localhost. If not, point this to the correct service/port.

recipes/llama-3-70b/vllm/agg/gaie/k8s-manifests/inference-model.yaml (1)

17-28: LGTM

Resource spec is consistent with the referenced pool and model metadata.

recipes/llama-3-70b/vllm/agg/gaie/k8s-manifests/epp-deployment.yaml (1)

71-75: KV block size mismatch

DYNAMO_KV_BLOCK_SIZE is "128" here, but helm values default to "16". Please align or document why they differ.

Comment on lines 24 to 27
# Per-recipe values
: "${RECIPE_VALUES_1:=model-gaie.yaml}"
: "${RECIPE_VALUES_2:=values-epp-aware.yaml}"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

🧩 Analysis chain

Update Helm flags to new values API (useDynamo + extension.image)

Chart no longer uses eppAware.*. Use epp.useDynamo and extension.image so EPP_IMAGE is honored. Also refresh example/comment and stale values filename.

Apply:

-: "${RECIPE_VALUES_2:=values-epp-aware.yaml}"
+: "${RECIPE_VALUES_2:=values.yaml}"  # optional per recipe

-#   ./deploy.sh --set eppAware.extraEnv[0].name=FOO --set eppAware.extraEnv[0].value=bar
+#   ./deploy.sh --set epp.extraEnv[0].name=FOO --set epp.extraEnv[0].value=bar

 helm upgrade --install "$RELEASE" "$CHART_DIR" \
   -n "$NAMESPACE" \
   "${VALUES_ARGS[@]}" \
-  --set eppAware.enabled=true \
-  --set-string eppAware.eppImage="$EPP_IMAGE" \
+  --set epp.useDynamo=true \
+  --set-string extension.image="$EPP_IMAGE" \
   "${EXTRA_ARGS[@]}"

Run to confirm no eppAware references remain in chart:

Also applies to: 66-83


🏁 Script executed:

#!/bin/bash
rg -nP --type=yaml 'eppAware' "$CHART_DIR"/templates "$CHART_DIR"/values.yaml || true

Length of output: 103


#!/bin/bash
# Locate CHART_DIR assignment in deploy.sh
rg -n 'CHART_DIR' -n recipes/llama-3-70b/vllm/agg/gaie/deploy.sh

# Find the Helm chart directory by locating Chart.yaml
find recipes/llama-3-70b -maxdepth 4 -type f -name Chart.yaml

# Derive chart directory and list its contents
chart_dir=$(find recipes/llama-3-70b -maxdepth 4 -type f -name Chart.yaml | head -n1 | xargs dirname)
echo "Chart dir: $chart_dir"
find "$chart_dir" -type f

# Search for any remaining 'eppAware' references
rg -nP 'eppAware' -C3 "$chart_dir"

🏁 Script executed:

#!/bin/bash
# 1. Show CHART_DIR assignment in deploy.sh
rg -n 'CHART_DIR' -n recipes/llama-3-70b/vllm/agg/gaie/deploy.sh

# 2. Find the Helm chart location
chart_file=$(find recipes/llama-3-70b -type f -name Chart.yaml | head -n1)
echo "Chart.yaml found at: $chart_file"
chart_dir=$(dirname "$chart_file")
echo "Chart directory: $chart_dir"

# 3. List chart contents
find "$chart_dir" -maxdepth 2 -type f

# 4. Search for any remaining 'eppAware' references
rg -nP 'eppAware' -C3 "$chart_dir"

Length of output: 3438


🏁 Script executed:

#!/bin/bash
# Extract chart path
chart_dir=$(rg -m1 -oP '(?<=CHART_DIR=").*(?=")' recipes/llama-3-70b/vllm/agg/gaie/deploy.sh)
echo "Chart dir: $chart_dir"

# List values files
echo "Values files:"
ls "$chart_dir"/*values*.yaml || true

# Search for deprecated eppAware
echo "eppAware references:"
rg -nP 'eppAware' -C3 "$chart_dir" || true

# Search for new flags
echo "epp.useDynamo references:"
rg -nP 'epp.useDynamo' -C3 "$chart_dir" || true
echo "extension.image references:"
rg -nP 'extension.image' -C3 "$chart_dir" || true

Length of output: 545


🏁 Script executed:

#!/bin/bash
chart_dir="deploy/inference-gateway/helm/dynamo-gaie"

echo "Chart dir exists: $( [ -d "$chart_dir" ] && echo yes || echo no )"
echo

echo "Values files in $chart_dir:"
ls "$chart_dir"/*values*.yaml || true
echo

echo "Template files in $chart_dir/templates:"
ls "$chart_dir/templates" || true
echo

echo "Deprecated 'eppAware' references:"
rg -nP 'eppAware' -C3 "$chart_dir" || true
echo

echo "New flags 'epp.useDynamo' references:"
rg -nP 'epp.useDynamo' -C3 "$chart_dir" || true
echo

echo "New flags 'extension.image' references:"
rg -nP 'extension.image' -C3 "$chart_dir" || true

Length of output: 4798


Update deploy.sh to use new Helm flags and default values file

Chart no longer uses eppAware.*. Switch to epp.useDynamo and extension.image, update example and default values file.

-: "${RECIPE_VALUES_2:=values-epp-aware.yaml}"
+: "${RECIPE_VALUES_2:=values.yaml}"

-#   ./deploy.sh --set eppAware.extraEnv[0].name=FOO --set eppAware.extraEnv[0].value=bar
+#   ./deploy.sh --set epp.extraEnv[0].name=FOO --set epp.extraEnv[0].value=bar

-  --set eppAware.enabled=true \
-  --set-string eppAware.eppImage="$EPP_IMAGE" \
+  --set epp.useDynamo=true \
+  --set-string extension.image="$EPP_IMAGE" \
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Per-recipe values
: "${RECIPE_VALUES_1:=model-gaie.yaml}"
: "${RECIPE_VALUES_2:=values-epp-aware.yaml}"
# Per-recipe values
: "${RECIPE_VALUES_1:=model-gaie.yaml}"
: "${RECIPE_VALUES_2:=values.yaml}"
# ./deploy.sh --set epp.extraEnv[0].name=FOO --set epp.extraEnv[0].value=bar
--set epp.useDynamo=true \
--set-string extension.image="$EPP_IMAGE" \

Comment on lines 64 to 77
env:
- name: ETCD_ENDPOINTS
value: "dynamo-platform-etcd.dynamo:2379" # UPDATE namespace if needed
- name: NATS_SERVER
value: "nats://dynamo-platform-nats.dynamo:4222" # UPDATE namespace if needed
- name: DYN_NAMESPACE
value: "llama3-70b-agg"
- name: DYNAMO_COMPONENT
value: "backend"
- name: DYNAMO_KV_BLOCK_SIZE
value: "128" # UPDATE if you change --block-size in your deploy.yaml engine command
- name: USE_STREAMING
value: "true"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Make endpoints namespace-dynamic via Downward API; avoid hardcoding 'dynamo'

Use metadata.namespace and env expansion to keep ETCD/NATS correct across namespaces:

         env:
+          - name: NAMESPACE
+            valueFrom:
+              fieldRef:
+                fieldPath: metadata.namespace
           - name: ETCD_ENDPOINTS
-            value: "dynamo-platform-etcd.dynamo:2379"  # UPDATE namespace if needed
+            value: "dynamo-platform-etcd.$(NAMESPACE):2379"
           - name: NATS_SERVER
-            value: "nats://dynamo-platform-nats.dynamo:4222"  # UPDATE namespace if needed
+            value: "nats://dynamo-platform-nats.$(NAMESPACE):4222"
           - name: DYN_NAMESPACE
-            value: "llama3-70b-agg"
+            value: "llama3-70b-agg"  # Verify matches backend labels

This also lets you drop fragile kustomize replacements. Align DYN_NAMESPACE with labels used by backend pods and docs.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
env:
- name: ETCD_ENDPOINTS
value: "dynamo-platform-etcd.dynamo:2379" # UPDATE namespace if needed
- name: NATS_SERVER
value: "nats://dynamo-platform-nats.dynamo:4222" # UPDATE namespace if needed
- name: DYN_NAMESPACE
value: "llama3-70b-agg"
- name: DYNAMO_COMPONENT
value: "backend"
- name: DYNAMO_KV_BLOCK_SIZE
value: "128" # UPDATE if you change --block-size in your deploy.yaml engine command
- name: USE_STREAMING
value: "true"
env:
- name: NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.namespace
- name: ETCD_ENDPOINTS
value: "dynamo-platform-etcd.$(NAMESPACE):2379"
- name: NATS_SERVER
value: "nats://dynamo-platform-nats.$(NAMESPACE):4222"
- name: DYN_NAMESPACE
value: "llama3-70b-agg" # Verify matches backend labels
- name: DYNAMO_COMPONENT
value: "backend"
- name: DYNAMO_KV_BLOCK_SIZE
value: "128" # UPDATE if you change --block-size in your deploy.yaml engine command
- name: USE_STREAMING
value: "true"

Comment on lines 27 to 57
# Replacements - automatically update namespace in DNS URLs
replacements:
# Replace namespace in ETCD_ENDPOINTS
- source:
kind: Deployment
name: llama3-70b-agg
fieldPath: metadata.namespace
targets:
- select:
kind: Deployment
name: llama3-70b-agg
fieldPaths:
- spec.template.spec.containers.[name=epp].env.[name=ETCD_ENDPOINTS].value
options:
delimiter: '.'
index: 1
# Replace namespace in NATS_SERVER
- source:
kind: Deployment
name: llama3-70b-agg
fieldPath: metadata.namespace
targets:
- select:
kind: Deployment
name: llama3-70b-agg
fieldPaths:
- spec.template.spec.containers.[name=epp].env.[name=NATS_SERVER].value
options:
delimiter: '.'
index: 1
# Replace namespace in -poolNamespace argument
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Fragile replacements will strip ports from ETCD/NATS endpoints

Splitting on '.' with index 1 for:

  • ETCD_ENDPOINTS: "dynamo-platform-etcd.dynamo:2379"
  • NATS_SERVER: "nats://dynamo-platform-nats.dynamo:4222"

replaces token "dynamo:2379"/"dynamo:4222", dropping the port. Resulting values will miss ":2379"/":4222". Use Downward API + env var expansion instead and remove these two replacements.

Suggested approach:

  • In Deployment env, add:
    • NAMESPACE from metadata.namespace
    • ETCD_ENDPOINTS: "dynamo-platform-etcd.$(NAMESPACE):2379"
    • NATS_SERVER: "nats://dynamo-platform-nats.$(NAMESPACE):4222"
  • Keep the args.3 and ClusterRoleBinding subject replacements.

Also applies to: 63-79

🤖 Prompt for AI Agents
In recipes/llama-3-70b/vllm/agg/gaie/k8s-manifests/kustomization.yaml around
lines 27-57 (and also apply same change to lines 63-79), the current
replacements split endpoint strings on '.' which strips ports (e.g. ":2379" and
":4222"); remove the two replacement blocks that target ETCD_ENDPOINTS and
NATS_SERVER and instead modify the Deployment manifest to use the Downward API:
add an env var NAMESPACE from metadata.namespace and set ETCD_ENDPOINTS to
"dynamo-platform-etcd.$(NAMESPACE):2379" and NATS_SERVER to
"nats://dynamo-platform-nats.$(NAMESPACE):4222" so the namespace is injected
without string-splitting; leave the existing args.3 and ClusterRoleBinding
subject replacements unchanged.

Comment on lines 57 to 60
4. Ensure your Dynamo backend pods have the appropriate labels:
- `nvidia.com/dynamo-component: Frontend`
- `nvidia.com/dynamo-namespace: vllm-agg`

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Component label mismatch

You list:

  • nvidia.com/dynamo-component: Frontend

Deployment/env use component "backend". Align the label example accordingly to avoid routing issues.

🤖 Prompt for AI Agents
In recipes/llama-3-70b/vllm/agg/gaie/k8s-manifests/README.md around lines 57 to
60, the example uses the label nvidia.com/dynamo-component: Frontend which
conflicts with the deployment/environment that expects "backend"; update the
example label to nvidia.com/dynamo-component: backend so the Dynamo backend pods
are correctly labeled and will be routed to by the deployment.

Comment on lines +79 to +80
# Add --gaie argument to the script i.e.:
./recipes/run.sh --model llama-3-70b --framework vllm --gaie agg
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fix GAIE run command path.

README examples above run from the recipes/ directory using ./run.sh, but this new GAIE snippet switches to ./recipes/run.sh, which will 404 when executed from the same directory. Keep the invocation consistent (i.e., ./run.sh … --gaie) so the command actually works.

🤖 Prompt for AI Agents
In recipes/README.md around lines 79-80, the GAIE example uses an incorrect path
"./recipes/run.sh" while surrounding examples run from the recipes/ directory
using "./run.sh"; update the snippet to use "./run.sh --model llama-3-70b
--framework vllm --gaie agg" so the invocation is consistent and will execute
correctly from the recipes/ directory.

Comment on lines +102 to 106
+ --gaie)
+ GAIE=true
+ shift
+ ;;
-h|--help)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Fix case item for --gaie (ShellCheck SC1073/SC1072/SC1085)

Ensure the case entry is well-formed and ends with ';;'. Example:

         --namespace)
             if [ "$2" ]; then
                 NAMESPACE=$2
                 shift 2
             else
                 missing_requirement "$1"
             fi
             ;;
-+       --gaie)
-+            GAIE=true
-+            shift
-+            ;;
+        --gaie)
+            GAIE=true
+            shift
+            ;;
🧰 Tools
🪛 Shellcheck (0.11.0)

[error] 102-102: Couldn't parse this case item. Fix to allow more checks.

(SC1073)


[error] 102-102: Expected ) to open a new case item. Fix any mentioned problems and try again.

(SC1072)


[error] 102-102: Did you forget to move the ;; after extending this case item?

(SC1085)

🤖 Prompt for AI Agents
In recipes/run.sh around lines 102 to 106, the case pattern for --gaie is
missing its terminating ';;', which makes the case clause malformed; add the
missing ';;' after the GAIE=true and shift lines so the case entry is
well-formed and closes properly (i.e., ensure the --gaie) block ends with ';;').

Comment on lines +223 to +233
if [[ "$INTEGRATION" == "gaie" ]]; then
if [[ -x "$INTEG_DEPLOY_SCRIPT" ]]; then
$DRY_RUN "$INTEG_DEPLOY_SCRIPT"
else
echo "Error: Expected executable '$INTEG_DEPLOY_SCRIPT' for GAIE integration."
echo "Hint: create $INTEG_DEPLOY_SCRIPT and make it executable (chmod +x)."
exit 1
fi
# For now do not run the benchmark
exit
fi
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Pass NAMESPACE/EPP_IMAGE to GAIE deploy.sh; fail fast if missing

GAIE deploy.sh requires EPP_IMAGE and uses NAMESPACE. Since shell vars aren't exported by default, export/pass them so deploy.sh honors --namespace and image.

 if [[ "$INTEGRATION" == "gaie" ]]; then
     if [[ -x "$INTEG_DEPLOY_SCRIPT" ]]; then
-        $DRY_RUN "$INTEG_DEPLOY_SCRIPT"
+        if [[ -z "${EPP_IMAGE:-}" ]]; then
+            echo "Error: --epp-image or EPP_IMAGE env is required for GAIE."
+            exit 1
+        fi
+        # Ensure env visible to child
+        NAMESPACE="$NAMESPACE" EPP_IMAGE="$EPP_IMAGE" $DRY_RUN "$INTEG_DEPLOY_SCRIPT"
     else
         echo "Error: Expected executable '$INTEG_DEPLOY_SCRIPT' for GAIE integration."
         echo "Hint: create $INTEG_DEPLOY_SCRIPT and make it executable (chmod +x)."
         exit 1
     fi
     # For now do not run the benchmark
     exit
 fi
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
if [[ "$INTEGRATION" == "gaie" ]]; then
if [[ -x "$INTEG_DEPLOY_SCRIPT" ]]; then
$DRY_RUN "$INTEG_DEPLOY_SCRIPT"
else
echo "Error: Expected executable '$INTEG_DEPLOY_SCRIPT' for GAIE integration."
echo "Hint: create $INTEG_DEPLOY_SCRIPT and make it executable (chmod +x)."
exit 1
fi
# For now do not run the benchmark
exit
fi
if [[ "$INTEGRATION" == "gaie" ]]; then
if [[ -x "$INTEG_DEPLOY_SCRIPT" ]]; then
if [[ -z "${EPP_IMAGE:-}" ]]; then
echo "Error: --epp-image or EPP_IMAGE env is required for GAIE."
exit 1
fi
# Ensure env visible to child
NAMESPACE="$NAMESPACE" EPP_IMAGE="$EPP_IMAGE" $DRY_RUN "$INTEG_DEPLOY_SCRIPT"
else
echo "Error: Expected executable '$INTEG_DEPLOY_SCRIPT' for GAIE integration."
echo "Hint: create $INTEG_DEPLOY_SCRIPT and make it executable (chmod +x)."
exit 1
fi
# For now do not run the benchmark
exit
fi
🤖 Prompt for AI Agents
In recipes/run.sh around lines 223 to 233, the GAIE deploy script requires
EPP_IMAGE and NAMESPACE but those variables are not exported/passed so the
deploy script may not see them; add a pre-check that fails fast if either
EPP_IMAGE or NAMESPACE is empty/unset and then invoke the deploy script with
those variables exported or prefixed (e.g. export EPP_IMAGE and NAMESPACE before
calling, or call with EPP_IMAGE="$EPP_IMAGE" NAMESPACE="$NAMESPACE" $DRY_RUN
"$INTEG_DEPLOY_SCRIPT") so the deploy.sh receives the intended values.

Signed-off-by: Anna Tchernych <[email protected]>
Signed-off-by: Anna Tchernych <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant