chore: added CI model layer caching #168
base: main
Conversation
Pull Request Overview
This PR adds model layer caching to the CI/CD pipeline to avoid re-downloading large language models during each build, improving build times and reducing bandwidth usage.
Key Changes:
- Added Docker BuildKit registry cache support for all component images
- Implemented optional model pre-downloading in the vLLM Dockerfile during build time
- Configured GitHub Container Registry as the cache backend with automatic login
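The GHCR-backed registry cache described in these changes could be wired up roughly like this (a sketch, not the PR's actual workflow; step names, cache refs, and tags are illustrative):

```yaml
# Hypothetical workflow excerpt: authenticate to GHCR, then push/pull
# BuildKit cache layers through the registry so later builds reuse them.
- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v3

- name: Log in to GHCR
  uses: docker/login-action@v3
  with:
    registry: ghcr.io
    username: ${{ github.actor }}
    password: ${{ secrets.GITHUB_TOKEN }}

- name: Build with registry cache
  run: |
    docker buildx build \
      --cache-from=type=registry,ref=ghcr.io/${{ github.repository }}/cache:${{ matrix.component }} \
      --cache-to=type=registry,ref=ghcr.io/${{ github.repository }}/cache:${{ matrix.component }},mode=max \
      -f docker/${{ matrix.component }}.Dockerfile .
```

`mode=max` caches all intermediate layers (including the model pre-download layer), at the cost of a larger cache artifact in the registry.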
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| docker/vllm.Dockerfile | Added build arguments and RUN command to optionally pre-download HuggingFace models during image build for layer caching |
| .github/workflows/cicd.yml | Added Docker Buildx setup, GHCR authentication, registry cache configuration, and vLLM-specific model caching build arguments |
.github/workflows/cicd.yml (Outdated)
```shell
# Add model caching for vllm component
EXTRA_BUILD_ARGS=""
if [ "${{ matrix.component }}" = "vllm" ] && [ -n "${{ matrix.model_to_cache || '' }}" ]; then
  EXTRA_BUILD_ARGS="--build-arg MODEL_TO_CACHE=${{ matrix.model_to_cache }} --build-arg HF_TOKEN=${{ secrets.HF_TOKEN }}"
```
Copilot AI (Nov 14, 2025)
The HF_TOKEN build argument should be passed as a secret mount instead of a regular build argument to avoid exposing the token in build history and cached layers. Use --secret id=hf_token,env=HF_TOKEN in the docker build command and access it via RUN --mount=type=secret,id=hf_token in the Dockerfile for better security.
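The secret-mount approach this comment recommends could look like the following sketch (illustrative, not the PR's actual code; the secret id `hf_token` and the `/run/secrets` path follow BuildKit's defaults):

```dockerfile
# Dockerfile side: the token is mounted only for the duration of this RUN
# step and is never written to an image layer or shown in `docker history`.
RUN --mount=type=secret,id=hf_token \
    if [ -n "$MODEL_TO_CACHE" ]; then \
        HF_TOKEN="$(cat /run/secrets/hf_token)" \
        python3 -c "from huggingface_hub import snapshot_download; snapshot_download('$MODEL_TO_CACHE', cache_dir='/root/.cache/huggingface')"; \
    fi
```

On the workflow side, the build command would pass the token from the environment with `--secret id=hf_token,env=HF_TOKEN` instead of `--build-arg HF_TOKEN=...`.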
docker/vllm.Dockerfile (Outdated)
```dockerfile
RUN if [ -n "$MODEL_TO_CACHE" ]; then \
    echo "Pre-downloading model: $MODEL_TO_CACHE"; \
    export HF_TOKEN="${HF_TOKEN}"; \
    python3 -c "from huggingface_hub import snapshot_download; snapshot_download('$MODEL_TO_CACHE', cache_dir='/root/.cache/huggingface')"; \
```
Copilot AI (Nov 14, 2025)
Missing error handling for model download failures. If snapshot_download fails (e.g., due to network issues, an invalid token, or a non-existent model), the build will fail without a clear error message. Consider adding error handling with `set -e` and a trap, or wrapping the Python command to provide a more informative error message.
Suggested change:

```diff
-    python3 -c "from huggingface_hub import snapshot_download; snapshot_download('$MODEL_TO_CACHE', cache_dir='/root/.cache/huggingface')"; \
+    python3 -c "from huggingface_hub import snapshot_download; snapshot_download('$MODEL_TO_CACHE', cache_dir='/root/.cache/huggingface')" \
+        || { echo >&2 "ERROR: Failed to pre-download model '$MODEL_TO_CACHE'. Check your network connection, HF_TOKEN, and model name."; exit 1; }; \
```
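The fail-fast pattern in the suggested change can be exercised on its own, independent of Docker or Hugging Face (a generic sketch; `step` is a hypothetical stand-in for the real `python3` snapshot_download call):

```shell
# Run a step and, on failure, emit a clear error to stderr instead of
# failing silently; `false` stands in for the real download command.
step() { false; }

if step; then
    result="downloaded"
else
    echo "ERROR: pre-download failed; check network, HF_TOKEN, and model name" >&2
    result="handled"
fi
echo "$result"
```

The `|| { ...; exit 1; }` form in the suggestion is the inline equivalent of this `if`/`else`, suited to a single-line `RUN` continuation.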
docker/vllm.Dockerfile (Outdated)
```dockerfile
RUN if [ -n "$MODEL_TO_CACHE" ]; then \
    echo "Pre-downloading model: $MODEL_TO_CACHE"; \
    export HF_TOKEN="${HF_TOKEN}"; \
```
Copilot AI (Nov 14, 2025)
The HF_TOKEN is passed as a build argument, which means it will be visible in the image history and cached layers. This is a security concern as the token could be exposed. Consider using Docker BuildKit secrets with --mount=type=secret instead to securely pass the token without leaving it in the image layers.
```shell
  --cache-from=${CACHE_FROM} \
  --cache-to=${CACHE_TO} \
```
Copilot AI (Nov 14, 2025)
The cache-to operation requires packages: write permission to push cache layers to GitHub Container Registry. The workflow only has contents: read and id-token: write permissions at the top level. While GITHUB_TOKEN is used for authentication, the workflow-level permissions may prevent successful cache push, causing the cache layer upload to fail silently or with permission errors.
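Addressing this would mean granting `packages: write` alongside the existing scopes, along these lines (a sketch of workflow-level permissions; this PR's actual workflow may scope permissions per job instead):

```yaml
# Workflow-level permissions: `packages: write` lets GITHUB_TOKEN push
# BuildKit cache layers to GHCR; the existing scopes are preserved.
permissions:
  contents: read
  id-token: write
  packages: write
```

Note that declaring any `permissions` block resets all unlisted scopes to none, so every scope the workflow needs must be listed explicitly.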
Force-pushed from c09d912 to fb8b9a1
No description provided.