@Harry-Chen Harry-Chen commented Nov 28, 2025

Purpose

vllm's nightly build needs an overhaul for the following reasons:

  • Links to aarch64 wheels are totally broken.
  • Binary wheels are duplicated up to 6 (six) times, because each wheel is uploaded:
    • to three directories: /<commit>/, /nightly/, and /<version>/
    • under two names: its own version, and a hardcoded 1.0.0.dev wheel that exists solely so that a precompiled wheel can be found
  • Only one of the variants (cu129, cu130, cpu) has an index; the others are ignored entirely.

In this PR, I have renovated the whole process. This includes:

  • Rewrite a Python script (generate-nightly-index.py) that generates the indices and an extra metadata.json. It detects the variants automatically and creates a sub-index for each of them. Please read the comments in the code for a detailed explanation.
  • Rewrite upload-wheels.sh so that each binary wheel is uploaded only once, after each successful wheel build. It calls generate-nightly-index.py to generate the index for all wheels currently present in the directory, and copies the indices to all necessary locations (e.g. /<commit>/, /nightly/ if it is on the main branch, and /<version>/ if it is not a dev version).
    • breaking change: no more hardcoded 1.0.0.dev wheels are uploaded to S3.
    • nit: the incorrect manylinux1 and manylinux2014 tags are replaced with manylinux_2_31, which matches the glibc version of vllm's build image (ubuntu-20.04)
  • setup.py is changed accordingly: it downloads metadata.json to find the actual wheel path instead of relying on the hardcoded 1.0.0.dev wheel.
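
To illustrate the flow described above, here is a minimal sketch of variant detection, per-variant index rendering, and the metadata lookup that a consumer such as setup.py could perform. It is not the code in generate-nightly-index.py. It assumes that the variant is the last dot-separated segment of the wheel's local version label, and the function names, the `default` label, and the metadata layout are all hypothetical.

```python
import html
import json
import re
from collections import defaultdict
from urllib.parse import quote

# Assumption: the variant is the last dot-separated segment of the local
# version label, e.g. "0.11.2.dev5+gabc1234.cu130" -> "cu130".
VARIANT_RE = re.compile(r"(cu\d+|cpu)")
DEFAULT_VARIANT = "default"  # hypothetical label for wheels with no suffix


def parse_wheel(filename):
    """Split a wheel filename into name, version, platform tag and variant."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) not in (5, 6):  # 6 when an optional build tag is present
        raise ValueError(f"unexpected wheel filename: {filename}")
    local = parts[1].partition("+")[2]  # text after "+", or "" if absent
    last = local.split(".")[-1]
    variant = last if VARIANT_RE.fullmatch(last) else DEFAULT_VARIANT
    return {
        "filename": filename,
        "name": parts[0],
        "version": parts[1],
        "platform_tag": parts[-1],
        "variant": variant,
    }


def group_by_variant(filenames):
    """Map each detected variant to the list of its parsed wheels."""
    groups = defaultdict(list)
    for filename in sorted(filenames):
        wheel = parse_wheel(filename)
        groups[wheel["variant"]].append(wheel)
    return dict(groups)


def render_index(wheels, base_url):
    """Render a simple-repository style HTML page linking to each wheel."""
    links = [
        f'<a href="{base_url}/{quote(w["filename"])}">'
        f'{html.escape(w["filename"])}</a><br>'
        for w in wheels
    ]
    return "<!DOCTYPE html>\n<html><body>\n" + "\n".join(links) + "\n</body></html>\n"


def build_metadata(filenames):
    """Return a JSON-serializable list describing every wheel."""
    return [parse_wheel(f) for f in sorted(filenames)]


def find_wheel(metadata, variant, arch):
    """Consumer side: pick the wheel filename for a variant and architecture."""
    for wheel in metadata:
        if wheel["variant"] == variant and wheel["platform_tag"].endswith(arch):
            return wheel["filename"]
    return None
```

Because the links point at a single uploaded copy of each wheel, the same index can be copied to several directories without duplicating the binaries. Note that `quote` percent-encodes the `+` in the local version so that the link still resolves.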

More nits:

  • The CUDA 12.8 build is removed from CI, per the discussion with @youkaichao.
  • VLLM_MAIN_CUDA_VERSION is bumped to 12.9 to avoid confusion.

Test Plan

These are all CI changes, so they are tested by running CI.

Test Result

  • release-pipeline has passed.
  • test-pipeline will probably fail on python_only_compile.sh, which tests VLLM_USE_PRECOMPILED with build name nightly, because no new indices or metadata will be uploaded to /nightly/ until this PR is merged to main.


@mergify mergify bot added the ci/build label Nov 28, 2025
@Harry-Chen Harry-Chen force-pushed the nightly-wheel-reno branch 2 times, most recently from d8580d1 to 7116066 Compare November 29, 2025 02:09
@Harry-Chen Harry-Chen marked this pull request as ready for review November 29, 2025 05:34