Make it possible to manually trigger nightly tests based on a branch build #377

Open
jameslamb opened this issue Jun 4, 2025 · 5 comments
Assignees
Labels
feature request New feature or request

Comments

@jameslamb
Member

jameslamb commented Jun 4, 2025

Description

On branch-25.08 across RAPIDS, as of this writing, if you try to do the following:

  1. merge a PR
  2. manually trigger the test.yaml workflow, setting RAPIDS_SHA to the commit from that just-merged PR

The wrong CI artifacts will be downloaded, which means the tests will not actually be testing the intended state of the code, and might even outright fail.

It should be possible to do this and have the tests run against the artifacts built from that commit.

Benefits of this work

Makes it possible to get a successful run of the nightly tests after merging relevant fixes, without having to wait for the next scheduled run.

Acceptance Criteria

  • it is possible to manually trigger the nightly test workflow (test.yaml) and have it use artifacts produced from a recent branch build

Approach

Let's start with what "wrong CI artifacts will be downloaded" means.

When RAPIDS CI workloads run, they need to find the relevant just-built-in-CI conda packages or wheels to download.
That's done via rapids-github-run-id (rapidsai/gha-tools/tools/rapids-github-run-id).

It's important that this resolve to exactly 1 run ID, to uniquely identify a single artifact to download.

Branch and nightly builds can be identical on most of the characteristics that the GitHub CLI allows you to filter workflow runs by:

  • repo
  • commit SHA
  • branch
  • workflow file (by convention, build.yaml)

So today, rapids-github-run-id has logic like "build.yaml runs triggered by a push must be branch builds, build.yaml runs triggered by a workflow_dispatch must be nightly builds".

That is fine under normal operation, by convention.

But in the scenario described in this issue, there is no workflow_dispatch-triggered build.yaml run for that just-created commit SHA, which leads to rapids-github-run-id not finding any runs!

Until rapidsai/gha-tools#192 is fixed, that means that the latest artifacts produced from any successful build.yaml run on that branch will be used. After that is fixed, this will result in a loud "could not find a run ID" type of error.
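
To make the failure mode concrete, here is a minimal Python sketch of the event-based selection described above. It is only a model: the real tool queries GitHub, and `WorkflowRun`, `find_build_runs`, the run IDs, and the SHAs are all hypothetical names and values invented for illustration.

```python
from dataclasses import dataclass


# Hypothetical, simplified model of a workflow run. The field names are
# illustrative and are not the tool's actual data model.
@dataclass(frozen=True)
class WorkflowRun:
    run_id: int
    workflow: str  # e.g. "build.yaml"
    branch: str
    head_sha: str
    event: str  # "push" or "workflow_dispatch"


# Convention described above: the trigger event stands in for the build type.
EVENT_FOR_BUILD_TYPE = {"branch": "push", "nightly": "workflow_dispatch"}


def find_build_runs(runs, build_type, branch, sha):
    """Return IDs of build.yaml runs matching branch, commit, and build type."""
    event = EVENT_FOR_BUILD_TYPE[build_type]
    return [
        r.run_id
        for r in runs
        if r.workflow == "build.yaml"
        and r.branch == branch
        and r.head_sha == sha
        and r.event == event
    ]


runs = [
    # Last night's nightly build, at an older commit (made-up values).
    WorkflowRun(101, "build.yaml", "branch-25.08", "aaa111", "workflow_dispatch"),
    # Branch build created by merging the fix (made-up values).
    WorkflowRun(102, "build.yaml", "branch-25.08", "bbb222", "push"),
]

# A nightly-style lookup at the just-merged commit finds nothing, because the
# only build at that commit was triggered by a push.
nightly_matches = find_build_runs(runs, "nightly", "branch-25.08", "bbb222")
branch_matches = find_build_runs(runs, "branch", "branch-25.08", "bbb222")
print(nightly_matches, branch_matches)
```

In this model the artifacts that should be tested exist (run 102), but a lookup keyed on the workflow_dispatch event cannot see them.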

@ajschmidt8 and I talked about the fragility of relying on the event type for distinguishing between build types here: rapidsai/gha-tools#164 (comment)

I think the most reliable way to support this pattern is to do as @ajschmidt8 suggested there... make it possible to circumvent all of this by directly providing a run ID as an input when triggering test.yaml workflows.

This could look something like this:

  • make rapids-github-run-id look for an env variable like RAPIDS_GITHUB_RUN_ID, and just return that if it's present and non-empty
  • add an input to all the test.yaml workflows allowing users to specify that on calls
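
The first bullet could be sketched roughly as follows, in Python for illustration only. `RAPIDS_GITHUB_RUN_ID` is the variable name proposed above; `resolve_run_id` and its `lookup` callback are hypothetical stand-ins for the existing search logic, not real functions in gha-tools.

```python
import os


def resolve_run_id(lookup, environ=None):
    """Return an explicitly provided run ID, else fall back to searching.

    `lookup` is a hypothetical zero-argument callable standing in for the
    existing "search for a matching build.yaml run" behavior.
    """
    environ = os.environ if environ is None else environ
    override = environ.get("RAPIDS_GITHUB_RUN_ID", "")
    if override:
        # Present and non-empty: skip the event-based search entirely.
        return override
    return lookup()


# Usage with made-up run IDs:
print(resolve_run_id(lambda: "999", {"RAPIDS_GITHUB_RUN_ID": "12345"}))
print(resolve_run_id(lambda: "999", {"RAPIDS_GITHUB_RUN_ID": ""}))
```

Treating an empty value the same as an unset one matters here, since a workflow input that the user leaves blank would presumably arrive as an empty string.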

Notes

N/A

@jameslamb jameslamb added the feature request New feature or request label Jun 4, 2025
@jameslamb jameslamb self-assigned this Jun 10, 2025
@jameslamb
Member Author

Some other design ideas that came out of an offline discussion with @bdice:

  • nightly test runs could install from rapidsai-nightly / nightly PyPI instead of downloading CI artifacts
  • rapids-github-run-id could be modified to not differentiate between branch and nightly builds, and just take like "the latest build.yaml run on branch {branch} at commit {sha}"
    • (would need to think about the implications of that though)
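
As a rough illustration of the second idea, here is a Python sketch that ignores the trigger event and takes the most recent build.yaml run for a branch and commit. The dictionary keys, `latest_build_run`, and the sample data are assumptions made for this example, not the tool's real interface.

```python
def latest_build_run(runs, branch, sha):
    """Pick the newest build.yaml run on `branch` at `sha`, ignoring the event."""
    candidates = [
        r
        for r in runs
        if r["workflow"] == "build.yaml"
        and r["branch"] == branch
        and r["head_sha"] == sha
    ]
    if not candidates:
        raise LookupError(f"no build.yaml run on {branch} at {sha}")
    # Assumes timestamps are ISO 8601 strings in UTC with a uniform format,
    # so that string comparison matches chronological order.
    return max(candidates, key=lambda r: r["created_at"])["run_id"]


# Made-up data: a branch build and a later nightly build at the same commit.
runs = [
    {"run_id": 201, "workflow": "build.yaml", "branch": "branch-25.08",
     "head_sha": "ccc333", "event": "push",
     "created_at": "2025-06-10T15:00:00Z"},
    {"run_id": 202, "workflow": "build.yaml", "branch": "branch-25.08",
     "head_sha": "ccc333", "event": "workflow_dispatch",
     "created_at": "2025-06-11T05:00:00Z"},
]

chosen = latest_build_run(runs, "branch-25.08", "ccc333")
print(chosen)
```

One implication this makes visible: when both a branch build and a nightly build exist at the same commit, whichever ran last wins, so the caller no longer controls which build type is tested.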

@bdice
Contributor

bdice commented Jun 10, 2025

> nightly test runs could install from rapidsai-nightly / nightly PyPI instead of downloading CI artifacts

This might be fragile, as it introduces a weird race condition with PRs that might have been merged around the same time as nightly builds. It could be nice to have some way to use rapidsai-nightly / nightly PyPI in a manually triggered test job, but it's probably not a hard requirement.

> rapids-github-run-id could be modified to not differentiate between branch and nightly builds, and just take like "the latest build.yaml run on branch {branch} at commit {sha}"

I really just want this for branch builds, not nightlies.

The primary need I have is a way to fix check-nightly-ci. Once we admin-merge a PR that is supposed to fix CI, we need a way to manually run test.yaml against the builds made from merging that PR, to unblock check-nightly-ci.

@vyasr
Contributor

vyasr commented Jun 10, 2025

I'm confused. What would resolving this issue allow that we cannot already do by specifying a build_type of branch in test.yaml, i.e. why isn't rapidsai/build-planning#147 enough? Is the issue that the switch to GitHub artifacts from downloads.rapids.ai broke this functionality?

@bdice
Contributor

bdice commented Jun 10, 2025

Yes, artifacts broke the previous functionality. They tie artifacts to a run ID, rather than a commit.

@vyasr
Contributor

vyasr commented Jun 10, 2025

OK, got it. So as a result we've gone back to having the same problem that we used to. That makes sense.
