Refactor artifact fetch and install scripts. #772
Conversation
```python
base_artifacts = [
    "core-runtime_run",
    "core-runtime_lib",
    "sysdeps_lib",
    "base_lib",
    "amd-llvm_run",
    "amd-llvm_lib",
    "core-hip_lib",
    "core-hip_dev",
    "rocprofiler-sdk_lib",
    "host-suite-sparse_lib",
]
```
Ah... this is a workaround for us not having the artifact dependency graph modeled.
When using these scripts locally, I typically want "everything", maybe excluding certain components (e.g. tests, docs) and maybe excluding certain names (e.g. miopen).
When running on CI for tests of an individual project, we want that project's test artifacts and whatever they depend on.
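For illustration only, here is one possible shape of that "everything, minus exclusions" selection. All function and parameter names below are hypothetical and do not exist in TheRock today:

```python
from fnmatch import fnmatch


def filter_artifacts(all_artifacts, exclude_components=(), exclude_name_patterns=()):
    """Keep every artifact unless its component suffix (e.g. "test", "doc")
    or its base name (e.g. "miopen*") matches an exclusion."""
    selected = []
    for artifact in all_artifacts:
        # Split e.g. "core-hip_dev" into ("core-hip", "dev").
        name, _, component = artifact.rpartition("_")
        if component in exclude_components:
            continue
        if any(fnmatch(name, pattern) for pattern in exclude_name_patterns):
            continue
        selected.append(artifact)
    return selected
```

For the local case that would be called with a full artifact list and exclusions like `exclude_components={"test", "doc"}` or `exclude_name_patterns={"miopen*"}`; the CI case would instead start from one project's test artifacts plus their (not yet modeled) dependencies.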
Splitting this change off from #772.

* Replaced tuple with `ArtifactDownloadRequest` dataclass. We can more easily attach metadata to this, use it in unit tests, and pass it between more functions without needing to remember what the tuple represents (a rough sketch follows below).
* Renamed some functions and variables to reflect what they _do_, not how they are implemented. This makes it easier to change implementations later without also needing to change the function names.
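Roughly, the idea is a dataclass along these lines; the exact field names in the PR may differ:

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class ArtifactDownloadRequest:
    """Replaces the old anonymous tuple with named, self-documenting fields."""

    artifact_name: str  # e.g. "core-hip_dev"
    download_url: str   # where the archive is fetched from
    output_path: Path   # where the archive lands locally
```

Named fields also keep unit-test assertions readable, e.g. `assert request.artifact_name == "core-hip_dev"` instead of indexing into a tuple.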
Splitting this change off from #772. This will make it easier to work with artifacts that are on S3 and other non-local locations, where we'll have a URL and a filename, not a Path. We can also see about using [`cloudpathlib`](https://cloudpathlib.drivendata.org/stable/), "A Python library with classes that mimic `pathlib.Path`'s interface for URIs from different cloud storage services".
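As a quick illustration of why that helps (the bucket and key below are placeholders), cloudpathlib lets the same code handle a local path or an S3 URI:

```python
from cloudpathlib import AnyPath

# AnyPath dispatches to pathlib.Path for local paths and to a CloudPath for URIs.
artifact = AnyPath("s3://example-bucket/artifacts/core-hip_dev.tar.xz")
if artifact.exists():
    data = artifact.read_bytes()
```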
Splitting off more changes from #772. I'm hoping to reuse this script for local rocm Python wheel building, particularly on Windows (#827), so I'll have a few more refactors coming soon. We should also be able to use the script on Linux after some workflow refactors here: https://github.com/ROCm/TheRock/blob/fc8767523fc639a17ba2fd9a5b5137eca334617f/.github/workflows/release_portable_linux_packages.yml#L136-L155 See also discussion on #779.
I have more ideas for how to refactor here (see #772), but this is a reasonably non-invasive way to download all artifacts from a workflow run without affecting existing workflows that only want to download a subset.

A few things I'm grappling with on the design side, which might warrant a new script that could replace this one (a rough skeleton follows this list):

* List available workflow runs, release builds, etc. (from S3 directories, an index page, or the GitHub API?)
* Enumerate files available for a given run (artifact archives, tarballs, Python wheels)
* Filter files (based on target, or category like "run" vs "test" vs "doc"), optionally tracking dependencies
* Download files (staging in temp dirs? caching?)
* Extract files (as needed), optionally deleting the originals if not cached
* Install files (into a venv for wheels)
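To make that concrete, here is one hypothetical skeleton of how such a replacement script might be organized; every name below is illustrative and none of these functions exist in TheRock today:

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class RemoteFile:
    name: str      # e.g. "core-hip_dev_generic.tar.xz" or a Python wheel
    url: str       # S3 object URL or GitHub download URL
    category: str  # "run", "test", "doc", ...


def enumerate_files(run_id: str) -> list[RemoteFile]:
    """List artifact archives, tarballs, and wheels for one workflow run."""
    raise NotImplementedError  # S3 directory listing, index page, or GitHub API


def filter_files(files: list[RemoteFile], exclude: set[str]) -> list[RemoteFile]:
    """Drop files whose category is excluded; dependency tracking could go here."""
    return [f for f in files if f.category not in exclude]


def download_and_extract(files: list[RemoteFile], dest: Path) -> None:
    """Download each file (optionally via a cache) and extract it under dest."""
    raise NotImplementedError
```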
I've landed a few parts of this in smaller commits already. Closing stale PR.
My main motivation here is making it easier to run either `fetch_artifacts.py` or `install_rocm_from_artifacts.py` on developer machines to pull down larger, more complete sets of artifacts. The existing code had an explicit list of artifact names and component types that would be downloaded, with special case handling for "base only" and "no tests". Now, artifact selection goes through a new `ArtifactName` utility class (sketched below).

While I was modifying the files, I also made a few other changes:

* Added `--verbose` and `--dry-run` options
* Renamed `--build-dir` to `--output-dir`

This change couldn't easily be split into multiple commits due to how interconnected these scripts are and how some of the documentation is repetitive. I've at least added a few new unit tests so future changes can be made with more confidence. I could split this into smaller PRs on request, though many changes overlap.
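As a sketch of what such an artifact-name utility class might look like (the real `ArtifactName` in the PR may differ in fields and methods):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArtifactName:
    """Structured form of strings like "core-hip_dev" (base name + component)."""

    name: str       # e.g. "core-hip"
    component: str  # e.g. "lib", "run", "dev", "test", "doc"

    @classmethod
    def parse(cls, text: str) -> "ArtifactName":
        name, _, component = text.rpartition("_")
        return cls(name, component)

    def __str__(self) -> str:
        return f"{self.name}_{self.component}"
```

With something like this, `ArtifactName.parse("core-hip_dev")` yields `ArtifactName(name="core-hip", component="dev")`, so filtering by component type or by base name no longer needs hard-coded lists or special cases.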