Close CUDA 11.8 migration #5340
Conversation
Now that a large majority of feedstocks have been rebuilt with CUDA 11.8, go ahead and close out the CUDA 11.8 migration and add CUDA 11.8 to the matrix.
As CUDA 11.8 is now included by default, conda-smithy will try to sort other CUDA versions relative to 11.8. However, the sort order in the CUDA 12.0 migrator did not previously include CUDA 11.8, so conda-smithy would error when trying to sort it. To fix this, CUDA 11.8 is added to the sort order in the CUDA 12.0 migrator. This shouldn't affect the CUDA 12.0 migration itself; however, it may mean that users will need to refresh their copy of the CUDA 12.0 migrator as well.
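For context, here is a minimal sketch of where that change sits in the CUDA 12.0 migrator. This is abridged: the `10.2` and `12.0` entries and the surrounding keys are assumptions about the real file, which also carries per-line selectors as shown in the diff below.

```yaml
# recipe/migrations/cuda120.yaml (abridged sketch)
__migrator:
  kind: version
  ordering:
    cuda_compiler_version:
      - None
      - 10.2
      - 11.0
      - 11.1
      - 11.2
      - 11.8  # newly added so conda-smithy can sort the new 11.8 default
      - 12.0
```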
Hi! This is the friendly automated conda-forge-linting service. I just wanted to let you know that I linted all conda-recipes in your PR.
```diff
@@ -51,6 +63,7 @@ __migrator:
       - 11.0 # [(linux64 or win64) and os.environ.get("CF_CUDA_ENABLED", "False") == "True"]
       - 11.1 # [(linux64 or win64) and os.environ.get("CF_CUDA_ENABLED", "False") == "True"]
       - 11.2 # [(linux or win64) and os.environ.get("CF_CUDA_ENABLED", "False") == "True"]
+      - 11.8 # [(linux or win64) and os.environ.get("CF_CUDA_ENABLED", "False") == "True"]
```
When applying these changes to feedstocks with the CUDA 12.0 migrator, found that re-rendering would error due to the ordering not including 11.8. So have added it here
However this means feedstocks would run into this issue as well. With these changes they would need to refresh the CUDA 12.0 migrator. Alternatively we could restart this migrator to ensure the new YAML is added. Though that feels like unnecessary extra effort for maintainers. Not sure if there are better options
> However this means feedstocks would run into this issue as well. With these changes they would need to refresh the CUDA 12.0 migrator.

If you update it here in the pinning repo, every feedstock that has `.ci_support/migrations/cuda120.yaml` will use that updated variant unless they set `use_local: true`. No need to restart anything.
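For reference, a minimal sketch of that opt-out in a feedstock's vendored copy of the migrator, with all other keys elided; only the `use_local` flag matters for this discussion.

```yaml
# .ci_support/migrations/cuda120.yaml (feedstock-local copy, abridged)
__migrator:
  kind: version
  use_local: true  # keep this vendored copy instead of the one from conda-forge-pinning
```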
If the feedstocks automatically pick up this change, great!
Had the impression that was not the case, but would be happy to be wrong
Yep, you are right

Confirmed by testing this locally with conda-smithy's `--exclusive-config-file` flag
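For anyone wanting to reproduce that check, a rough sketch of the local test; the paths are illustrative and assume a feedstock checkout sitting next to a clone of this pinning feedstock.

```sh
# From inside a feedstock checkout: re-render against a local pinning file
# instead of the released conda-forge-pinning package
conda smithy rerender \
  --exclusive-config-file ../conda-forge-pinning-feedstock/recipe/conda_build_config.yaml
```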
Have since done a few re-renders on feedstocks with both the CUDA 11.8 & 12 migrators included, without issues

For the `use_local: true` case, searched through the feedstocks and found a handful that were affected. Submitted PRs to them to refresh their CUDA 12 migrators so that re-rendering works. Added some background in the PRs for context
The `use_local: true` workaround was originally added by feedstocks working around a CUDA 12 + GCC 12 + pybind11 bug ( pybind/pybind11#4606 ). pybind11 has since made a fix ( pybind/pybind11@3414c56 ), which was released in pybind11 version 2.12.0, coincidentally the same version that included the NumPy 2 fixes we also need ( pybind/pybind11@705efcc ). This newer version of pybind11 was packaged in conda-forge earlier this year ( conda-forge/pybind11-feedstock#94 ). So for feedstocks using pybind11 directly, simply getting the latest version from conda-forge was sufficient for them to drop these workarounds
However, the other issue was that feedstocks using PyTorch get pybind11 from a vendored copy inside PyTorch (not the conda package); this comes up whenever a package `#include`s PyTorch's headers. Fortunately PyTorch upgraded its vendored pybind11 to 2.12.0 as part of its NumPy 2 upgrade effort ( pytorch/pytorch@6c2f36c ). This was released in PyTorch version 2.4.0, which was recently packaged in conda-forge ( conda-forge/pytorch-cpu-feedstock#250 ), allowing the remaining workarounds to be dropped
While some of these workarounds were removed organically over time, a few were still around until recently, so have worked with maintainers to clean those out. At this point, none of the original vendored CUDA 12 migrators remain

There are other cases where the CUDA 12 migrator is vendored. However, this is usually done by packages that build against the latest CUDA 12 while continuing to support older CUDA 12.x releases, so it is irrelevant in this context

Mentioning this here as the other context is in this thread or cross-referenced below, so this seemed like the natural place to include this information
I think you're commenting on the wrong PR...?
The comment relates to both PRs, but it felt more natural here given the more relevant context
Also wanted to reference it in the other PR, as done in comment: #6263 (comment)
Assuming that was the PR you had in mind
@conda-forge/core any more thoughts on this?
Thanks all! 🙏
Following up on the previous discussion at the Dec 13 conda-forge meeting, this closes out the CUDA 11.8 migration and adds CUDA 11.8 to the matrix
At this point ~~77%~~ 83% of feedstocks are migrated and another ~~19%~~ 13% have an open PR to add CUDA 11.8, which accounts for 96% of feedstocks. So this is sufficiently close to done

Currently this keeps CUDA 11.2 in the matrix (see the sketch below), though we can consider dropping it in the future as well. Have started a separate discussion in issue ( #5339 )
Edit: Updated to include the latest migration statistics
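For concreteness, a rough sketch of the resulting default matrix entry in the global pinning. This is illustrative only: the actual conda_build_config.yaml in conda-forge-pinning zips cuda_compiler_version together with other keys and carries per-platform selectors, all omitted here.

```yaml
# conda_build_config.yaml (global pinning, illustrative sketch)
cuda_compiler_version:
  - None   # CPU-only builds
  - 11.2   # kept for now; dropping it is discussed in #5339
  - 11.8   # newly added default from this PR
```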