Close CUDA 11.8 migration #5340

Merged
jakirkham merged 2 commits into conda-forge:main from close_cuda118 on Jan 11, 2024

Conversation

@jakirkham (Member) commented Jan 4, 2024

Following up on the previous discussion at the Dec 13 conda-forge meeting, this closes out the CUDA 11.8 migration and adds it to the matrix.

At this point 83% of feedstocks are migrated and another 13% have an open PR to add CUDA 11.8, which accounts for 96% of feedstocks (up from 77% and 19% respectively when this was first posted). So this is sufficiently close to done.

Currently this keeps CUDA 11.2 in the matrix, though we can consider dropping it in the future as well. Have started a separate discussion in issue ( #5339 ).

Edit: Updated to include the latest migration statistics

Now that a large majority of feedstocks have been rebuilt with CUDA
11.8, go ahead and close out the CUDA 11.8 migration and add CUDA 11.8
to the matrix.
As CUDA 11.8 is now included by default, conda-smithy will try to sort
other CUDA versions relative to 11.8. However, the sort order in the CUDA
12.0 migrator didn't include CUDA 11.8 before. As a result, conda-smithy
will error when trying to sort CUDA 11.8. So add CUDA 11.8 to the sort
order in the CUDA 12.0 migrator. This shouldn't affect the CUDA 12.0
migration itself; however, it may mean that users will need to refresh
their copy of the CUDA 12.0 migrator as well.
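
For context, a rough sketch of the relevant part of the CUDA 12.0 migrator after this change (abbreviated and illustrative; the real migrations/cuda120.yaml carries per-line selectors and additional keys not shown here):

```yaml
__migrator:
  ordering:
    # conda-smithy sorts cuda_compiler_version values by their position in this
    # list, so 11.8 has to appear here once it is part of the default matrix
    cuda_compiler_version:
      - None
      # ... earlier CUDA entries elided ...
      - 11.0
      - 11.1
      - 11.2
      - 11.8   # newly added, so sorting relative to 11.8 no longer errors
      - 12.0
```
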
@conda-forge-webservices (Contributor)

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@@ -51,6 +63,7 @@ __migrator:
- 11.0 # [(linux64 or win64) and os.environ.get("CF_CUDA_ENABLED", "False") == "True"]
- 11.1 # [(linux64 or win64) and os.environ.get("CF_CUDA_ENABLED", "False") == "True"]
- 11.2 # [(linux or win64) and os.environ.get("CF_CUDA_ENABLED", "False") == "True"]
- 11.8 # [(linux or win64) and os.environ.get("CF_CUDA_ENABLED", "False") == "True"]
Member Author

When applying these changes to feedstocks with the CUDA 12.0 migrator, found that re-rendering would error because the ordering did not include 11.8. So have added it here.

However, this means feedstocks would run into this issue as well. With these changes, they would need to refresh the CUDA 12.0 migrator. Alternatively, we could restart this migrator to ensure the new YAML is added, though that feels like unnecessary extra effort for maintainers. Not sure if there are better options.

Member

> However, this means feedstocks would run into this issue as well. With these changes, they would need to refresh the CUDA 12.0 migrator.

If you update it here in the pinning repo, every feedstock that has .ci_support/migrations/cuda120.yaml will use that updated variant unless they set use_local: true. No need to restart anything.
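
For reference, a minimal sketch of the opt-out case being described, assuming use_local sits under __migrator in the feedstock's local copy of the migration file (illustrative only; keys and values are abbreviated and the timestamp is a placeholder):

```yaml
# A feedstock's local .ci_support/migrations/cuda120.yaml (abbreviated, illustrative)
__migrator:
  use_local: true            # keep this local copy on re-render instead of the pinning repo's version
  # ... remaining migrator settings unchanged ...
cuda_compiler_version:
  - 12.0
migrator_ts: 1682985063.0    # placeholder; the real file carries the migration's own timestamp
```

With use_local unset (the default), re-rendering pulls the copy shipped by conda-forge-pinning, which is why updating it in this PR is enough for most feedstocks.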

Member Author

If the feedstocks automatically pick up this change, great!

Had the impression that was not the case, but would be happy to be wrong

Member Author

Yep, you are right.

Confirmed by testing this locally with conda-smithy's --exclusive-config-file flag.

Member Author

Have since done a few re-renders on feedstocks with both the CUDA 11.8 & 12 migrators included, without issues.

For the use_local: true case, searched through the feedstocks and found a handful that were affected. Submitted PRs to them to refresh their CUDA 12 migrators so that re-rendering works. Added some background in the PRs for context.

Member Author

The use_local: true workaround was added by feedstocks working around a CUDA 12 + GCC 12 + pybind11 bug ( pybind/pybind11#4606 ). pybind11 made a fix ( pybind/pybind11@3414c56 ), which was released in pybind11 2.12.0, coincidentally the same version that included the NumPy 2 fixes we also need ( pybind/pybind11@705efcc ). This newer version of pybind11 was packaged in conda-forge earlier this year ( conda-forge/pybind11-feedstock#94 ). So for feedstocks using pybind11 directly, simply getting the latest version in conda-forge was sufficient for them to drop these workarounds.

However, the other issue we had was that feedstocks using PyTorch got pybind11 from a vendored copy inside PyTorch (not the pybind11 package). This comes up when a package #includes PyTorch's headers. Fortunately PyTorch upgraded its vendored pybind11 to 2.12.0 as part of its NumPy 2 upgrade effort ( pytorch/pytorch@6c2f36c ). This was released in PyTorch 2.4.0, which was recently packaged in conda-forge ( conda-forge/pytorch-cpu-feedstock#250 ). This allowed the remaining workarounds to be dropped.

While some of these workarounds were removed organically over time, a few were still around recently. So have worked with maintainers to clean these out. At this point, I see none of the original vendored CUDA 12 migrators remaining.

There are other cases where the CUDA 12 migrator is vendored. However, this is usually done in packages that build against the latest CUDA 12 and can continue to support older CUDA 12 releases. So it is irrelevant in this context.

Mentioning this here as the other context is in this thread or cross-referenced below, so this seemed like the natural place to include this information.

Member

I think you're commenting on the wrong PR...?

Member Author

The comment relates to both PRs, but it felt more natural here given the more relevant context in this thread.

Also wanted to reference it in the other PR, as done in comment: #6263 (comment)

Assuming that was the PR you had in mind.

@h-vetinari mentioned this pull request Jan 6, 2024
@jakirkham marked this pull request as ready for review January 6, 2024 05:12
@jakirkham requested a review from a team as a code owner January 6, 2024 05:12
@jakirkham (Member Author)

@conda-forge/core any more thoughts on this?

@jakirkham merged commit cbccb41 into conda-forge:main Jan 11, 2024
@jakirkham deleted the close_cuda118 branch January 11, 2024 21:27
@jakirkham (Member Author)

Thanks all! 🙏
