[FEA] Convert Dask into an optional dependency #5934
@beckernick the code in cuML already has this capability (it is what enables cuml-cpu), all the way down to the import structure. The main challenge is making it very easy for users to install the correct versions of dask, dask-cuda, and raft-dask. There are a few ways we can tackle this problem that I can think of off the top of my head:
What are your thoughts on that, @beckernick?
rapids-bot bot pushed a commit that referenced this issue on Apr 29, 2025
This PR started because I noticed most of the functions in `import_utils.py` were effectively dead code. It spiraled out a bit from there to remove _almost all_ of `import_utils.py` in favor of:

- Using `pytest.importorskip` to handle conditional imports in tests. This is the proper `pytest` pattern to do this, and is both easier to get correct and uses fewer lines of code.
- Localized import checks (see `hdbscan.pyx`). Keeping the import check local to the module is easier to read IMO, and also helps avoid dead code accumulating in a global utils file.
- No import checks. Things like `cupy` are required dependencies and don't need to be gated at all.

The remaining functions will be removed in other PRs:

- `has_dask` will go away once `dask` is made optional (see #5934)
- `has_scipy` is removed in #6596
- `has_sklearn` will go away once `sklearn` is made a required dependency

Authors:
- Jim Crist-Harif (https://github.com/jcrist)

Approvers:
- Simon Adorf (https://github.com/csadorf)

URL: #6599
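The "localized import check" pattern named in the commit message can be sketched roughly as below. This is a minimal illustration only, not the code in `hdbscan.pyx`: the function `plot_tree`, the package name `some_optional_backend`, and its `draw` call are hypothetical placeholders.

```python
def plot_tree(edges):
    """Hypothetical function whose optional dependency is checked at the call site."""
    try:
        # Imported here rather than at module top level, so importing the
        # enclosing module never fails when the optional package is absent.
        import some_optional_backend  # hypothetical package name
    except ImportError as exc:
        raise ImportError(
            "plot_tree requires the optional package 'some_optional_backend'"
        ) from exc
    return some_optional_backend.draw(edges)  # hypothetical call


# Usage: the error only surfaces when the gated feature is actually used.
message = None
try:
    plot_tree([(0, 1)])
except ImportError as err:
    message = str(err)
```

In test code, the corresponding pattern is `pytest.importorskip("dask")`, which returns the imported module or skips the test when the import fails.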
Ofek-Haim pushed a commit to Ofek-Haim/cuml that referenced this issue on May 13, 2025
rapids-bot bot pushed a commit that referenced this issue on May 23, 2025
This is split out from #6668 and contains just the code changes. After this PR, `dask`, `distributed`, and friends won't be imported on `import cuml`; they'll only be imported after `import cuml.dask`. This speeds up import time, simplifies our codebase a bit, and should be a non-controversial step towards #5934.

Authors:
- Jim Crist-Harif (https://github.com/jcrist)

Approvers:
- Tim Head (https://github.com/betatim)
- Dante Gama Dessavre (https://github.com/dantegd)

URL: #6788
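The import behavior described in this commit message can be demonstrated with a throwaway package. This is only a sketch under assumptions: `mypkg` stands in for `cuml`, `mypkg.distributed_api` for `cuml.dask`, and `heavy_dep` for `dask`/`distributed`. All three names are hypothetical and are created in a temporary directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a tiny package on disk (all names are hypothetical stand-ins).
root = Path(tempfile.mkdtemp())
(root / "heavy_dep.py").write_text("LOADED = True\n")
pkg = root / "mypkg"
pkg.mkdir()
# The top-level package does NOT import the heavy dependency.
(pkg / "__init__.py").write_text("__version__ = '0.0'\n")
# Only the distributed submodule imports it.
(pkg / "distributed_api.py").write_text("import heavy_dep\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

importlib.import_module("mypkg")
loaded_after_base_import = "heavy_dep" in sys.modules

importlib.import_module("mypkg.distributed_api")
loaded_after_submodule_import = "heavy_dep" in sys.modules

print(loaded_after_base_import, loaded_after_submodule_import)
```

Checking `sys.modules` this way is also a simple way to write a regression test that the base import stays free of the heavy dependency.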
Dask is the primary runtime through which cuML provides multi-GPU functionality in Python. As a result, it's historically been a required cuML dependency. This is a non-issue for users who want the multi-GPU capabilities in cuML.
However, for users who only want to use cuML for single-GPU use cases, this can cause unnecessary dependency alignment challenges in their environments, because it's common practice to tightly pin Dask versions to avoid breakages in stable releases and platforms. We don't expect this behavior to change.
We should explore converting Dask into an optional dependency that is required only when someone attempts to use the Dask multi-GPU capabilities in cuML.
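One way such an optional dependency could be surfaced to users is an up-front check that reports every missing package at once. The sketch below is an assumption about how this might look, not cuML's implementation: `missing_packages` and `require_dask_stack` are hypothetical helpers, and the tuple lists the import names of the packages mentioned in this thread (dask, dask-cuda, raft-dask).

```python
import importlib.util

# Import names of the Dask-related packages named in this thread.
DASK_STACK = ("dask", "dask_cuda", "raft_dask")


def missing_packages(import_names):
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in import_names if importlib.util.find_spec(name) is None]


def require_dask_stack():
    """Hypothetical guard to call at the top of a multi-GPU-only module."""
    missing = missing_packages(DASK_STACK)
    if missing:
        raise ImportError(
            "Multi-GPU support needs additional packages: " + ", ".join(missing)
        )
```

A single-GPU user would never trigger `require_dask_stack`, so their environment would not need any of these packages installed.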