Forward-merge branch-25.02 into branch-25.04 #17825
Closed
rapids-bot[bot] wants to merge 38 commits into branch-25.04
…fea/bump-polars-version
This PR upgrades the upper bound pinnings for `pyarrow` in `cudf`. Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Bradley Dice (https://github.com/bdice) URL: #17794
FAILURE - Unable to forward-merge due to an error; manual merge is necessary. Do not use the … IMPORTANT: When merging this PR, do not use the auto-merger (i.e. the …
Fixes: #17775 This PR fixes a race condition that arises when `disable_module_accelerator` is used in a multi-threaded setting. Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) URL: #17811
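The commit message above does not say how the race was fixed. As a generic illustration only, one common source of such races is a disable flag kept in process-wide state, where one thread entering the context manager changes behavior for every other thread. The sketch below assumes per-thread state is an acceptable remedy; `disable_accelerator`, `accelerator_enabled`, and `observe_from_other_thread` are hypothetical names and are not the cudf.pandas implementation or the change made in #17811.

```python
import threading
from contextlib import contextmanager

# Hypothetical stand-in for per-thread accelerator state; not cudf.pandas code.
_state = threading.local()


def accelerator_enabled() -> bool:
    # Each thread sees its own counter; a missing counter means "enabled".
    return getattr(_state, "disabled_depth", 0) == 0


@contextmanager
def disable_accelerator():
    # Track nesting depth per thread so nested use restores correctly.
    _state.disabled_depth = getattr(_state, "disabled_depth", 0) + 1
    try:
        yield
    finally:
        _state.disabled_depth -= 1


def observe_from_other_thread() -> bool:
    # Run accelerator_enabled() in a fresh thread and return what it saw.
    seen = []
    worker = threading.Thread(target=lambda: seen.append(accelerator_enabled()))
    worker.start()
    worker.join()
    return seen[0]
```

With this layout, a thread that enters `disable_accelerator()` sees the accelerator as disabled, while a concurrently running thread still sees it as enabled.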
Contributes to #7795 This PR addresses most of the relaxed constexpr usage in cuIO. Authors: - Yunsong Wang (https://github.com/PointKernel) Approvers: - Basit Ayantunde (https://github.com/lamarrr) - Vukasin Milovanovic (https://github.com/vuule) URL: #17746
Contributes to rapidsai/build-planning#136 For nightly builds, some `wheel-build-{project}` jobs currently wait to start until some other `wheel-publish-{dependency}` jobs complete. This is unnecessary... `wheel-build-{dependency}` jobs will upload packages to S3, which is where `wheel-build-{project}` jobs will download them from. This proposes changing that such that all nightly `wheel-build-*` jobs depend only on other `wheel-build-*` jobs. This should decrease the end-to-end time it takes for all wheels to be built and published on nightly / branch builds. Also updates `pre-commit` config to the latest `rapids-dependency-file-generator` version. Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Kyle Edwards (https://github.com/KyleFromNVIDIA) URL: #17792
## Description
This PR fixes `pre-commit.ci` failures.

## Checklist
- [x] I am familiar with the [Contributing Guidelines](https://github.com/rapidsai/cudf/blob/HEAD/CONTRIBUTING.md).
- [x] New or existing tests cover these changes.
- [x] The documentation is up to date with these changes.

Co-authored-by: Vyas Ramasubramani <vyasr@nvidia.com>
…e string cols with chunked parquet reader (#17702) Closes #17692. This PR enables computing the `str_offset` required to correctly compute the offsets columns for nested large strings columns with the chunked Parquet reader when `chunk_read_limit` is small, resulting in multiple output table chunks per subpass. Authors: - Muhammad Haseeb (https://github.com/mhaseeb123) Approvers: - Yunsong Wang (https://github.com/PointKernel) - Ed Seidl (https://github.com/etseidl) - Vukasin Milovanovic (https://github.com/vuule) URL: #17702
Bump polars version to 1.20
A small new string feature. Authors: - Lawrence Mitchell (https://github.com/wence-) Approvers: - Matthew Murray (https://github.com/Matt711) - Vyas Ramasubramani (https://github.com/vyasr) URL: #17755
This PR applies `ruff` (`check` and `format`) everywhere, including notebooks and utility scripts. This allows us to drop our use of `nbqa`, since `ruff` natively supports notebooks. (xref: #17819, #17805) I manually updated a few notebooks that were using old NumPy syntax for generating random values. Closes #17461. I also updated the `ruff` version to 0.9.3. Authors: - Bradley Dice (https://github.com/bdice) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Matthew Murray (https://github.com/Matt711) URL: #17820
Before embarking on more rolling window performance optimizations and code changes, let's introduce some new benchmarks:
- demonstrating bad algorithmic behavior of large window rolling aggregations;
- of the range-based rolling interface.

Authors: - Lawrence Mitchell (https://github.com/wence-) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Bradley Dice (https://github.com/bdice) - Vukasin Milovanovic (https://github.com/vuule) URL: #17787
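The commit message does not spell out what the bad algorithmic behavior is. As a purely illustrative sketch, assuming it refers to work that grows with the window size, the two hypothetical functions below contrast a rolling sum that re-sums every window from scratch with one that uses a single prefix-sum pass. Neither is cudf code; they only show why a large window can dominate the cost.

```python
from itertools import accumulate


def rolling_sum_naive(values, window):
    # Re-sums each trailing window from scratch: O(n * window) work.
    return [
        sum(values[max(0, i - window + 1) : i + 1])
        for i in range(len(values))
    ]


def rolling_sum_prefix(values, window):
    # One prefix-sum pass, then O(1) per output: O(n) for any window size.
    prefix = [0, *accumulate(values)]
    return [
        prefix[i + 1] - prefix[max(0, i - window + 1)]
        for i in range(len(values))
    ]
```

Both functions return the same results; only the amount of work per output element differs, which is the kind of difference a large-window benchmark would expose.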
There is a timeout failure in nightly tests: https://github.com/rapidsai/cudf/actions/runs/12983287834/job/36204344253 It looks like CI runs can get very slow at times, hence bumping up the timeout. This test mainly guards against a hang, so a 20s timeout should be fine too. Authors: - GALI PREM SAGAR (https://github.com/galipremsagar) Approvers: - Matthew Murray (https://github.com/Matt711) URL: #17829
…pat (#17822) closes #17786 Authors: - Matthew Roeschke (https://github.com/mroeschke) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - GALI PREM SAGAR (https://github.com/galipremsagar) - Lawrence Mitchell (https://github.com/wence-) URL: #17822
The previous [strings PR](#17286) significantly reduced Parquet reader string performance for very long strings (lengths of ~1024 and longer). This PR fixes the performance issue by instituting a max memcpy length of 8 bytes at once (this length yielded the best performance). Also, up to all of the threads in the block can now work on the same string, rather than only the threads in a single warp.

**PERFORMANCE:**
- Short strings: Unchanged
- Length 1024: 25% faster
- Longer lengths (up to 64k): Up to 90% faster, same as before the strings PR

Authors: - Paul Mattione (https://github.com/pmattione-nvidia) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Vukasin Milovanovic (https://github.com/vuule) - Muhammad Haseeb (https://github.com/mhaseeb123) URL: #17773
…ader (#17708) Closes #17689 This PR resolves a bug in the multi-batch JSON reader, wherein the reader threw an error when the column schemas of any two partial tables from different batches did not match. We now enforce the column ordering of the first partial table (i.e. the table returned by the first batch) in all succeeding batches. The added test passes three strings as three separate batches to the reader by setting the batch size to that of the first string. Authors: - Shruti Shivakumar (https://github.com/shrshi) - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Vukasin Milovanovic (https://github.com/vuule) - Karthikeyan (https://github.com/karthikeyann) - Paul Mattione (https://github.com/pmattione-nvidia) URL: #17708
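The idea of enforcing the first batch's column order on later batches can be sketched in plain Python. This is a minimal illustration, assuming each partial table is a dict mapping column names to lists of values and that all batches contain the same set of columns. `align_to_first_batch` and `concat_batches` are hypothetical helpers, not part of the libcudf JSON reader.

```python
def align_to_first_batch(batches):
    """Reorder each batch's columns to match the first batch's column order."""
    if not batches:
        return []
    # The first partial table defines the column order for all later ones.
    order = list(batches[0].keys())
    aligned = []
    for batch in batches:
        # Assumption for this sketch: every batch has the same column names.
        if set(batch) != set(order):
            raise ValueError("batches have different column sets")
        aligned.append({name: batch[name] for name in order})
    return aligned


def concat_batches(batches):
    """Concatenate partial tables row-wise after aligning their columns."""
    aligned = align_to_first_batch(batches)
    if not aligned:
        return {}
    return {
        name: [value for batch in aligned for value in batch[name]]
        for name in aligned[0]
    }
```

For example, concatenating `{"a": [1], "b": ["x"]}` with `{"b": ["y"], "a": [2]}` yields columns in the order `a`, `b`, matching the first batch.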
Closing so that the bot can open a new PR for the next set of changes now that #17828 is merged with most of the commits.
Forward-merge triggered by push to branch-25.02 that creates a PR to keep branch-25.04 up-to-date. If this PR is unable to be immediately merged due to conflicts, it will remain open for the team to manually merge. See forward-merger docs for more info.