Forward-merge branch-25.02 into branch-25.04 #17825

Closed
rapids-bot[bot] wants to merge 38 commits into branch-25.04 from branch-25.02

@rapids-bot
Contributor

@rapids-bot rapids-bot bot commented Jan 25, 2025

Forward-merge triggered by push to branch-25.02 that creates a PR to keep branch-25.04 up-to-date. If this PR is unable to be immediately merged due to conflicts, it will remain open for the team to manually merge. See forward-merger docs for more info.

@rapids-bot rapids-bot bot requested review from a team as code owners January 25, 2025 17:11
@rapids-bot rapids-bot bot requested a review from bdice January 25, 2025 17:11
@rapids-bot
Contributor Author

rapids-bot bot commented Jan 25, 2025

FAILURE - Unable to forward-merge due to an error; manual merge is necessary. Do not use the Resolve conflicts option in this PR. Instead, follow these instructions: https://docs.rapids.ai/maintainers/forward-merger/

IMPORTANT: When merging this PR, do not use the auto-merger (i.e. the /merge comment). Instead, an admin must manually merge by changing the merging strategy to Create a Merge Commit. Otherwise, history will be lost and the branches become incompatible.

@rapids-bot rapids-bot bot requested a review from Matt711 January 25, 2025 17:11
@github-actions github-actions bot added Python Affects Python cuDF API. pylibcudf Issues specific to the pylibcudf package labels Jan 25, 2025
Fixes: #17775 

This PR fixes a race condition that arises when `disable_module_accelerator` is used in a multi-threaded setting.
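
The kind of race described here can be illustrated with a generic sketch. This is not the cudf.pandas implementation, and the names `_state`, `accelerator_enabled`, and `disable_accelerator` are hypothetical. If a context manager toggles a process-wide flag, one thread leaving the context can re-enable the accelerator while another thread is still inside it. Keeping the flag in `threading.local()` storage is one common way to avoid that:

```python
import threading
from contextlib import contextmanager

# Hypothetical per-thread state: each thread sees its own `disabled` attribute.
_state = threading.local()


def accelerator_enabled() -> bool:
    # Threads that never entered the context manager have no attribute set.
    return not getattr(_state, "disabled", False)


@contextmanager
def disable_accelerator():
    # Save and restore the previous value so nested use keeps working.
    previous = getattr(_state, "disabled", False)
    _state.disabled = True
    try:
        yield
    finally:
        _state.disabled = previous


def demo():
    seen_by_worker = []
    inside = threading.Event()
    checked = threading.Event()

    def worker():
        inside.wait()  # run only while the main thread is inside the context
        seen_by_worker.append(accelerator_enabled())
        checked.set()

    t = threading.Thread(target=worker)
    t.start()
    with disable_accelerator():
        inside.set()
        checked.wait()
        seen_by_main = accelerator_enabled()
    t.join()
    # (main thread inside context, worker at the same time, main thread after)
    return seen_by_main, seen_by_worker[0], accelerator_enabled()
```

In this sketch the worker thread is unaffected by the main thread's `with` block, which is the property a shared global flag cannot guarantee.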

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)

URL: #17811
@github-actions github-actions bot added the cudf.pandas Issues specific to cudf.pandas label Jan 25, 2025
Contributes to #7795

This PR addresses most of the relaxed constexpr usage in cuIO.

Authors:
  - Yunsong Wang (https://github.com/PointKernel)

Approvers:
  - Basit Ayantunde (https://github.com/lamarrr)
  - Vukasin Milovanovic (https://github.com/vuule)

URL: #17746
@rapids-bot rapids-bot bot requested a review from a team as a code owner January 26, 2025 18:31
@rapids-bot rapids-bot bot requested a review from lamarrr January 26, 2025 18:31
@github-actions github-actions bot added the libcudf Affects libcudf (C++/CUDA) code. label Jan 26, 2025
@davidwendt davidwendt assigned GPUtester and unassigned GPUtester Jan 26, 2025
Matt711 and others added 2 commits January 27, 2025 08:05
Contributes to rapidsai/build-planning#136

For nightly builds, some `wheel-build-{project}` jobs currently wait to start until some other `wheel-publish-{dependency}` jobs complete. This is unnecessary: `wheel-build-{dependency}` jobs upload packages to S3, which is where `wheel-build-{project}` jobs download them from.

This proposes changing that so that all nightly `wheel-build-*` jobs depend only on other `wheel-build-*` jobs. This should decrease the end-to-end time it takes for all wheels to be built and published on nightly / branch builds.

Also updates `pre-commit` config to the latest `rapids-dependency-file-generator` version.

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Kyle Edwards (https://github.com/KyleFromNVIDIA)

URL: #17792
@rapids-bot rapids-bot bot requested a review from a team as a code owner January 27, 2025 16:07
Matt711 and others added 3 commits January 27, 2025 11:08
## Description
This PR fixes `pre-commit.ci` failures.

## Checklist
- [x] I am familiar with the [Contributing Guidelines](https://github.com/rapidsai/cudf/blob/HEAD/CONTRIBUTING.md).
- [x] New or existing tests cover these changes.
- [x] The documentation is up to date with these changes.

Co-authored-by: Vyas Ramasubramani <vyasr@nvidia.com>
Matt711 and others added 6 commits January 27, 2025 09:34
…e string cols with chunked parquet reader (#17702)

Closes #17692.

This PR enables computing the `str_offset` needed to correctly build the offsets columns for nested large-string columns in the chunked Parquet reader when `chunk_read_limit` is small enough to produce multiple output table chunks per subpass.

Authors:
  - Muhammad Haseeb (https://github.com/mhaseeb123)

Approvers:
  - Yunsong Wang (https://github.com/PointKernel)
  - Ed Seidl (https://github.com/etseidl)
  - Vukasin Milovanovic (https://github.com/vuule)

URL: #17702
@github-actions github-actions bot added the cudf-polars Issues specific to cudf-polars label Jan 28, 2025
wence- and others added 3 commits January 28, 2025 07:20
A small new string feature.

Authors:
  - Lawrence Mitchell (https://github.com/wence-)

Approvers:
  - Matthew Murray (https://github.com/Matt711)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #17755
This PR applies `ruff` (`check` and `format`) everywhere, including notebooks and utility scripts. This allows us to drop our use of `nbqa`, since `ruff` natively supports notebooks. (xref: #17819, #17805)

I manually updated a few notebooks that were using old NumPy syntax for generating random values.
Closes #17461.

I also updated the `ruff` version to 0.9.3.

Authors:
  - Bradley Dice (https://github.com/bdice)
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - Matthew Murray (https://github.com/Matt711)

URL: #17820
Before embarking on more rolling window performance optimizations and code changes, let's introduce some new benchmarks that:

- demonstrate the bad algorithmic behavior of large-window rolling aggregations;
- cover the range-based rolling interface.
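
To make the large-window concern concrete, here is a pure-Python illustration. It is not libcudf code and makes no claim about how libcudf computes windows; the function names are hypothetical. Recomputing every window from scratch costs O(n * window), whereas a running sum costs O(n) regardless of window size:

```python
def rolling_sum_naive(values, window):
    # Re-sums every trailing window: O(n * window) work.
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        out.append(sum(values[start : i + 1]))
    return out


def rolling_sum_running(values, window):
    # Adds the entering element and subtracts the leaving one: O(n) work.
    out = []
    total = 0
    for i, v in enumerate(values):
        total += v
        if i >= window:
            total -= values[i - window]
        out.append(total)
    return out
```

Both functions agree on integer input; with floating-point input the running version can accumulate rounding differences, which is one reason the naive form is sometimes kept despite its cost.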

Authors:
  - Lawrence Mitchell (https://github.com/wence-)
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Bradley Dice (https://github.com/bdice)
  - Vukasin Milovanovic (https://github.com/vuule)

URL: #17787
@rapids-bot rapids-bot bot requested a review from a team as a code owner January 28, 2025 08:50
@github-actions github-actions bot added the CMake CMake build issue label Jan 28, 2025
galipremsagar and others added 4 commits January 28, 2025 09:45
There is a timeout failure in nightly tests: https://github.com/rapidsai/cudf/actions/runs/12983287834/job/36204344253

It looks like CI runs can get very slow at times, so this PR bumps up the timeout. The test exists only to guard against a hang, so a 20s timeout should still be fine.
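
As a generic illustration of a hang guard (how the cudf test actually applies its timeout is not shown here, and `finishes_within` is a hypothetical helper), the check only needs to distinguish "finished eventually" from "never finished", so a generous limit costs nothing on a healthy run:

```python
import threading


def finishes_within(func, timeout_s):
    # Run `func` in a daemon thread so a real hang cannot block interpreter exit.
    worker = threading.Thread(target=func, daemon=True)
    worker.start()
    # join() returns as soon as the thread ends, or after `timeout_s` seconds.
    worker.join(timeout_s)
    return not worker.is_alive()
```

A call such as `finishes_within(some_callable, 20.0)` returns immediately when the callable is fast, and only waits the full 20 seconds when it hangs.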

Authors:
  - GALI PREM SAGAR (https://github.com/galipremsagar)

Approvers:
  - Matthew Murray (https://github.com/Matt711)

URL: #17829
…pat (#17822)

closes #17786

Authors:
  - Matthew Roeschke (https://github.com/mroeschke)
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Lawrence Mitchell (https://github.com/wence-)

URL: #17822
The previous [strings PR](#17286) significantly reduced Parquet reader performance for very long strings (lengths of ~1024 and longer). This PR fixes the regression by capping each memcpy at 8 bytes (the length that yielded the best performance). In addition, up to all of the threads in a block can now work on the same string, rather than only the threads in a single warp.

**PERFORMANCE:**
Short strings: Unchanged
Length 1024: 25% faster
Longer lengths (up to 64k): Up to 90% faster, same as before strings PR
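
A rough host-side model of the copy scheme described above, written in Python for illustration only. The real code is a CUDA kernel; the function name and the strided assignment of chunks to threads are assumptions made for this sketch. Each simulated thread copies at most 8 bytes at a time, and all threads in the block share one string:

```python
MAX_COPY_BYTES = 8  # per-copy cap taken from the PR description


def cooperative_copy(src: bytes, num_threads: int) -> bytes:
    dst = bytearray(len(src))
    stride = num_threads * MAX_COPY_BYTES
    # Simulate the threads one after another; on a GPU they would run in parallel.
    for tid in range(num_threads):
        # Thread `tid` handles chunk tid, tid + num_threads, tid + 2 * num_threads, ...
        for offset in range(tid * MAX_COPY_BYTES, len(src), stride):
            end = min(offset + MAX_COPY_BYTES, len(src))
            dst[offset:end] = src[offset:end]
    return bytes(dst)
```

With more threads available per string, a long string is split into more chunks that can be copied concurrently, while a short string simply leaves most threads idle.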

Authors:
  - Paul Mattione (https://github.com/pmattione-nvidia)
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Vukasin Milovanovic (https://github.com/vuule)
  - Muhammad Haseeb (https://github.com/mhaseeb123)

URL: #17773
…ader (#17708)

Closes #17689 

This PR resolves a bug in the multi-batch JSON reader, wherein the reader threw an error when the column schemas of any two partial tables from different batches did not match. We now enforce the column ordering of the first partial table (i.e., the table returned by the first batch) in all succeeding batches.
The added test passes three strings to the reader as three separate batches by setting the batch size to the size of the first string.
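
The ordering rule can be sketched in plain Python, with each partial table modeled as a dict from column name to values. This is a simplified model, not the libcudf reader: `align_to_first` is a hypothetical helper, and since the description above does not say how columns missing from (or new in) later batches are treated, this sketch simply raises when the column sets differ.

```python
def align_to_first(partial_tables):
    # The column order of the first batch's table is the reference order.
    if not partial_tables:
        return []
    reference = list(partial_tables[0].keys())
    aligned = []
    for table in partial_tables:
        if set(table) != set(reference):
            raise ValueError("column sets differ; not handled in this sketch")
        # Rebuild the dict so its key order follows the reference order.
        aligned.append({name: table[name] for name in reference})
    return aligned
```

For example, `align_to_first([{"a": [1], "b": ["x"]}, {"b": ["y"], "a": [2]}])` yields two tables whose columns both appear in the order `a`, `b`, so they can be concatenated column by column.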

Authors:
  - Shruti Shivakumar (https://github.com/shrshi)
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Vukasin Milovanovic (https://github.com/vuule)
  - Karthikeyan (https://github.com/karthikeyann)
  - Paul Mattione (https://github.com/pmattione-nvidia)

URL: #17708
@vyasr
Contributor

vyasr commented Jan 28, 2025

Closing so that the bot can open a new PR for the next set of changes now that #17828 is merged with most of the commits.

@vyasr vyasr closed this Jan 28, 2025
Labels

- CMake: CMake build issue
- cudf.pandas: Issues specific to cudf.pandas
- cudf-polars: Issues specific to cudf-polars
- libcudf: Affects libcudf (C++/CUDA) code.
- pylibcudf: Issues specific to the pylibcudf package
- Python: Affects Python cuDF API.
