Skip to content

Conversation

@Matt711
Copy link
Contributor

@Matt711 Matt711 commented Jan 20, 2026

Description

Contributes to #21069 by updating pylibcudf APIs to accept any type supporting the Span protocol.

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@Matt711 Matt711 requested a review from a team as a code owner January 20, 2026 17:35
@github-actions github-actions bot added Python Affects Python cuDF API. pylibcudf Issues specific to the pylibcudf package labels Jan 20, 2026
@GPUtester GPUtester moved this to In Progress in cuDF Python Jan 20, 2026
from .table cimport Table


cdef device_span[uint8_t] _get_device_span(object obj) except *
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want this to be in pylibcudf.utils and be capable of creating different types of device_spans (e.g. device_span<byte> which I'll need for cudf::table_to_array). I tried to use a fused type as the template type, but Cython cannot infer the type I want. Here is the compile error I get.

Return type is not specified as argument type

Since the input here input is just object object, the cython cannot infer the return type. Any ideas here @vyasr?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cython doesn't support this natively. The options are either to write separate Cython functions that return each concrete type and then having a wrapper function that dispatches to the right one (e.g. based on an enum value) or to use inline C++ to define the function so that you can export a templated version directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only APIs that are really relevant for #21069 are in contiguous split and reshape (please correct me if I'm missing something). Since its only two, its probably okay to have two seperate functions to create device_span[uint8_t] and a device_span[byte].

I'm going to punt the reshape API (ie. pylibcudf.reshape.table_to_array) to a follow up PR because it would be a breaking change. Currently the API takes in a ptr and a size. The new API would take Span, so we'd probably have to deprecate it.

Aside: There are other places like in I/O types and null mask. But they really only need type checks using is_span (I've added them in this PR).

@Matt711 Matt711 added improvement Improvement / enhancement to an existing function non-breaking Non-breaking change labels Jan 20, 2026
@Matt711 Matt711 changed the title Use Span in contiguous split APIs Use Span thoughout pylibcudf Jan 21, 2026
Matt711 and others added 5 commits January 20, 2026 21:51
Add test_basic_indexing and test_iloc_float_raises[string-python-Series]
to NODEIDS_THAT_ARE_FLAKY_WITH_COPY_ON_WRITE to prevent non-deterministic
test failures in the pandas compatibility test suite.

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
@github-actions github-actions bot added the cudf.pandas Issues specific to cudf.pandas label Jan 21, 2026
Matt711 and others added 2 commits January 21, 2026 16:33
Add test coverage for TypeError raised when non-Span objects are passed to
`copy_bitmask_from_bitmask` and `null_count` functions, and ValueError raised
when empty metadata is provided with non-empty data to `unpack_from_memoryviews`.

Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cudf.pandas Issues specific to cudf.pandas improvement Improvement / enhancement to an existing function non-breaking Non-breaking change pylibcudf Issues specific to the pylibcudf package Python Affects Python cuDF API.

Projects

Status: In Progress

Development

Successfully merging this pull request may close these issues.

2 participants