-
Notifications
You must be signed in to change notification settings - Fork 1k
Use Span thoughout pylibcudf
#21087
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Use Span thoughout pylibcudf
#21087
Conversation
| from .table cimport Table | ||
|
|
||
|
|
||
| cdef device_span[uint8_t] _get_device_span(object obj) except * |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want this to be in pylibcudf.utils and be capable of creating different types of device_spans (e.g. device_span<byte> which I'll need for cudf::table_to_array). I tried to use a fused type as the template type, but Cython cannot infer the type I want. Here is the compile error I get.
Return type is not specified as argument type
Since the input here input is just object object, the cython cannot infer the return type. Any ideas here @vyasr?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cython doesn't support this natively. The options are either to write separate Cython functions that return each concrete type and then having a wrapper function that dispatches to the right one (e.g. based on an enum value) or to use inline C++ to define the function so that you can export a templated version directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the only APIs that are really relevant for #21069 are in contiguous split and reshape (please correct me if I'm missing something). Since its only two, its probably okay to have two seperate functions to create device_span[uint8_t] and a device_span[byte].
I'm going to punt the reshape API (ie. pylibcudf.reshape.table_to_array) to a follow up PR because it would be a breaking change. Currently the API takes in a ptr and a size. The new API would take Span, so we'd probably have to deprecate it.
Aside: There are other places like in I/O types and null mask. But they really only need type checks using is_span (I've added them in this PR).
Span in contiguous split APIsSpan thoughout pylibcudf
Add test_basic_indexing and test_iloc_float_raises[string-python-Series] to NODEIDS_THAT_ARE_FLAKY_WITH_COPY_ON_WRITE to prevent non-deterministic test failures in the pandas compatibility test suite. Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Add test coverage for TypeError raised when non-Span objects are passed to `copy_bitmask_from_bitmask` and `null_count` functions, and ValueError raised when empty metadata is provided with non-empty data to `unpack_from_memoryviews`. Co-Authored-By: Claude Sonnet 4.5 <[email protected]>
Description
Contributes to #21069 by updating pylibcudf APIs to accept any type supporting the Span protocol.
Checklist