-
Notifications
You must be signed in to change notification settings - Fork 914
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FEA] Expose public stream-ordered C++ APIs #13744
Labels
2 - In Progress
Currently a work in progress
feature request
New feature or request
libcudf
Affects libcudf (C++/CUDA) code.
Milestone
Comments
vyasr
added
feature request
New feature or request
Needs Triage
Need team to review and classify
labels
Jul 24, 2023
GregoryKimball
added
libcudf
Affects libcudf (C++/CUDA) code.
2 - In Progress
Currently a work in progress
and removed
Needs Triage
Need team to review and classify
labels
Jul 25, 2023
This was referenced Sep 7, 2023
3 tasks
rapids-bot bot
pushed a commit
that referenced
this issue
Sep 18, 2023
Add stream parameter to public strings APIs: - `cudf::strings::capitalize()` - `cudf::strings::title()` - `cudf::strings::is_title()` - `cudf::strings::to_lower()` - `cudf::strings::to_upper()` - `cudf::strings::swapcase()` Reference #13744 Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Mark Harris (https://github.com/harrism) - Vyas Ramasubramani (https://github.com/vyasr) URL: #14056
rapids-bot bot
pushed a commit
that referenced
this issue
Sep 21, 2023
Add stream parameter to public APIs: - `cudf::strings::find()` - `cudf::strings::rfind()` - `cudf::strings::contains()` - `cudf::strings::starts_with()` - `cudf::strings::ends_with()` - `cudf::strings::findall()` - `cudf::strings::find_multiple()` Also cleaned up some of the doxygen comments. Reference #13744 Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Vukasin Milovanovic (https://github.com/vuule) URL: #14060
rapids-bot bot
pushed a commit
that referenced
this issue
Sep 22, 2023
Add stream parameter to public APIs: - `nvtext::generate_ngrams()` - `nvtext::generate_character_ngrams()` - `nvtext::hash_character_ngrams()` - `nvtext::ngrams_tokenize()` Also cleaned up some of the doxygen comments. And also fixed a spelling mistake in the jaccard.cu source that was bothering me. Reference #13744 Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Yunsong Wang (https://github.com/PointKernel) - Vyas Ramasubramani (https://github.com/vyasr) URL: #14061
rapids-bot bot
pushed a commit
that referenced
this issue
Sep 25, 2023
This PR adds stream parameter to public dictionary APIs, which include: 1. `cudf::dictionary::encode` 2. `cudf::dictionary::decode` 3. `cudf::dictionary::get_index` 4. `cudf::dictionary::add_keys` 5. `cudf::dictionary::remove_keys` 6. `cudf::dictionary::remove_unused_keys` 7. `cudf::dictionary::set_keys` 8. `cudf::dictionary::match_dictionaries` Reference [13744](#13744) Authors: - Suraj Aralihalli (https://github.com/SurajAralihalli) - Yunsong Wang (https://github.com/PointKernel) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) URL: #14115
This was referenced Oct 4, 2023
rapids-bot bot
pushed a commit
that referenced
this issue
Oct 10, 2023
Add stream parameter to public APIs: - `cudf::strings::strip()` - `cudf::strings::slice_strings()` - `cudf::strings::pad()` - `cudf::strings::zfill()` - `cudf::strings::wrap()` Also cleaned up some of the doxygen comments and added stream-tests. Reference #13744 Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Nghia Truong (https://github.com/ttnghia) URL: #14260
rapids-bot bot
pushed a commit
that referenced
this issue
Oct 11, 2023
Follow on to PR #13997 which did not include all the split APIs or a stream test. Add stream parameter to public APIs: - `cudf::strings::partition()` - `cudf::strings::rpartition()` - `cudf::strings::split_re()` - `cudf::strings::rsplit_re()` - `cudf::strings::split_record_re()` - `cudf::strings::rsplit_record_re()` Also cleaned up some of the doxygen comments. Reference #13744 Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Mark Harris (https://github.com/harrism) - Bradley Dice (https://github.com/bdice) - Nghia Truong (https://github.com/ttnghia) URL: #14247
This was referenced Oct 12, 2023
rapids-bot bot
pushed a commit
that referenced
this issue
Oct 12, 2023
Add stream parameter to public APIs: - `cudf::strings::replace()` (x2) - `cudf::strings::replace_slice()` - `cudf::strings::replace_re()` (x2) - `cudf::strings::replace_with_backrefs()` Also cleaned up some of the doxygen comments and added stream-tests. Reference #13744 Authors: - David Wendt (https://github.com/davidwendt) Approvers: - Bradley Dice (https://github.com/bdice) - Mike Wilson (https://github.com/hyperbolic2346) - Nghia Truong (https://github.com/ttnghia) URL: #14261
This was referenced Oct 13, 2023
rapids-bot bot
pushed a commit
that referenced
this issue
Oct 16, 2023
Add stream parameter to public APIs: - `cudf::strings::to_booleans()` - `cudf::strings::from_booleans()` - `cudf::strings::to_timestamps()` - `cudf::strings::from_timestamps()` - `cudf::strings::is_timestamp()` - `cudf::strings::to_durations()` - `cudf::strings::from_durations()` - `cudf::strings::to_fixed_point()` - `cudf::strings::from_fixed_point()` - `cudf::strings::to_floats()` - `cudf::strings::from_floats()` - `cudf::strings::is_float()` - `cudf::strings::to_integers()` - `cudf::strings::from_integers()` - `cudf::strings::is_integer()` - `cudf::strings::hex_to_integers()` - `cudf::strings::integers_to_hex()` - `cudf::strings::is_hex()` - `cudf::strings::ipv4_to_integers()` - `cudf::strings::integers_to_ipv4()` - `cudf::strings::is_ipv4()` - `cudf::strings::url_encode()` - `cudf::strings::url_decode()` - `cudf::strings::format_list_column()` Also cleaned up some of the doxygen comments and removed some default parameters. Reference #13744 Authors: - David Wendt (https://github.com/davidwendt) Approvers: - MithunR (https://github.com/mythrocks) - Nghia Truong (https://github.com/ttnghia) URL: #14255
rapids-bot bot
pushed a commit
that referenced
this issue
Oct 16, 2023
This PR introduces the stream parameter to the List Sorting and Filtering APIs. Sorting and Filtering (`extract.hpp`, `filling.hpp`, `gather.hpp`, `reverse.hpp`, `sorting.hpp`, `stream_compaction.hpp`) ``` extract_list_element - index extract_list_element - indices segmented_gather sequences - without steps sequences - with steps reverse sort_lists stable_sort_lists apply_boolean_mask distinct ``` Reference [13744](#13744) Authors: - Suraj Aralihalli (https://github.com/SurajAralihalli) Approvers: - Nghia Truong (https://github.com/ttnghia) - Vyas Ramasubramani (https://github.com/vyasr) URL: #14272
3 tasks
Thanks @lamarrr for reworking this issue! |
rapids-bot bot
pushed a commit
that referenced
this issue
Sep 20, 2024
Adds stream ordering to the public join APIs: - `inner_join` - `left_join` - `full_join` - `left_semi_join` - `left_anti_join` - `cross_join` - `conditional_inner_join` - `conditional_left_join` - `conditional_full_join` - `conditional_left_semi_join` - `conditional_left_anti_join` - `mixed_inner_join` - `mixed_left_join` - `mixed_full_join` - `mixed_left_semi_join` - `mixed_left_anti_join` - `mixed_inner_join_size` - `mixed_left_join_size` - `conditional_inner_join_size` - `conditional_left_join_size` - `conditional_left_semi_join_size` - `conditional_left_anti_join_size` closes #16792 follows up #13744 Authors: - Basit Ayantunde (https://github.com/lamarrr) Approvers: - Paul Mattione (https://github.com/pmattione-nvidia) - Nghia Truong (https://github.com/ttnghia) - David Wendt (https://github.com/davidwendt) URL: #16793
rapids-bot bot
pushed a commit
that referenced
this issue
Sep 21, 2024
This merge request exposes stream-ordering to the public-facing datetime APIs. - `extract_year` - `extract_month` - `extract_day` - `extract_weekday` - `extract_hour` - `extract_minute` - `extract_second` - `extract_millisecond_fraction` - `extract_microsecond_fraction` - `extract_nanosecond_fraction` - `last_day_of_month` - `day_of_year` - `add_calendrical_months` - `is_leap_year` - `days_in_month` - `extract_quarter` - `ceil_datetimes` - `floor_datetimes` - `round_datetimes` Follows-up #13744 Closes #16775 Authors: - Basit Ayantunde (https://github.com/lamarrr) Approvers: - Karthikeyan (https://github.com/karthikeyann) - Yunsong Wang (https://github.com/PointKernel) URL: #16774
This was referenced Oct 30, 2024
rapids-bot bot
pushed a commit
that referenced
this issue
Oct 31, 2024
Add stream parameter to public APIs: ``` cudf::partition cudf::round_robin_partition ``` Added stream gtest for above two functions and for `hash_partition`. Reference: #13744 Authors: - Shruti Shivakumar (https://github.com/shrshi) Approvers: - Nghia Truong (https://github.com/ttnghia) - Vukasin Milovanovic (https://github.com/vuule) URL: #17213
rapids-bot bot
pushed a commit
that referenced
this issue
Nov 1, 2024
Contributes to #13744 Authors: - Matthew Murray (https://github.com/Matt711) Approvers: - Nghia Truong (https://github.com/ttnghia) - Bradley Dice (https://github.com/bdice) URL: #16925
rapids-bot bot
pushed a commit
that referenced
this issue
Nov 4, 2024
Add stream parameter to public APIs: ``` nvtext::subword_tokenize nvtext::load_vocabulary_file ``` Added stream gtest. Reference: #13744 Authors: - Shruti Shivakumar (https://github.com/shrshi) - Muhammad Haseeb (https://github.com/mhaseeb123) Approvers: - Nghia Truong (https://github.com/ttnghia) - David Wendt (https://github.com/davidwendt) - Muhammad Haseeb (https://github.com/mhaseeb123) URL: #17206
3 tasks
rapids-bot bot
pushed a commit
that referenced
this issue
Nov 12, 2024
Adds stream parameter to ``` cudf::quantile cudf::quantiles cudf::percentile_approx ``` Added stream gtests to verify correct stream forwarding. Reference: #13744 Authors: - Shruti Shivakumar (https://github.com/shrshi) Approvers: - Paul Mattione (https://github.com/pmattione-nvidia) - David Wendt (https://github.com/davidwendt) URL: #17257
This was referenced Nov 12, 2024
Closed
rapids-bot bot
pushed a commit
that referenced
this issue
Nov 13, 2024
Adds stream parameter to `cudf::transpose`. Verifies correct stream forwarding with stream gtests. Reference: #13744 Authors: - Shruti Shivakumar (https://github.com/shrshi) Approvers: - Nghia Truong (https://github.com/ttnghia) - David Wendt (https://github.com/davidwendt) URL: #17294
This was referenced Nov 14, 2024
rapids-bot bot
pushed a commit
that referenced
this issue
Nov 21, 2024
Adds stream parameter to ``` cudf::from_dlpack cudf::to_dlpack ``` Added stream gtests to verify correct stream forwarding. Reference: #13744 Authors: - Shruti Shivakumar (https://github.com/shrshi) Approvers: - David Wendt (https://github.com/davidwendt) - Tianyu Liu (https://github.com/kingcrimsontianyu) URL: #17397
3 tasks
rapids-bot bot
pushed a commit
that referenced
this issue
Nov 25, 2024
Adds stream parameter to ``` cudf::groupby::scan cudf::groupby::aggregate cudf::groupby::shift cudf::groupby::get_groups cudf::groupby::replace_nulls ``` Added stream gtests to verify correct stream forwarding. Reference: #13744 Authors: - Shruti Shivakumar (https://github.com/shrshi) Approvers: - Paul Mattione (https://github.com/pmattione-nvidia) - Nghia Truong (https://github.com/ttnghia) - Bradley Dice (https://github.com/bdice) URL: #17324
rapids-bot bot
pushed a commit
that referenced
this issue
Nov 25, 2024
Adds stream parameter to ``` cudf::strings::count_characters cudf::strings::count_bytes cudf::strings::code_points ``` Added stream gtests to verify correct stream forwarding. Reference: #13744 Authors: - Shruti Shivakumar (https://github.com/shrshi) Approvers: - David Wendt (https://github.com/davidwendt) - Nghia Truong (https://github.com/ttnghia) - Bradley Dice (https://github.com/bdice) URL: #17398
This was referenced Nov 25, 2024
rapids-bot bot
pushed a commit
that referenced
this issue
Dec 1, 2024
Adds stream parameter to ``` cudf::normalize_nans_and_zeros ``` Reference: #13744 Authors: - Shruti Shivakumar (https://github.com/shrshi) Approvers: - MithunR (https://github.com/mythrocks) - Nghia Truong (https://github.com/ttnghia) URL: #17436
rapids-bot bot
pushed a commit
that referenced
this issue
Dec 4, 2024
Adds stream parameter to ``` cudf::nvtext::byte_pair_encoding ``` Added stream gtests to verify correct stream forwarding. Reference: #13744 Authors: - Shruti Shivakumar (https://github.com/shrshi) - David Wendt (https://github.com/davidwendt) Approvers: - Yunsong Wang (https://github.com/PointKernel) - David Wendt (https://github.com/davidwendt) URL: #17446
3 tasks
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
2 - In Progress
Currently a work in progress
feature request
New feature or request
libcudf
Affects libcudf (C++/CUDA) code.
Is your feature request related to a problem? Please describe.
Currently most libcudf APIs operate implicitly on the default CUDA stream. As documented in #925, we would like to expose stream-ordered APIs publicly. Prior to doing so, we needed to do a great deal of work to ensure that libcudf's internals were properly stream-ordered. Among the important tasks here were removing implicit APIs that ran on a stream (see #11968) and implementing a strategy for verifying that streams are properly forwarded from APIs all the way down to the lowest level functions (see #11943). Now that all such work that we are aware of has been completed, libcudf should start exposing streams in public APIs.
We originally considered adding stream support via a monolithic feature branch where APIs would be modified one at a time. From various discussions, however, we decided that the overhead of maintaining a feature branch would grow too large to be worthwhile, so we have instead decided to proceed by incrementally exposing streams in libcudf APIs over time. PRs adding new APIs should always include a stream in the public API. Existing APIs will be transitioned one header file at a time, with PRs going directly into the current main branch of cudf.
The purpose of this PR is to document progress on the task of transitioning existing headers, as well as to describe the process by which new stream-ordered APIs should be added.
Describe the solution you'd like
When adding stream support to an API, the stream parameter should be placed just before the mr parameter with a default value of
cudf::get_default_stream()
. A new test should be added that validates that the API properly forwards the stream to all CUDA calls, which can be done by following the instructions at the bottom of libcudf's testing developer docs.To divide up the work, we consider public headers organized by directories (all relative to
cpp/include/cudf
). Devs may assign themselves a directory within which to own all headers and the public APIs contained therein./cudf APIs
stream
param in transform APIs #16452/nvtext APIs
Additional context
Before we began making public stream parameters, libcudf public APIs would be paired with detail APIs that are nearly identical except for having a required stream parameter. In some cases, though, the public API may not currently have a detail "mirror" function but instead call multiple detail APIs. In those cases we may need to make a case-by-case determination of whether a new detail function should also be added. Some discussion of whether we want this pattern to be present everywhere, and the reasons why we might want that, are discussed in #14276.
Public stream API readiness:
The text was updated successfully, but these errors were encountered: