Set Algorithms Performance Improvements #2147

Merged
danhoeflinger merged 145 commits into main from dev/dhoeflin/balanced_path on May 5, 2025

Conversation

danhoeflinger
Contributor

@danhoeflinger danhoeflinger commented Mar 26, 2025

Implement balanced path with reduce then scan skeleton.
Performance improvements for set_intersection, set_difference, set_union, and set_symmetric_difference.

Adds a non-exhaustive set of unit tests for balanced path building block functions in balanced_path_unit_tests.pass.
Changes set tests to encourage more overlap between inputs (better coverage).
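
For readers less familiar with the four algorithms named above, here is a minimal serial sketch of their multiset semantics using the standard library versions. It is not the oneDPL implementation from this PR, and the helper names (`intersection_of` and so on) are made up for illustration. Inputs with repeated, overlapping values are the case the updated tests aim to exercise more.

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

using Vec = std::vector<int>;

// Hypothetical helpers: each one assumes both inputs are sorted ascending.
// With duplicates, an element appearing m times in a and n times in b shows up
// min(m, n) times in the intersection, max(m, n) times in the union,
// max(m - n, 0) times in the difference, and |m - n| times in the
// symmetric difference.
inline Vec intersection_of(const Vec& a, const Vec& b)
{
    Vec out;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(out));
    return out;
}

inline Vec union_of(const Vec& a, const Vec& b)
{
    Vec out;
    std::set_union(a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(out));
    return out;
}

inline Vec difference_of(const Vec& a, const Vec& b)
{
    Vec out;
    std::set_difference(a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(out));
    return out;
}

inline Vec symmetric_difference_of(const Vec& a, const Vec& b)
{
    Vec out;
    std::set_symmetric_difference(a.begin(), a.end(), b.begin(), b.end(), std::back_inserter(out));
    return out;
}
```

For `a = {1, 2, 2, 3, 5}` and `b = {2, 2, 2, 3, 4}`, the intersection is `{2, 2, 3}` and the difference `a \ b` is `{1, 5}`.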

@danhoeflinger danhoeflinger force-pushed the dev/dhoeflin/balanced_path branch from 2af88f3 to 65fe7b6 Compare March 27, 2025 19:21
@danhoeflinger danhoeflinger changed the title from "[Draft] balanced path set operations" to "Set Algorithms Performance Improvements" Apr 7, 2025
@danhoeflinger danhoeflinger marked this pull request as ready for review April 7, 2025 12:42
@danhoeflinger danhoeflinger force-pushed the dev/dhoeflin/balanced_path branch from 8c0918d to d62d880 Compare April 7, 2025 12:49
@danhoeflinger danhoeflinger added this to the 2022.9.0 milestone Apr 7, 2025
Signed-off-by: Dan Hoeflinger <[email protected]>

should be fixes for set algs

bugfixes

cache to mask

better reusing existing code

clang format

cleanup

unfinished work
Comment on lines 321 to 322
Sequence<T1> in1(n, [](::std::size_t k) { return rand() % (std::max(std::size_t{3},k>>4)); });
Sequence<T2> in2(m, [m](::std::size_t k) { return ((m % 2) * rand() + rand()) % (std::max(std::size_t{3},k>>4)); });
Member

Suggested change
Sequence<T1> in1(n, [](::std::size_t k) { return rand() % (std::max(std::size_t{3},k>>4)); });
Sequence<T2> in2(m, [m](::std::size_t k) { return ((m % 2) * rand() + rand()) % (std::max(std::size_t{3},k>>4)); });
Sequence<T1> in1(n, [](::std::size_t k) { return rand() % (std::max(3ul,k>>4)); });
Sequence<T2> in2(m, [m](::std::size_t k) { return ((m % 2) * rand() + rand()) % (std::max(3ul,k>>4)); });

Contributor Author

I don't know if we can be sure that the type of 3ul matches std::size_t. The type of 3ul depends on the system environment. What is there is more verbose but, I believe, more reliable across platforms.
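
A minimal sketch of the portability concern, assuming a hypothetical helper `modulus_for` that stands in for the expression in the test lambdas. `std::max(const T&, const T&)` deduces a single `T` from both arguments, so `std::max(3ul, k >> 4)` only compiles where `unsigned long` and `std::size_t` are the same type. That holds on typical 64-bit Linux, but not on 64-bit Windows, where `std::size_t` is `unsigned long long`.

```cpp
#include <algorithm>
#include <cstddef>
#include <type_traits>

// Hypothetical helper mirroring the modulus used in the test data generators.
// std::size_t{3} has type std::size_t on every platform, so both arguments
// to std::max always agree and template deduction succeeds.
inline std::size_t modulus_for(std::size_t k)
{
    return std::max(std::size_t{3}, k >> 4);
}

// The brace-initialized literal really is a std::size_t.
static_assert(std::is_same<decltype(std::size_t{3}), std::size_t>::value,
              "std::size_t{3} must have type std::size_t");
```

An alternative with the same effect would be an explicit template argument, `std::max<std::size_t>(3, k >> 4)`.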

}
else
{
typedef typename ::std::iterator_traits<_OutputIterator>::value_type _ValueType;
Member

I know you're just moving this code but it's a good opportunity to modernize.

Suggested change
typedef typename ::std::iterator_traits<_OutputIterator>::value_type _ValueType;
using _ValueType = typename std::iterator_traits<_OutputIterator>::value_type;

Contributor Author

done

using TempData = __noop_temp_data;
template <typename _InRng, typename _IndexT>
std::uint16_t
operator()(const _InRng& __in_rng, _IndexT __id, __noop_temp_data& __temp_data) const
Member

Since TempData is defined right above this, it could make sense to use that instead.

Suggested change
operator()(const _InRng& __in_rng, _IndexT __id, __noop_temp_data& __temp_data) const
operator()(const _InRng& __in_rng, _IndexT __id, TempData& __temp_data) const

Contributor Author

done

__find_balanced_path_start_point(__rng1, __rng2, __rng1_pos, __rng2_pos, __comp);

//use sign bit to represent star offset
__rng1_temp_diag[__id] = __rng1_balanced_pos * (__star_offset ? -1 : 1);
Member

Sorry if this has already been asked, but is it safe to use the sign bit here? Is there a potential for the position to not fit in this integral type if it now can only store half of its range?

Contributor Author

This variable has the same type as the rng1.size() (or the difference_type of the first input iterator). A diagonal may only be as large as the minimum size of the two sets, so the index into the diagonal should fit in this type's positive section. All indices into the diagonal will be positive on their own, which leaves the sign bit available for this sort of info.
It would be good to add a comment or perhaps find a way to make this type more explicit.
I'll look into this.
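
The encoding discussed here can be sketched as follows. This is an illustration only: `index_t`, `encode_diag`, and `decode_diag` are hypothetical names, and `std::ptrdiff_t` stands in for the iterator's `difference_type`. The sketch also assumes a flagged position is never zero, since multiplying zero by -1 cannot carry a sign; the thread does not say how the actual code treats that case.

```cpp
#include <cstddef>
#include <utility>

// Assumed signed index type, standing in for the iterator's difference_type.
using index_t = std::ptrdiff_t;

// Pack a non-negative diagonal position and a boolean flag into one signed value.
// Assumption: pos > 0 whenever star is true, because 0 has no negative form.
inline index_t encode_diag(index_t pos, bool star)
{
    return pos * (star ? -1 : 1);
}

// Recover the position and the flag from the stored value.
inline std::pair<index_t, bool> decode_diag(index_t stored)
{
    const bool star = stored < 0;
    return {star ? -stored : stored, star};
}
```

Because every valid position is non-negative and no larger than the smaller input's size, negating it always stays within the range of the signed type.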

_Compare __comp;
};

//returns iterations consumed, and the number of elements copied
Member

This comment is referring to a return value but this function returns void.

Contributor Author

Good point. Clarified this comment, returns by reference.

}
__n = __last - __first;
// get closer and closer to binary search with more iterations
__shift_right_div -= 3;
Member

__shift_right_div is of type _Size1. Is that type signed or unsigned? Is there a risk for this subtraction to underflow? I suppose since it starts at 10 and is decremented by 3 each time, it shouldn't enter the loop when it reaches 1 due to the loop predicate. But is it possible to count up instead of down here?

Contributor Author

How about we just change this to be std::int8_t?
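
To make the concern concrete, here is a rough sketch of a lower-bound search that probes near the front of the range first and moves toward a plain binary search as the shift shrinks. It is not the code under review: `biased_lower_bound` and its loop predicate are assumptions, and only the starting value of 10, the step of 3, and the `std::int8_t` counter come from the thread.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical biased lower bound over a sorted vector.
// Returns the index of the first element that is not less than value.
inline std::size_t biased_lower_bound(const std::vector<int>& v, int value)
{
    std::size_t first = 0;
    std::size_t last = v.size();

    // Signed counter: the sequence 10, 7, 4, 1 stops at the predicate below, and
    // even one extra subtraction would give -2 rather than wrapping around.
    std::int8_t shift = 10;
    while (shift > 1 && first < last)
    {
        const std::size_t n = last - first;
        const std::size_t probe = first + (n >> shift); // probe close to the front
        if (v[probe] < value)
            first = probe + 1;
        else
            last = probe;
        shift -= 3; // each step probes closer to the midpoint
    }

    // Finish with an ordinary binary search on the remaining range.
    while (first < last)
    {
        const std::size_t mid = first + (last - first) / 2;
        if (v[mid] < value)
            first = mid + 1;
        else
            last = mid;
    }
    return first;
}
```

With an unsigned 8-bit counter, subtracting 3 from 1 would wrap to 254, so correctness would rest entirely on the loop predicate. A signed counter keeps the loop safe even if the predicate changes later.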

SergeyKopienko and others added 9 commits April 30, 2025 21:15
# Conflicts:
#	include/oneapi/dpl/pstl/hetero/algorithm_impl_hetero.h
#	include/oneapi/dpl/pstl/hetero/dpcpp/parallel_backend_sycl.h
Contributor

@mmichel11 mmichel11 left a comment

Small nitpicks. Otherwise, this PR LGTM.

Contributor

@mmichel11 mmichel11 left a comment

LGTM

@danhoeflinger danhoeflinger merged commit a414da0 into main May 5, 2025
19 checks passed
@danhoeflinger danhoeflinger deleted the dev/dhoeflin/balanced_path branch May 5, 2025 16:27
timmiesmith pushed a commit that referenced this pull request Jun 9, 2025
Adds balanced path implementation within reduce then scan framework for set algorithms

---
Signed-off-by: Dan Hoeflinger <[email protected]>