Contributor

@danhoeflinger danhoeflinger commented Nov 5, 2025

Dynamic Selection API: Backend and Policy Customization & Removal of Selection API

This PR refactors the Dynamic Selection API to introduce a flexible backend architecture and simplifies the user-facing API by removing the select() function. It also provides better tools for customizing backends and policies, giving users more flexibility.

Implements RFCs #2220 and #2489 (without the token policy, which will come in a separate PR).

Key Changes

Backend Architecture

  • Added policy_base.h and default_backend.h to provide common base classes for policies and backends respectively
  • Backends now accept a ResourceAdapter function to support different flavors of a resource with the same backend (e.g. sycl::queue vs. sycl::queue*)
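
The adapter idea can be sketched roughly as follows. This is a minimal illustration, not the oneDPL implementation: `fake_queue` stands in for sycl::queue so the code builds without a SYCL toolchain, and `toy_backend`, `identity_adapter`, and `deref_adapter` are hypothetical names. It assumes an adapter is simply a callable that maps a stored resource to the base resource type.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for sycl::queue (assumption, so the sketch needs no SYCL headers).
struct fake_queue
{
    int id;
};

// Hypothetical adapters: each maps a stored resource to the base resource.
struct identity_adapter
{
    fake_queue& operator()(fake_queue& q) const { return q; }
};

struct deref_adapter
{
    fake_queue& operator()(fake_queue* q) const { return *q; }
};

// Toy backend (hypothetical): stores resources of either flavor and goes
// through the adapter whenever it needs the underlying base resource.
template <typename Resource, typename Adapter>
class toy_backend
{
  public:
    explicit toy_backend(std::vector<Resource> resources, Adapter adapter = {})
        : resources_(std::move(resources)), adapter_(adapter)
    {
    }

    // Return the id of the base resource behind the i-th stored resource.
    int base_id(std::size_t i) { return adapter_(resources_[i]).id; }

  private:
    std::vector<Resource> resources_;
    Adapter adapter_;
};

// Small usage check; call it from main().
inline void adapter_example()
{
    fake_queue a{1}, b{2};
    toy_backend<fake_queue, identity_adapter> by_value(std::vector<fake_queue>{a, b});
    toy_backend<fake_queue*, deref_adapter> by_pointer(std::vector<fake_queue*>{&a, &b});
    assert(by_value.base_id(1) == 2);
    assert(by_pointer.base_id(0) == 1);
}
```

The same backend template serves both the value and the pointer flavor; only the adapter type changes.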

API Simplification

  • Removed select() function - selections are now internal implementation details
  • Added try_submit(), which always returns a std::optional quickly; the optional is empty if no resource could be obtained
  • Users now exclusively use the try_submit(), submit(), and submit_and_wait() functions
  • Policies expose try_select_impl() (returns std::optional) instead of public select()
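
A rough sketch of the non-blocking pattern described above, under stated assumptions: `toy_policy` and `toy_try_submit` are hypothetical names, resources are plain `int` ids, and the returned optional wraps the result of the user function. The actual oneDPL signatures and return types may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <type_traits>
#include <utility>
#include <vector>

// Toy policy (hypothetical, not a oneDPL policy): each resource is an int id
// that can be handed out to one task at a time.
class toy_policy
{
  public:
    explicit toy_policy(std::vector<int> ids) : ids_(std::move(ids)), busy_(ids_.size(), false) {}

    // Non-blocking selection: an empty optional means no resource is free.
    std::optional<std::size_t> try_select_impl()
    {
        for (std::size_t i = 0; i < ids_.size(); ++i)
        {
            if (!busy_[i])
            {
                busy_[i] = true;
                return i;
            }
        }
        return std::nullopt;
    }

    int resource(std::size_t slot) const { return ids_[slot]; }
    void release(std::size_t slot) { busy_[slot] = false; }

  private:
    std::vector<int> ids_;
    std::vector<bool> busy_;
};

// Returns immediately: either the result of f on a selected resource, or empty.
template <typename Policy, typename F>
std::optional<std::invoke_result_t<F&, int>> toy_try_submit(Policy& p, F&& f)
{
    const auto slot = p.try_select_impl();
    if (!slot)
        return std::nullopt;
    return f(p.resource(*slot));
}

// Small usage check; call it from main().
inline void try_submit_example()
{
    toy_policy p(std::vector<int>{10}); // a single resource
    auto first = toy_try_submit(p, [](int id) { return id + 1; });
    assert(first.has_value() && *first == 11);

    // This toy never frees a slot on its own, so the next attempt comes back empty.
    auto second = toy_try_submit(p, [](int id) { return id; });
    assert(!second.has_value());

    p.release(0);
    assert(toy_try_submit(p, [](int id) { return id; }).has_value());
}
```

Callers check the optional and decide for themselves whether to retry, fall back, or give up.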

Execution Info & Reporting

  • Backends validate reporting requirements at construction and filter incompatible resources
  • Clear runtime errors when no compatible resources remain after filtering

Summary of changes to look out for from individual components:

  • Documentation:
    • Removal of public select API
  • Examples:
    • Policy template argument adjustment (by resource first rather than backend)
  • Policies:
    • Adjustment to use policy_base, remove repetitive code
    • Implement try_select_impl instead of select
    • Implement initialize_impl instead of initialize
  • Sycl Backend:
    • Conversion to partial specialization of default backend, use of backend_base
    • Addition of ResourceAdapter to support different flavors from a single base resource / backend
    • Improve system of reporting requirements for better filtering and error messages
    • Adjust profiling support to require the SYCL extension for profiling (the only reliable way to time)
      • Filter devices accordingly
  • Tests:
    • Removal of testing of select before submit
    • Addition of default universe initialization testing
    • Addition of testing of use of resource adapter to support sycl::queue* in addition to sycl::queue

egfefey added 30 commits March 14, 2025 00:14
…t after, also removed submit from the sycl_backend. Now uses the base implementation
{
s.report(execution_info::task_submission);
}
- Returns an object that can be used to wait for all active submissions.
Contributor

I think we need to add a topic on writing custom policies and another on writing custom backends. Maybe this information belongs somewhere in one of those, as a table. It's extraneous to actually using the policies, so it may be confusing to retain in the individual policy topics.

Signed-off-by: Dan Hoeflinger <[email protected]>

auto
get_submission_group()
auto_tune_policy(deferred_initialization_t) {}
Contributor

@SergeyKopienko SergeyKopienko Nov 14, 2025

Should we mark this constructor as explicit?

Contributor Author

I'm not sure what we gain from this. I suppose we could, but basically all we would be preventing is
auto_tune_policy p = oneapi::dpl::experimental::deferred_initialization;
Is there another advantage I'm missing here?
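
For reference, the effect of `explicit` on a tag constructor can be checked in isolation. This is a standalone sketch: `implicit_policy` and `explicit_policy` are hypothetical stand-ins, and the tag type is redeclared locally rather than taken from oneDPL.

```cpp
#include <type_traits>

// Local stand-in for the deferred-initialization tag (redeclared for the sketch).
struct deferred_initialization_t
{
};
inline constexpr deferred_initialization_t deferred_initialization{};

// Hypothetical policy with a non-explicit tag constructor.
struct implicit_policy
{
    implicit_policy(deferred_initialization_t) {}
};

// Hypothetical policy with an explicit tag constructor.
struct explicit_policy
{
    explicit explicit_policy(deferred_initialization_t) {}
};

// Copy-initialization (`policy p = deferred_initialization;`) needs an
// implicit conversion, which only the non-explicit constructor provides.
static_assert(std::is_convertible_v<deferred_initialization_t, implicit_policy>);
static_assert(!std::is_convertible_v<deferred_initialization_t, explicit_policy>);

// Direct-initialization (`policy p(deferred_initialization);`) works for both.
static_assert(std::is_constructible_v<implicit_policy, deferred_initialization_t>);
static_assert(std::is_constructible_v<explicit_policy, deferred_initialization_t>);
```

So the only thing `explicit` removes here is the copy-initialization form and implicit conversions at call sites that accept a policy.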

if (index == use_best_resource)
{
- return selection_type{*this, t->best_resource_, t};
+ return std::make_shared<selection_type>(*this, t->best_resource_, t);
Contributor

Should we call this make_shared under mutex?

Contributor

I think you're right. I think for the sycl backend, where queues have reference semantics, it's a benign race on the value of best_resource_. But in general it could be an issue.

Contributor Author

This is within a std::lock_guard<std::mutex> RAII scope. Is the question in the other direction, for performance reasons (make_shared is too costly and might be able to move outside the mutex-protected section)?
Otherwise, I'm not sure I understand the comment.
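
The two options being discussed can be sketched as follows. This is an illustration only: `toy_state` and `toy_selection` are hypothetical, and the resource is an `int` rather than the real resource type.

```cpp
#include <memory>
#include <mutex>

// Hypothetical shared state: best_resource may be updated by other threads,
// so every read and write goes through the mutex.
struct toy_state
{
    std::mutex m;
    int best_resource = 0;
};

struct toy_selection
{
    int resource;
};

// Variant 1: read the shared value and allocate while holding the lock.
inline std::shared_ptr<toy_selection> select_inside_lock(toy_state& s)
{
    std::lock_guard<std::mutex> guard(s.m);
    return std::make_shared<toy_selection>(toy_selection{s.best_resource});
}

// Variant 2: copy the shared value under the lock, then allocate after the
// lock is released, which shortens the critical section.
inline std::shared_ptr<toy_selection> select_outside_lock(toy_state& s)
{
    int snapshot = 0;
    {
        std::lock_guard<std::mutex> guard(s.m);
        snapshot = s.best_resource;
    }
    return std::make_shared<toy_selection>(toy_selection{snapshot});
}
```

Both variants read `best_resource` safely; they differ only in whether the allocation happens inside the critical section.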

{
auto r = state_->resources_with_index_[index];
- return selection_type{*this, r, t};
+ return std::make_shared<selection_type>(*this, r, t);
Contributor

The same

Contributor Author

see above

static_assert(sizeof...(ReportReqs) == 0, "Default backend does not support reporting");

for (const auto& e : u)
resources_.push_back(e);
Contributor

Can we avoid this per-element copy?
For example, maybe just move all of u into resources_?

Contributor Author

This is a reasonable question about dynamic selection in general. This pattern is how the code was originally written and accepted. Addressing these concerns of non-trivial resource types is something that may be worth doing, but I don't think we want to continue to expand the scope of this PR to do this across all of dynamic selection.

For now, to remain consistent with the way this works in dynamic selection, I'd suggest to keep it and possibly address this in a future standalone PR across all of this feature.
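
For context, the two approaches look roughly like this in isolation. `copying_store` and `moving_store` are hypothetical classes used only to contrast the patterns; they are not part of dynamic selection.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Pattern from the diff: copy each element of the universe into the member.
template <typename T>
class copying_store
{
  public:
    explicit copying_store(const std::vector<T>& u)
    {
        for (const auto& e : u)
            resources_.push_back(e);
    }

    std::size_t size() const { return resources_.size(); }

  private:
    std::vector<T> resources_;
};

// Possible alternative: take the vector by value and move it into the member.
// Callers that pass an rvalue pay no per-element copy.
template <typename T>
class moving_store
{
  public:
    explicit moving_store(std::vector<T> u) : resources_(std::move(u)) {}

    std::size_t size() const { return resources_.size(); }

  private:
    std::vector<T> resources_;
};
```

With the by-value form, a caller passing an lvalue still gets one whole-vector copy, while a caller passing `std::move(u)` gets none.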

}

auto
get_submission_group()
Contributor

  1. Should this function be declared const?
  2. This implementation means you copy a std::vector instance (with a memory allocation) just to wait, which does not look quite right. Example:
s.get_submission_group().wait();

Contributor Author

It only captures a reference to the resources, not a copy.
Also, because the returned value holds a non-const reference, we shouldn't mark this const.
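
A minimal sketch of that design, with hypothetical names (`toy_resource`, `toy_submission_group`, `group_backend`) and a counter standing in for real waiting:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical resource that records how many times wait() was called.
struct toy_resource
{
    int waits = 0;
    void wait() { ++waits; }
};

// The group stores a reference to the backend's vector, so creating it
// allocates nothing and wait() acts on the original resources.
class toy_submission_group
{
  public:
    explicit toy_submission_group(std::vector<toy_resource>& r) : resources_(r) {}

    void wait()
    {
        for (auto& r : resources_)
            r.wait();
    }

  private:
    std::vector<toy_resource>& resources_; // reference, not a copy
};

class group_backend
{
  public:
    explicit group_backend(std::size_t n) : resources_(n) {}

    // Not const: in a const member, resources_ would be const and could not
    // bind to the non-const reference held by the returned group.
    toy_submission_group get_submission_group() { return toy_submission_group{resources_}; }

    int waits(std::size_t i) const { return resources_[i].waits; }

  private:
    std::vector<toy_resource> resources_;
};
```

Calling `b.get_submission_group().wait()` on a `group_backend b` updates the counters stored in `b` itself, which shows that no copy of the vector was made.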

using my_base = backend_base<ResourceType>;

template <typename... ReportReqs>
default_backend_impl(ReportReqs... reqs) : my_base(reqs...)
Contributor

Should we use && and std::forward here?

Contributor Author

I don't think so. These are known to be empty tag structures indicating reporting requirements with no member fields. Forwarding doesn't save anything and just adds complexity.
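
The point about empty tags can be illustrated with a small standalone sketch. The tag names `task_time_t` and `task_submission_t` and the helper `count_reqs` are assumptions made for illustration.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical reporting-requirement tags: empty structs with no members.
struct task_time_t
{
};
struct task_submission_t
{
};

static_assert(std::is_empty_v<task_time_t>);
static_assert(std::is_trivially_copyable_v<task_submission_t>);

// Taking the pack by value copies nothing of substance, because the tags
// carry no state; only their types matter.
template <typename... ReportReqs>
constexpr std::size_t count_reqs(ReportReqs...)
{
    return sizeof...(ReportReqs);
}

static_assert(count_reqs(task_time_t{}, task_submission_t{}) == 2);
```

Because the information lives entirely in the types, `ReportReqs&&` with `std::forward` would select the same behavior as pass-by-value here.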


public:
template <typename... ReportReqs>
default_backend(ReportReqs... reqs) : base_t(reqs...)
Contributor

the same

Contributor Author

see above

using resource_type = typename backend_t::resource_type;
using wait_type = typename backend_t::wait_type;
using execution_resource_t = typename base_t::execution_resource_t;
using load_t = int;
Contributor

Should we really have a signed type here?

Contributor Author

I wouldn't really object to this, but I also don't think it's really necessary.
We will never use that extra bit to represent load. If the load of something is ever greater than 2 billion, I'd be very surprised.
I don't think that using unsigned for values that cannot be negative is really good "documentation" either. It tends to be more likely to hide underflow bugs than fix them.
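
A small illustration of the underflow point, using hypothetical helper functions that decrement a load counter one time too many:

```cpp
#include <limits>

// Hypothetical helpers modelling "one task finished" on a load counter.
inline unsigned release_unsigned(unsigned load)
{
    return load - 1u; // wraps around to a huge value when load == 0
}

inline int release_signed(int load)
{
    return load - 1; // becomes -1 when load == 0, which is easy to detect
}

// With a signed counter, an extra release is visible as a negative value.
inline bool looks_valid(int load)
{
    return load >= 0;
}
```

With the unsigned version, an extra release turns a load of 0 into `std::numeric_limits<unsigned>::max()`, which looks like a heavily loaded resource rather than a bug; with the signed version, a check such as `looks_valid` catches it.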

get_submission_group()
dynamic_load_policy() { base_t::initialize(); }
dynamic_load_policy(deferred_initialization_t) {}
dynamic_load_policy(const std::vector<ResourceType>& u, ResourceAdapter adapter = {})
Contributor

What about replacing const std::vector<...>& with std::vector<...>&& in the class constructors, etc.?
Here and in all other places like this.

Contributor Author

Similar to my response above, I think this is something we could address in a standalone PR rather than expanding the scope of this PR way beyond customization concerns.
