[Dynamic Selection] Customization of Backends and Policies #2508

danhoeflinger · 2025-11-05T01:31:11Z

Dynamic Selection API: Backend and Policy Customization & Removal of Selection API

This PR refactors the Dynamic Selection API to introduce a flexible backend architecture and simplifies the user-facing API by removing the select() function. It also gives users better tools for customizing backends and policies.

Implements RFCs #2220 and #2489 (without the token policy, which will be a separate PR).

Key Changes

Backend Architecture

Added policy_base.h and default_backend.h to provide common base classes for policies and backends respectively
Backends now accept ResourceAdapter function to support different flavors of resource with the same backend (e.g. sycl::queue vs sycl::queue*)
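To illustrate the adapter idea, here is a minimal self-contained sketch. It is not the oneDPL implementation: `fake_queue`, `identity_adapter`, `deref_adapter` and `toy_backend` are hypothetical stand-ins, and the only assumption taken from the description is that an adapter maps a stored resource flavor (value or pointer) to the base resource.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a base resource such as sycl::queue.
struct fake_queue
{
    int id;
};

// Adapter for resources stored by value: returns the resource itself.
struct identity_adapter
{
    fake_queue& operator()(fake_queue& q) const { return q; }
};

// Adapter for resources stored as pointers: dereferences to the base resource.
struct deref_adapter
{
    fake_queue& operator()(fake_queue* q) const { return *q; }
};

// Toy backend: stores whatever resource flavor the user supplies and
// reaches the base resource only through the adapter.
template <typename Resource, typename Adapter>
class toy_backend
{
  public:
    explicit toy_backend(std::vector<Resource> resources, Adapter adapter = {})
        : resources_(std::move(resources)), adapter_(adapter)
    {
    }

    // Returns the id of the base resource behind the i-th stored resource.
    int base_id(std::size_t i) { return adapter_(resources_[i]).id; }

  private:
    std::vector<Resource> resources_;
    Adapter adapter_;
};
```

The same `toy_backend` template serves both flavors; only the adapter type changes.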

API Simplification

Removed select() function - selections are now internal implementation details
Added try_submit, which always returns a std::optional quickly; the optional is empty if no resource could be obtained
Users now exclusively use try_submit, submit() and submit_and_wait() functions
Policies expose try_select_impl() (returns std::optional) instead of public select()
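The shape of the new API can be sketched without oneDPL. The `toy_policy` below is hypothetical and single-threaded; it only assumes what the list above states: a `try_select_impl()` that returns a `std::optional`, and a `try_submit` that returns an empty optional when no resource is available. The `release` function is an invention of this sketch to simulate work completing.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical toy policy over integer "resources". A resource stays busy
// after a submission until release() is called, so selection can fail.
class toy_policy
{
  public:
    explicit toy_policy(std::vector<int> resources)
        : resources_(std::move(resources)), busy_(resources_.size(), false)
    {
    }

    // Non-blocking: returns the function's result wrapped in std::optional,
    // or std::nullopt when no resource is currently free.
    template <typename F>
    auto try_submit(F&& f) -> std::optional<decltype(f(0))>
    {
        std::optional<std::size_t> slot = try_select_impl();
        if (!slot)
            return std::nullopt;
        return f(resources_[*slot]);
    }

    // Marks a resource as free again (stands in for work completing).
    void release(std::size_t slot) { busy_[slot] = false; }

  private:
    // Selection is an internal detail: picks the first free resource, if any.
    std::optional<std::size_t> try_select_impl()
    {
        for (std::size_t i = 0; i < busy_.size(); ++i)
        {
            if (!busy_[i])
            {
                busy_[i] = true;
                return i;
            }
        }
        return std::nullopt;
    }

    std::vector<int> resources_;
    std::vector<bool> busy_;
};
```

A blocking `submit` could be layered on top by retrying `try_submit` until it succeeds; that part is omitted here.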

Execution Info & Reporting

Backends validate reporting requirements at construction and filter incompatible resources
Clear runtime errors when no compatible resources remain after filtering
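As a rough illustration of construction-time filtering, the sketch below uses a hypothetical `toy_device` with a `supports_profiling` flag and a hypothetical `filter_for_profiling` helper. The actual backend checks and error text are not shown in this description, so treat both as assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <stdexcept>
#include <vector>

// Hypothetical device description; the real backend queries SYCL devices.
struct toy_device
{
    int id;
    bool supports_profiling;
};

// Keeps only devices that satisfy the reporting requirement and fails
// loudly if nothing is left.
std::vector<toy_device>
filter_for_profiling(const std::vector<toy_device>& universe, bool profiling_required)
{
    std::vector<toy_device> kept;
    std::copy_if(universe.begin(), universe.end(), std::back_inserter(kept),
                 [profiling_required](const toy_device& d) {
                     return !profiling_required || d.supports_profiling;
                 });
    if (kept.empty())
        throw std::runtime_error("no compatible resources remain after filtering");
    return kept;
}
```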

Summary of changes to look out for from individual components:

Documentation:
- Removal of public select API
Examples:
- Policy template argument adjustment (by resource first rather than backend)
Policies:
- Adjustment to use policy_base, remove repetitive code
- Implement try_select_impl instead of select
- Implement initialize_impl instead of initialize
Sycl Backend:
- Conversion to partial specialization of default backend, use of backend_base
- Addition of ResourceAdapter to support different flavors from a single base resource / backend
- Improve system of reporting requirements for better filtering and error messages
- Adjust support for profiling to require sycl extension for profiling (only reliable way to time)
  - Filter devices accordingly
Tests:
- Removal of testing of select before submit
- Addition of default universe initialization testing
- Addition of testing of use of resource adapter to support sycl::queue* in addition to sycl::queue


Signed-off-by: Dan Hoeflinger <[email protected]>


vossmjp · 2025-11-13T17:16:30Z

documentation/library_guide/dynamic_selection_api/auto_tune_policy.rst

-  {
-    s.report(execution_info::task_submission);
-  }
+    - Returns an object that can be used to wait for all active submissions.


I think we need to add a topic on writing custom policies and another on writing custom backends. Maybe this information belongs in one of those, as a table. It's extraneous to actually using the policies, so it may be confusing to retain in the individual policy topics.


SergeyKopienko · 2025-11-14T09:06:35Z

include/oneapi/dpl/internal/dynamic_selection_impl/auto_tune_policy.h


-    auto
-    get_submission_group()
+    auto_tune_policy(deferred_initialization_t) {}


Should we mark this constructor as explicit?

I'm not sure what we benefit from this. I suppose we could, but basically we are preventing
auto_tune_policy p = oneapi::dpl::experimental::deferred_initialization;
Is there another advantage I'm missing here?
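For reference, the difference under discussion can be shown with a small sketch. The tag and policy types here are hypothetical stand-ins, not the oneDPL types; the point is only that `explicit` removes the implicit conversion from the tag while direct initialization keeps working.

```cpp
#include <type_traits>

// Hypothetical stand-in for a deferred-initialization tag.
struct deferred_tag_t
{
};
inline constexpr deferred_tag_t deferred_tag{};

// Non-explicit: `implicit_policy p = deferred_tag;` compiles.
struct implicit_policy
{
    implicit_policy(deferred_tag_t) {}
};

// Explicit: copy-initialization from the tag is rejected,
// but `explicit_policy p{deferred_tag};` still works.
struct explicit_policy
{
    explicit explicit_policy(deferred_tag_t) {}
};

static_assert(std::is_convertible_v<deferred_tag_t, implicit_policy>);
static_assert(!std::is_convertible_v<deferred_tag_t, explicit_policy>);
static_assert(std::is_constructible_v<explicit_policy, deferred_tag_t>);
```

Besides the copy-initialization form, `explicit` also stops the tag from silently converting to a policy when passed to a function that takes the policy by value or const reference.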

SergeyKopienko · 2025-11-14T09:25:21Z

include/oneapi/dpl/internal/dynamic_selection_impl/auto_tune_policy.h

            if (index == use_best_resource)
            {
-                return selection_type{*this, t->best_resource_, t};
+                return std::make_shared<selection_type>(*this, t->best_resource_, t);


Should we call this make_shared under mutex?

I think you're right. I think for the sycl backend, where queues have reference semantics, it's a benign race on the value of best_resource_. But in general it could be an issue.

This is within a std::lock_guard<std::mutex> RAII scope. Is the question in the other direction, for performance reasons (make_shared is too costly and might be able to move outside the mutexed section)?
Otherwise, I'm not sure I understand the comment.
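The two options being discussed can be sketched as follows. This is a simplified, hypothetical `toy_tuner`, not the auto_tune_policy code: one function allocates while holding the lock, the other copies the shared value under the lock and allocates after releasing it.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical selection object that records the chosen resource.
struct toy_selection
{
    int resource;
};

class toy_tuner
{
  public:
    // Reads best_resource_ and allocates while the lock is held.
    // The return value is constructed before `lock` is destroyed.
    std::shared_ptr<toy_selection> select_best()
    {
        std::lock_guard<std::mutex> lock(m_);
        return std::make_shared<toy_selection>(toy_selection{best_resource_});
    }

    // Copies the shared value under the lock, then allocates outside it.
    std::shared_ptr<toy_selection> select_best_short_lock()
    {
        int best;
        {
            std::lock_guard<std::mutex> lock(m_);
            best = best_resource_;
        }
        return std::make_shared<toy_selection>(toy_selection{best});
    }

    void set_best(int r)
    {
        std::lock_guard<std::mutex> lock(m_);
        best_resource_ = r;
    }

  private:
    std::mutex m_;
    int best_resource_ = 0;
};
```

Both variants read `best_resource_` under the mutex, so neither races on it; they differ only in how long the lock is held.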

SergeyKopienko · 2025-11-14T09:25:28Z

include/oneapi/dpl/internal/dynamic_selection_impl/auto_tune_policy.h

            {
                auto r = state_->resources_with_index_[index];
-                return selection_type{*this, r, t};
+                return std::make_shared<selection_type>(*this, r, t);


SergeyKopienko · 2025-11-14T09:32:25Z

include/oneapi/dpl/internal/dynamic_selection_impl/default_backend.h

+        static_assert(sizeof...(ReportReqs) == 0, "Default backend does not support reporting");
+
+        for (const auto& e : u)
+            resources_.push_back(e);


Can we avoid this per-element copy?
For example, maybe just move all of u into resources_?

This is a reasonable question about dynamic selection in general. This pattern is how the code was originally written and accepted. Addressing these concerns of non-trivial resource types is something that may be worth doing, but I don't think we want to continue to expand the scope of this PR to do this across all of dynamic selection.

For now, to remain consistent with the way this works in dynamic selection, I'd suggest to keep it and possibly address this in a future standalone PR across all of this feature.
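To make the cost difference concrete, here is a hypothetical sketch with a resource type that counts its copies. `counted_resource`, `store_by_copy` and `store_by_move` are illustrative names only; the copy loop mirrors the pattern quoted in the diff above.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical resource type that counts how often it is copied.
struct counted_resource
{
    static inline int copies = 0;
    counted_resource() = default;
    counted_resource(const counted_resource&) { ++copies; }
    counted_resource(counted_resource&&) noexcept {}
};

// Current pattern: one copy per element of the universe.
std::vector<counted_resource>
store_by_copy(const std::vector<counted_resource>& u)
{
    std::vector<counted_resource> resources;
    for (const auto& e : u)
        resources.push_back(e);
    return resources;
}

// Suggested alternative: take ownership of the caller's vector.
// Moving a vector transfers its buffer, so no element is copied.
std::vector<counted_resource>
store_by_move(std::vector<counted_resource>&& u)
{
    return std::move(u);
}
```

For resources with reference semantics such as `sycl::queue`, the per-element copy is cheap, which is part of why the change can reasonably wait for a separate PR.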

SergeyKopienko · 2025-11-14T09:37:36Z

include/oneapi/dpl/internal/dynamic_selection_impl/default_backend.h

+    }
+
+    auto
+    get_submission_group()


Should this function be declared as const?

This implementation means you copy a std::vector instance (with a memory allocation) just to wait.
That doesn't look quite correct...
Example:

s.get_submission_group().wait();

It only captures a reference to the resources, not a copy.
Also because it holds a non-const reference in the value returned, we shouldn't mark this const.
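A simplified sketch of that arrangement, using hypothetical `toy_queue`, `toy_submission_group` and `toy_backend` types rather than the actual default_backend code: the group stores a non-const reference to the backend's vector, so creating it allocates nothing.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical resource with a wait() that records it was called.
struct toy_queue
{
    bool waited = false;
    void wait() { waited = true; }
};

// Holds a reference to the backend's resources; no vector copy is made.
class toy_submission_group
{
  public:
    explicit toy_submission_group(std::vector<toy_queue>& r) : resources_(r) {}

    void wait()
    {
        for (auto& q : resources_)
            q.wait();
    }

  private:
    std::vector<toy_queue>& resources_;
};

class toy_backend
{
  public:
    explicit toy_backend(std::size_t n) : resources_(n) {}

    // Not const: in a const member function resources_ would be a const
    // vector, which cannot bind to the group's non-const reference.
    toy_submission_group get_submission_group()
    {
        return toy_submission_group{resources_};
    }

    bool all_waited() const
    {
        for (const auto& q : resources_)
            if (!q.waited)
                return false;
        return true;
    }

  private:
    std::vector<toy_queue> resources_;
};
```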


SergeyKopienko · 2025-11-14T09:40:28Z

include/oneapi/dpl/internal/dynamic_selection_impl/default_backend.h

+    using my_base = backend_base<ResourceType>;
+
+    template <typename... ReportReqs>
+    default_backend_impl(ReportReqs... reqs) : my_base(reqs...)


Should we use && and std::forward here?

I don't think so. These are known to be empty tag structures indicating reporting requirements with no member fields. Forwarding doesn't save anything and just adds complexity.
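The claim can be checked with type traits. The tag names below are hypothetical placeholders for reporting-requirement tags; for empty, trivially copyable types, passing by value copies no data, so perfect forwarding has nothing to save.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical reporting-requirement tags: empty structs with no data members.
struct toy_task_time_t
{
};
struct toy_task_submission_t
{
};

static_assert(std::is_empty_v<toy_task_time_t>);
static_assert(std::is_trivially_copyable_v<toy_task_time_t>);

// Base that only needs to know which requirements were requested.
struct toy_backend_base
{
    template <typename... ReportReqs>
    explicit toy_backend_base(ReportReqs...) : num_reqs(sizeof...(ReportReqs))
    {
    }

    std::size_t num_reqs;
};

// Derived backend passes the tags through by value, as in the diff above.
struct toy_backend : toy_backend_base
{
    template <typename... ReportReqs>
    explicit toy_backend(ReportReqs... reqs) : toy_backend_base(reqs...)
    {
    }
};
```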

SergeyKopienko · 2025-11-14T09:40:38Z

include/oneapi/dpl/internal/dynamic_selection_impl/default_backend.h

+
+  public:
+    template <typename... ReportReqs>
+    default_backend(ReportReqs... reqs) : base_t(reqs...)


SergeyKopienko · 2025-11-14T09:41:40Z

include/oneapi/dpl/internal/dynamic_selection_impl/dynamic_load_policy.h

-    using resource_type = typename backend_t::resource_type;
-    using wait_type = typename backend_t::wait_type;
+    using execution_resource_t = typename base_t::execution_resource_t;
+    using load_t = int;


Should we really have signed type here?

I wouldn't really object to this, but I also don't think it's really necessary.
We will never use that extra bit to represent load. If the load of something is ever greater than 2 billion, I'd be very surprised.
I don't think that using unsigned for values that cannot be negative is really good "documentation" either. It tends to be more likely to hide underflow bugs than fix them.
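A small sketch of the underflow concern, using hypothetical helper functions: if a bookkeeping bug decrements a load counter once too often, a signed counter becomes visibly negative, while an unsigned one wraps to a huge but valid-looking value.

```cpp
#include <cassert>
#include <limits>

// Signed counter: an extra decrement yields -1, which an assert or a
// simple `load < 0` check can catch.
int signed_load_after_extra_decrement()
{
    int load = 0;
    --load;
    return load;
}

// Unsigned counter: the same bug wraps around modulo 2^N, so the
// resource now appears to be extremely loaded and nothing looks invalid.
unsigned unsigned_load_after_extra_decrement()
{
    unsigned load = 0;
    --load;
    return load;
}
```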

SergeyKopienko · 2025-11-14T09:43:58Z

include/oneapi/dpl/internal/dynamic_selection_impl/dynamic_load_policy.h

-    get_submission_group()
+    dynamic_load_policy() { base_t::initialize(); }
+    dynamic_load_policy(deferred_initialization_t) {}
+    dynamic_load_policy(const std::vector<ResourceType>& u, ResourceAdapter adapter = {})


What about replacing const std::vector<...>& with std::vector<...>&& in the class constructors, etc.?
Here and in all other places like it.

Similar to my response above, I think this is something we could address in a standalone PR rather than expanding the scope of this PR way beyond customization concerns.

Signed-off-by: Dan Hoeflinger <[email protected]>

egfefey added 30 commits March 14, 2025 00:14

Deactivating policies beside RR for custom backend testing

34d5f88

default base + RR updates for default sycl backend

c615e4a

adding missed updated sycl_backend

e8c0fb3

updating fixed resource policy and test to use default sycl backend

ce7b687

updating dynamic load policy to use default sycl backend

d5cce9f

updating the auto_tune policy to use the default sycl backend

5f839ea

split submit into instument_before, function invocation and instrumen…

cc2f8b2

…t after, also removed submit from the sycl_backend. Now uses the base implementation

updating the backend traits with a scratch space for selection

6ec6be1

warning fix from previous push

1a0c620

adding missing condition for scratch space use

0411b61

initial policy base plus round robin inheriting

a2b7175

initial policy customization - policy base plus inherited round robin

a928d16

fixed resource policy using the policy_base

6e517e7

adding exception in select

9a65385

Deactivating policies beside RR for custom backend testing

22513bb

default base + RR updates for default sycl backend

fb965aa

adding missed updated sycl_backend

d442372

updating fixed resource policy and test to use default sycl backend

8a8e8ff

updating dynamic load policy to use default sycl backend

d547a3e

updating the auto_tune policy to use the default sycl backend

f2573eb

split submit into instument_before, function invocation and instrumen…

0d26cbe

…t after, also removed submit from the sycl_backend. Now uses the base implementation

updating the backend traits with a scratch space for selection

ef6e6e6

warning fix from previous push

9a62576

adding missing condition for scratch space use

b77e08a

initial policy base plus round robin inheriting

1c41842

initial policy customization - policy base plus inherited round robin

ad698d5

fixed resource policy using the policy_base

6459df0

adding exception in select

3ab72c9

Rebasing...

92a9ec6

updates to enable non-sycl samples plus a traits check for wait

0d66c06

danhoeflinger and others added 2 commits November 13, 2025 10:47

formatting

99799c2

Signed-off-by: Dan Hoeflinger <[email protected]>

adding CTAD for dynamic_load_policy

2247cec

vossmjp reviewed Nov 13, 2025


include/oneapi/dpl/internal/dynamic_selection_impl/policy_base.h

danhoeflinger added 5 commits November 13, 2025 16:08

removing thread

e08ca68

Signed-off-by: Dan Hoeflinger <[email protected]>

remove CRTP from backend_base

e330649

Signed-off-by: Dan Hoeflinger <[email protected]>

protect impl functions and make base friend

daffa53

Signed-off-by: Dan Hoeflinger <[email protected]>

formatting

41536f2

Signed-off-by: Dan Hoeflinger <[email protected]>

adding adapter tests for auto_tune_policy

ca59727

Signed-off-by: Dan Hoeflinger <[email protected]>