Skip to content

Device Cluster Reconstruction, main branch (2025.06.19.) #1028

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

krasznaa
Copy link
Member

Following up from #1023, this is how I believe the clusterization algorithm should provide its results. And I'm very open to a discussion on this.

While I generally liked the idea of having different functions for reconstructing just the measurements or the measurements plus the clusters, I think in the end that just makes the code more complicated than it needs to be. I propose that we just have a single interface for the device clusterization algorithms. One which returns an optional cluster collection buffer. Behind the scenes this was the implementation in cuda::clusterization_algorithm already. And I think this is a functional interface towards the users as well. 🤔

After updating the CUDA algorithm, I added the functionality of reconstructing cell clusters to the Alpaka and SYCL algorithms as well.

Unfortunately however the Alpaka algorithm, on an NVIDIA backend, is not behaving well at the moment. 😭 The unit test reports that a random number of the tests fail to return the correct (number of) clusters from the algorithm. And so far I didn't manage to find yet what is causing this issue. 😦

Still, even with that open issue, I wanted to let people see my proposal already. As there may be a discussion about the user API...

As before, pinging @CrossR and @StewMH for their opinions.

P.S. Almost forgot... I also modified traccc::algorithm to allow fundamental arguments to be passed by value to an algorithm. As I really wanted to be able to pass a bool to the clusterization algorithms...

krasznaa added 5 commits June 19, 2025 14:58
So that for instance a simple boolean flag could be an algorithm
argument.
And then updated traccc::cuda::clusterization_algorithm to make use of it,
simplifying the algorithm's interface a great deal.
Following the code structure used by cuda::clusterization_algorithm. Also
updated all client code to the new API.
Following the code structure used by cuda::clusterization_algorithm. Also
updated all client code to the new API.
@krasznaa krasznaa added feature New feature or request cuda Changes related to CUDA sycl Changes related to SYCL alpaka Changes related to Alpaka labels Jun 19, 2025
Copy link
Member

@stephenswat stephenswat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't really bother reading past the Alpaka code, but could you elaborate why exactly you think this makes the code "simpler"? The net PR delta is almost +100 lines and this change makes the API for users more obtuse to use, who in the existing implementation know exactly that if they pass the right tag, they get the desired result. Returning an optional value leaves open the semantic to return an non-extant value even if the user requests the cluster data to be delivered.

In my eyes, this just makes the code less understandable for no good reason, so I'd like to understand what the benefit of this approach is.

vecmem::data::vector_buffer<device::details::index_t> m_f_backup,
m_gf_backup;
vecmem::data::vector_buffer<unsigned char> m_adjc_backup;
vecmem::data::vector_buffer<device::details::index_t> m_adjv_backup;
vecmem::unique_alloc_ptr<unsigned int> m_backup_mutex;
mutable std::once_flag m_setup_once;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unrelated change?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A little bit, yes. I also made the algorithm receive a queue from the outside.

Comment on lines 128 to 137
return {
.measurements = {},
.clusters =
(reconstruct_clusters
? std::optional<
edm::silicon_cluster_collection::
buffer>{edm::silicon_cluster_collection::
buffer{}}
: std::optional<edm::silicon_cluster_collection::buffer>{
std::nullopt})};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's got to be a way to make this more readable. 😟

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. This is pretty terrible. Unfortunately ternary operators don't seem to mix too well with std::optional. 😦 But yes, probably something better could still be done...

Ideally I would've liked to write:

Suggested change
return {
.measurements = {},
.clusters =
(reconstruct_clusters
? std::optional<
edm::silicon_cluster_collection::
buffer>{edm::silicon_cluster_collection::
buffer{}}
: std::optional<edm::silicon_cluster_collection::buffer>{
std::nullopt})};
return {
.measurements = {},
.clusters = (reconstruct_clusters ? {} : std::nullopt)};

But of course that didn't work... 😦

Copy link
Member Author

@krasznaa krasznaa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The +100 lines comes mainly from the fact that I'm teaching the Alpaka and SYCL algorithms how to do a new thing. (Plus I'm adding code to the unit tests to check the new behaviour.) The changeset on the CUDA code should be a net negative if you check.

I'm not convinced about the interface either. But putting 4 functions into all 3 clusterization algorithms seems to be an overkill for me.

I'm mainly thinking that client code will not just want to either reconstruct clusters or not reconstruct clusters. I think in the Athena code we will want to turn cluster reconstruction on or off through configuration options in the end. And having to write separate code paths in that code to achieve this, does not seem great. (I don't see us writing generic code for this through some clever templating there.)

vecmem::data::vector_buffer<device::details::index_t> m_f_backup,
m_gf_backup;
vecmem::data::vector_buffer<unsigned char> m_adjc_backup;
vecmem::data::vector_buffer<device::details::index_t> m_adjv_backup;
vecmem::unique_alloc_ptr<unsigned int> m_backup_mutex;
mutable std::once_flag m_setup_once;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A little bit, yes. I also made the algorithm receive a queue from the outside.

Comment on lines 128 to 137
return {
.measurements = {},
.clusters =
(reconstruct_clusters
? std::optional<
edm::silicon_cluster_collection::
buffer>{edm::silicon_cluster_collection::
buffer{}}
: std::optional<edm::silicon_cluster_collection::buffer>{
std::nullopt})};
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. This is pretty terrible. Unfortunately ternary operators don't seem to mix too well with std::optional. 😦 But yes, probably something better could still be done...

Ideally I would've liked to write:

Suggested change
return {
.measurements = {},
.clusters =
(reconstruct_clusters
? std::optional<
edm::silicon_cluster_collection::
buffer>{edm::silicon_cluster_collection::
buffer{}}
: std::optional<edm::silicon_cluster_collection::buffer>{
std::nullopt})};
return {
.measurements = {},
.clusters = (reconstruct_clusters ? {} : std::nullopt)};

But of course that didn't work... 😦

Made the early returns from clusterization easier to read.
Hid the "device objects" in the CCL tests inside of the lambda
once more.
Tweaked the Alpaka code a bit, though still didn't manage to make
the NVIDIA backed jobs work correctly.
Copy link

@krasznaa
Copy link
Member Author

Please have another look. These changes are needed relatively urgently. In one way or another.

@stephenswat
Copy link
Member

These changes are needed relatively urgently.

Why? The API currently in the repository is perfectly functional already... I don't see the urgency in changing it.

@krasznaa
Copy link
Member Author

There is urgency in providing this for Alpaka and SYCL as well. As those will be needed for our tests.

I'll walk back on the interface change a little then. As implementing the feature in all backends is fairly urgent.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
alpaka Changes related to Alpaka cuda Changes related to CUDA feature New feature or request high priority sycl Changes related to SYCL
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants