-
Notifications
You must be signed in to change notification settings - Fork 53
Device Cluster Reconstruction, main branch (2025.06.19.) #1028
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Device Cluster Reconstruction, main branch (2025.06.19.) #1028
Conversation
So that for instance a simple boolean flag could be an algorithm argument.
And then updated traccc::cuda::clusterization_algorithm to make use of it, simplifying the algorithm's interface a great deal.
Following the code structure used by cuda::clusterization_algorithm. Also updated all client code to the new API.
Following the code structure used by cuda::clusterization_algorithm. Also updated all client code to the new API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't really bother reading past the Alpaka code, but could you elaborate why exactly you think this makes the code "simpler"? The net PR delta is almost +100 lines and this change makes the API for users more obtuse to use, who in the existing implementation know exactly that if they pass the right tag, they get the desired result. Returning an optional value leaves open the semantic to return an non-extant value even if the user requests the cluster data to be delivered.
In my eyes, this just makes the code less understandable for no good reason, so I'd like to understand what the benefit of this approach is.
vecmem::data::vector_buffer<device::details::index_t> m_f_backup, | ||
m_gf_backup; | ||
vecmem::data::vector_buffer<unsigned char> m_adjc_backup; | ||
vecmem::data::vector_buffer<device::details::index_t> m_adjv_backup; | ||
vecmem::unique_alloc_ptr<unsigned int> m_backup_mutex; | ||
mutable std::once_flag m_setup_once; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A little bit, yes. I also made the algorithm receive a queue from the outside.
return { | ||
.measurements = {}, | ||
.clusters = | ||
(reconstruct_clusters | ||
? std::optional< | ||
edm::silicon_cluster_collection:: | ||
buffer>{edm::silicon_cluster_collection:: | ||
buffer{}} | ||
: std::optional<edm::silicon_cluster_collection::buffer>{ | ||
std::nullopt})}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's got to be a way to make this more readable. 😟
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. This is pretty terrible. Unfortunately ternary operators don't seem to mix too well with std::optional
. 😦 But yes, probably something better could still be done...
Ideally I would've liked to write:
return { | |
.measurements = {}, | |
.clusters = | |
(reconstruct_clusters | |
? std::optional< | |
edm::silicon_cluster_collection:: | |
buffer>{edm::silicon_cluster_collection:: | |
buffer{}} | |
: std::optional<edm::silicon_cluster_collection::buffer>{ | |
std::nullopt})}; | |
return { | |
.measurements = {}, | |
.clusters = (reconstruct_clusters ? {} : std::nullopt)}; |
But of course that didn't work... 😦
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The +100 lines comes mainly from the fact that I'm teaching the Alpaka and SYCL algorithms how to do a new thing. (Plus I'm adding code to the unit tests to check the new behaviour.) The changeset on the CUDA code should be a net negative if you check.
I'm not convinced about the interface either. But putting 4 functions into all 3 clusterization algorithms seems to be an overkill for me.
I'm mainly thinking that client code will not just want to either reconstruct clusters or not reconstruct clusters. I think in the Athena code we will want to turn cluster reconstruction on or off through configuration options in the end. And having to write separate code paths in that code to achieve this, does not seem great. (I don't see us writing generic code for this through some clever templating there.)
vecmem::data::vector_buffer<device::details::index_t> m_f_backup, | ||
m_gf_backup; | ||
vecmem::data::vector_buffer<unsigned char> m_adjc_backup; | ||
vecmem::data::vector_buffer<device::details::index_t> m_adjv_backup; | ||
vecmem::unique_alloc_ptr<unsigned int> m_backup_mutex; | ||
mutable std::once_flag m_setup_once; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A little bit, yes. I also made the algorithm receive a queue from the outside.
return { | ||
.measurements = {}, | ||
.clusters = | ||
(reconstruct_clusters | ||
? std::optional< | ||
edm::silicon_cluster_collection:: | ||
buffer>{edm::silicon_cluster_collection:: | ||
buffer{}} | ||
: std::optional<edm::silicon_cluster_collection::buffer>{ | ||
std::nullopt})}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree. This is pretty terrible. Unfortunately ternary operators don't seem to mix too well with std::optional
. 😦 But yes, probably something better could still be done...
Ideally I would've liked to write:
return { | |
.measurements = {}, | |
.clusters = | |
(reconstruct_clusters | |
? std::optional< | |
edm::silicon_cluster_collection:: | |
buffer>{edm::silicon_cluster_collection:: | |
buffer{}} | |
: std::optional<edm::silicon_cluster_collection::buffer>{ | |
std::nullopt})}; | |
return { | |
.measurements = {}, | |
.clusters = (reconstruct_clusters ? {} : std::nullopt)}; |
But of course that didn't work... 😦
Made the early returns from clusterization easier to read. Hid the "device objects" in the CCL tests inside of the lambda once more. Tweaked the Alpaka code a bit, though still didn't manage to make the NVIDIA backed jobs work correctly.
|
Please have another look. These changes are needed relatively urgently. In one way or another. |
Why? The API currently in the repository is perfectly functional already... I don't see the urgency in changing it. |
There is urgency in providing this for Alpaka and SYCL as well. As those will be needed for our tests. I'll walk back on the interface change a little then. As implementing the feature in all backends is fairly urgent. |
Following up from #1023, this is how I believe the clusterization algorithm should provide its results. And I'm very open to a discussion on this.
While I generally liked the idea of having different functions for reconstructing just the measurements or the measurements plus the clusters, I think in the end that just makes the code more complicated than it needs to be. I propose that we just have a single interface for the device clusterization algorithms. One which returns an optional cluster collection buffer. Behind the scenes this was the implementation in
cuda::clusterization_algorithm
already. And I think this is a functional interface towards the users as well. 🤔After updating the CUDA algorithm, I added the functionality of reconstructing cell clusters to the Alpaka and SYCL algorithms as well.
Unfortunately however the Alpaka algorithm, on an NVIDIA backend, is not behaving well at the moment. 😭 The unit test reports that a random number of the tests fail to return the correct (number of) clusters from the algorithm. And so far I didn't manage to find yet what is causing this issue. 😦
Still, even with that open issue, I wanted to let people see my proposal already. As there may be a discussion about the user API...
As before, pinging @CrossR and @StewMH for their opinions.
P.S. Almost forgot... I also modified
traccc::algorithm
to allow fundamental arguments to be passed by value to an algorithm. As I really wanted to be able to pass abool
to the clusterization algorithms...