Use detray geometry on GPUs #23
Replies: 4 comments 22 replies
-
Thanks for the message Andi! Let me have a look at the code today, and give you feedback once I have a "fuller" picture. 😄 At the same time let me include @stephenswat in this. Could you give him access to the code as well? I very much want to involve him in these developments. 😉
-
So... I thought the fastest way to describe how I believe the code will need to be structured would be to throw an example together. You can find it here: https://github.com/krasznaa/detray_data_model The main features are the following:
Note that the code in that repository is not representative of what the code providing us with the "ultimate performance" should look like. I took some shortcuts with the "managed memory" feature of CUDA, to not have to write too much code at first. But based on previous experience, this design should work well for us for the setup with manual memory management as well. (That is, when we control exactly when memory gets copied where, which on the whole is more performant than letting CUDA figure this out using page faults at runtime.) Cheers, P.S. Of course the allocator should also not just use CUDA memory allocations directly, but go through an intermediate layer, like what we're looking at with @stephenswat. (https://nvlabs.github.io/cub/)
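To illustrate the "intermediate layer" idea from the P.S., here is a minimal host-only sketch using the standard C++17 `std::pmr` interface. It is not the code from the linked repository: `counting_resource` is a hypothetical name, and the upstream resource here is ordinary host memory, whereas in a real setup it could wrap CUDA allocations.

```cpp
#include <cassert>
#include <cstddef>
#include <memory_resource>
#include <vector>

// Hypothetical intermediate layer: forwards every request to an upstream
// resource and counts the bytes requested. The upstream is plain host
// memory here; a CUDA-backed resource is an assumption, not shown.
class counting_resource : public std::pmr::memory_resource {
public:
    explicit counting_resource(std::pmr::memory_resource* upstream =
                                   std::pmr::new_delete_resource())
        : m_upstream(upstream) {}

    std::size_t bytes_allocated() const { return m_bytes; }

private:
    void* do_allocate(std::size_t bytes, std::size_t alignment) override {
        m_bytes += bytes;
        return m_upstream->allocate(bytes, alignment);
    }
    void do_deallocate(void* p, std::size_t bytes,
                       std::size_t alignment) override {
        m_upstream->deallocate(p, bytes, alignment);
    }
    bool do_is_equal(
        const std::pmr::memory_resource& other) const noexcept override {
        return this == &other;
    }

    std::pmr::memory_resource* m_upstream;
    std::size_t m_bytes = 0;
};
```

A container would then be created as `std::pmr::vector<float> v(&mr);` with `mr` a `counting_resource`, so the container code never calls the underlying allocation functions directly.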
-
@asalzburger, I really came to like my setup for specifying the vector type that a template should use. I really think that we should abandon the current setup where the vector types are specified through a global typedef. I don't think anybody would disagree with this; I was just wondering who should undertake it. If you guys want to give this a go, I'm very happy to let you. Otherwise I'll take a crack at it myself. I think by now I have a good enough idea of how I'd want to update the
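As a rough sketch of the difference, the vector type can be passed in as a template parameter instead of being fixed by a global typedef. The names `array_plugin`, `double_array_plugin` and `surface` are hypothetical and only serve the illustration; they are not taken from detray or from the example repository.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a math plugin (illustrative names only).
struct array_plugin {
    using scalar = float;
    using vector3 = std::array<scalar, 3>;
};

// A second plugin with a different precision, to show that nothing else
// has to change when the vector type is swapped.
struct double_array_plugin {
    using scalar = double;
    using vector3 = std::array<scalar, 3>;
};

// The vector type is chosen per instantiation, not by a global typedef.
template <typename plugin_t>
struct surface {
    using scalar = typename plugin_t::scalar;
    using vector3 = typename plugin_t::vector3;

    vector3 center;
    vector3 normal;

    // Signed distance of a point from the plane of the surface.
    scalar distance(const vector3& p) const {
        scalar d = 0;
        for (std::size_t i = 0; i < 3; ++i) {
            d += (p[i] - center[i]) * normal[i];
        }
        return d;
    }
};
```

With this layout `surface<array_plugin>` and `surface<double_array_plugin>` can coexist in one translation unit, which a single global typedef does not allow.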
-
As a separate question: Other than for debugging, do we need the As you may guess, If those are particularly useful in the host code, I'll propose a way for only using/instantiating them on the host. But if they are not really needed, we may as well just remove them.
-
The `detector` class implements a geometry without any polymorphism: the different objects are grouped into `tuples` and accessed via a `typed index`, i.e. one index for the type of object and one index for the position within the `tuple` column.

However, there is still the usage of `std::containers` for structuring the geometry; all of them can be found in the `containers.hpp` definition file.

What would we need to change to be able to use this geometry on the GPU? `cuda_geometry` with different container classes?

There is a test that creates a simple pixel geometry, and we will make reading in the ITk geometry and/or the TrackML/OpenDataDetector geometry available.
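To make the typed-index idea concrete, here is a minimal sketch under stated assumptions: `rectangle`, `disc`, `typed_index` and `mask_store` are hypothetical names and do not reproduce the actual `detector` class. The container is a template template parameter, which is one possible way a different container class could be substituted for a GPU build.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical object types; the names are illustrative only.
struct rectangle { float half_x, half_y; };
struct disc { float radius; };

// "Typed index": which tuple column, and which position within it.
struct typed_index {
    std::size_t type;
    std::size_t position;
};

// The container is a template template parameter, so another vector-like
// class could replace std::vector without touching the rest.
template <template <typename...> class vector_t = std::vector>
struct mask_store {
    std::tuple<vector_t<rectangle>, vector_t<disc>> columns;

    // Append an object to the column selected at compile time.
    template <std::size_t type_id, typename object_t>
    typed_index add(const object_t& obj) {
        auto& column = std::get<type_id>(columns);
        column.push_back(obj);
        return {type_id, column.size() - 1};
    }

    // Compile-time dispatch on the type id; no virtual calls involved.
    template <std::size_t type_id>
    const auto& get(const typed_index& idx) const {
        return std::get<type_id>(columns)[idx.position];
    }
};
```

For example, `store.add<1>(disc{5.f})` on an empty `mask_store<>` returns the index `{1, 0}`, and `store.get<1>(idx).radius` reads the object back without any runtime type lookup.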
@krasznaa @niermann999 @XiaocongAi - what are your thoughts on that?