switch to const ref for CRTP then allow lambda to copy capture #509
PR Summary
@Yannicked reported in issue #508 that the copies in the base class were impacting the performance of the vector API. One copy is necessary (on GPUs) because the EOS object itself, rather than a pointer to it, must be captured by value. This requires either pulling out the object or capturing `*this` with, e.g., `KOKKOS_CLASS_LAMBDA`.
However, we are currently performing two copies: one when we use `CRTP` to cast the object to the child type, and one when we capture. It turns out the first copy can be eliminated. We can simply create the CRTP object by reference, because `[=]` in the lambda copy-initializes the captured object, so the underlying object is captured by value even if the name in scope is a reference. See here for an example.
This PR eliminates that extraneous copy.
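The capture behavior described in the summary can be checked with a small standalone sketch. Everything in it is hypothetical and for illustration only: `EosBaseSketch`, `IdealGasSketch`, `MakeKernel`, and the global copy counter are not names from this repository, and a plain lambda stands in for a Kokkos one. It assumes C++17, so that returning the closure does not add a copy of its own.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; not the classes touched by this PR.
static int g_copy_count = 0;  // incremented by every IdealGasSketch copy

template <typename CRTP>
class EosBaseSketch {
 public:
  auto MakeKernel() const {
    // Bind the derived object by const reference: no copy is made here.
    const CRTP &eos = *static_cast<const CRTP *>(this);
    // [=] captures the object that `eos` refers to by value: the one copy.
    return [=](double x) { return eos.Scale(x); };
  }
};

class IdealGasSketch : public EosBaseSketch<IdealGasSketch> {
 public:
  explicit IdealGasSketch(double factor) : factor_(factor) {}
  // User-provided copy constructor so that copies can be counted.
  IdealGasSketch(const IdealGasSketch &other)
      : EosBaseSketch<IdealGasSketch>(other), factor_(other.factor_) {
    ++g_copy_count;
  }
  double Scale(double x) const { return factor_ * x; }
  void SetFactor(double factor) { factor_ = factor; }

 private:
  double factor_;
};

// Returns the number of copies made while building one kernel.
int CopiesForOneKernel() {
  IdealGasSketch eos(1.5);
  g_copy_count = 0;
  auto kernel = eos.MakeKernel();
  (void)kernel;  // silence unused-variable warnings
  return g_copy_count;
}
```

In this sketch, changing `const CRTP &eos` to `CRTP eos` would bring back the second copy, and modifying the original object after `MakeKernel()` returns does not affect the kernel, since the closure holds its own copy.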
PR Checklist
Run the `make format` command after configuring with `cmake`.
If preparing for a new release, in addition please check the following:
Any `when='@main'` dependencies are updated to the release version in the `package.py`