-
Notifications
You must be signed in to change notification settings - Fork 28
Use std::vector
instead of std::map
for reading or writing an arbitrary number of collections.
#201
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
31b82ff
to
fec79be
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a quick look. There are still a few occurrences of 'map' where 'vector' should be used
Functions detail::readMapInputs
and detail::putMapOutputs
could also be renamed since they operate on vectors
I think const *
are acceptable
In case the information about collection keys will be needed:
In Gaudi there isinputLocation()
and outputLocation()
(defined in DataHandleMixing
) that can be used to get keys of inputs and outputs (like tuple either by index or type if unique). From what I seen they are not accessible from k4FWCore functional algorithms. It might be good idea to add something similar here.
Alternatively something like std::get<n>(m_outputs).at(m).fullKey().key()
could be used to get key of n
-th output and m
-th collection (given n-th output is a vector of collections)
0b75488
to
2d17f31
Compare
There is also |
I added a |
Any more comments or comments about the functions to get the locations? @m-fila |
Sorry I was busy with other things, I'll take another look |
} | ||
for (auto& [key, val] : input) { | ||
for (int i = 0; i < 3; i++) { | ||
if (inputLocations(0)[i] != "MCParticles" + std::to_string(i)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no easy way to have inputLocations
take the key from the KeyValues
above, right? I.e. such that something like this would becom possible
if (inputLocations("InputCollection")[i] == "MCParticles" + std::to_string(i)) { /**/ }
It doesn't really matter here, but if there are multiple input variable length inputs, this would make it much harder to break things, I think. However, I am not sure how often that actually is the case in real world uses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is:
- either by property (returning
std::vector<std::string>>
)auto inputLocations(const std::string& name) const { Gaudi::Property<std::vector<std::string>> tmp; tmp.assign(this->getProperty(name)); return tmp.value(); }
- or from
m_inputlocations
(returing view)
auto intpuLocations(const std::string& name) const {
auto it = std::ranges::find_if(m_inputLocations, [&name](const auto& prop) { return prop.name() == name; });
if (it == m_inputLocations.end()) {
throw GaudiException("Input named " + name + " not found", "Consumer", StatusCode::FAILURE);
}
return *it | std::views::transform([](const DataObjID& id) -> const auto& { return id.key(); });
}
It could be also nice to get by type (if it's unique, there is a tuple underneath) but that seems to be more complicated as detail::transformType
transforms all the collection types to podio::CollectionBase
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added functions to get by name. Getting by type is not trivial since one still has to get the index to find where the vector with the names is... so maybe if someone asks for it? By index and name should be plenty of options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that the use case for getting this by type is limited(?). i would assume that in that case you don't really care about the collection name since it's only one in any case. The only reason I could think of would be for log messages, but in that case, one of the overloads here with index or name, should also work, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree, it's complicated to implement and there are already ways to achieve the same result so I wouldn't bother with it now. I mentioned it mostly for completeness since they have it in Gaudi.
The only reason I could think of would be for log messages, but in that case, one of the overloads here with index or name, should also work, right?
I could think of something like this: when there multiple arguments like here or more
using FloatColl = std::vector<const podio::UserDataCollection<float>*>;
using ParticleColl = std::vector<const edm4hep::MCParticleCollection*>;
using SimTrackerHitColl = std::vector<const edm4hep::SimTrackerHitCollection*>;
using TrackerHitColl = std::vector<const edm4hep::TrackerHit3DCollection*>;
using TrackColl = std::vector<const edm4hep::TrackCollection*>;
using retType = std::tuple<std::vector<podio::UserDataCollection<float>>, std::vector<edm4hep::MCParticleCollection>,
std::vector<edm4hep::MCParticleCollection>, std::vector<edm4hep::SimTrackerHitCollection>,
std::vector<edm4hep::TrackerHit3DCollection>, std::vector<edm4hep::TrackCollection>>;
retType operator()(const FloatColl& floatVec, const ParticleColl& particlesVec,
const SimTrackerHitColl& simTrackerHitVec, const TrackerHitColl& trackerHitVec,
const TrackColl& trackVec) const
I think it'd be nice to avoid counting the arguments to write outputLocations(4)
outputLocations(5)
or scrolling to the constructor to find the name in KeyValues
, and instead just write outputLocations<std::vector<edm4hep::TrackCollection>>()
when I want to get the name for the locations of that "vector of track-collection" output
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong opinion about this, it can be useful. But so far I think we'll have few algorithms with any number of inputs or outputs so the reach will be small.
arbitrary number of collections. The current implementation with std::map is not easy to use. The collections get ordered alphabetically in std::map so the order set in the steering file is lost, which means if there is some kind of mapping (like one output collection for each input collection) then one has to reproduce this map in the algorithm to make sure it's correct.
I'll merge this later today, #209 has to be rebased on top of this. Note to myself: fix formatting, |
The current implementation with std::map is not easy to use. The collections get ordered alphabetically in std::map so the order set in the steering file is lost, which means if there is some kind of mapping (like one output collection for each input collection) then one has to reproduce this map in the algorithm and the way I found to do it is to have another property with the input and output names again. For output one creates the collections and then puts them in a vector, I think it can't get simpler than that. For input, since there can't be vectors to references one gets a pointer to the collection, which is not ideal but not so bad. With a custom type it would be possible to have a vector that when using [] can get a reference but then that introduces a custom type which I don't like.
BEGINRELEASENOTES
inputLocations
andoutputLocations
methods to be able to retrieve the locations set in the steering file.ENDRELEASENOTES