Use std::vector instead of std::map for reading or writing an arbitrary number of collections. #201
m-fila
left a comment
I had a quick look. There are still a few occurrences of 'map' where 'vector' should be used
Functions detail::readMapInputs and detail::putMapOutputs could also be renamed since they operate on vectors
I think const * are acceptable
In case the information about collection keys will be needed:
In Gaudi there are inputLocation() and outputLocation() (defined in DataHandleMixin) that can be used to get the keys of inputs and outputs (like a tuple, either by index or by type if unique). From what I have seen, they are not accessible from k4FWCore functional algorithms. It might be a good idea to add something similar here.
Alternatively, something like std::get<n>(m_outputs).at(m).fullKey().key() could be used to get the key of the m-th collection of the n-th output (given that the n-th output is a vector of collections).
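The std::get<n>(...).at(m) idea above can be sketched without Gaudi. This is only an illustration under stated assumptions: MockHandle, MockOutputs, outputKey and outputKeys are hypothetical names, and the real handles expose the key through fullKey().key() rather than a plain member.

```cpp
#include <cstddef>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical stand-in for a Gaudi data handle; only the key is modelled.
struct MockHandle {
  std::string key;
};

// Hypothetical stand-in for the algorithm's m_outputs member: one vector of
// handles per output slot, because each slot may hold several collections.
using MockOutputs = std::tuple<std::vector<MockHandle>, std::vector<MockHandle>>;

// Key of the m-th collection in the N-th output slot.
// std::vector::at throws std::out_of_range if m is too large.
template <std::size_t N>
const std::string& outputKey(const MockOutputs& outputs, std::size_t m) {
  return std::get<N>(outputs).at(m).key;
}

// All keys of the N-th output slot, in the order they were stored.
template <std::size_t N>
std::vector<std::string> outputKeys(const MockOutputs& outputs) {
  std::vector<std::string> keys;
  for (const auto& handle : std::get<N>(outputs)) {
    keys.push_back(handle.key);
  }
  return keys;
}
```

The slot index has to be a compile-time constant because std::get is a template, while the position inside the slot can be a runtime value.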
There is also
I added a
Any more comments or comments about the functions to get the locations? @m-fila
Sorry, I was busy with other things. I'll take another look.
}
for (auto& [key, val] : input) {
for (int i = 0; i < 3; i++) {
if (inputLocations(0)[i] != "MCParticles" + std::to_string(i)) {
There is no easy way to have inputLocations take the key from the KeyValues above, right? I.e. such that something like this would become possible:
if (inputLocations("InputCollection")[i] == "MCParticles" + std::to_string(i)) { /**/ }
It doesn't really matter here, but if there are multiple variable-length inputs, this would make it much harder to break things, I think. However, I am not sure how often that is actually the case in real-world uses.
There is:
- either by property (returning std::vector<std::string>):
auto inputLocations(const std::string& name) const {
  Gaudi::Property<std::vector<std::string>> tmp;
  tmp.assign(this->getProperty(name));
  return tmp.value();
}
- or from m_inputLocations (returning a view):
auto inputLocations(const std::string& name) const {
  auto it = std::ranges::find_if(m_inputLocations, [&name](const auto& prop) { return prop.name() == name; });
  if (it == m_inputLocations.end()) {
    throw GaudiException("Input named " + name + " not found", "Consumer", StatusCode::FAILURE);
  }
  return *it | std::views::transform([](const DataObjID& id) -> const auto& { return id.key(); });
}
It would also be nice to get them by type (if it's unique; there is a tuple underneath), but that seems more complicated because detail::transformType transforms all the collection types to podio::CollectionBase.
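The lookup-by-name pattern in the second option can be tried outside Gaudi. The sketch below is a simplified illustration, not the k4FWCore implementation: MockLocationsProperty and locationsByName are hypothetical names, std::runtime_error stands in for GaudiException, and it returns a reference to the stored vector instead of a transformed view.

```cpp
#include <algorithm>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a Gaudi property that holds a list of locations.
struct MockLocationsProperty {
  std::string name;                    // property name, e.g. "InputCollection"
  std::vector<std::string> locations;  // collection names from the steering file
};

// Return the locations of the property called `name`, or throw if there is
// no such property. std::runtime_error stands in for GaudiException here.
const std::vector<std::string>& locationsByName(const std::vector<MockLocationsProperty>& props,
                                                const std::string& name) {
  const auto it = std::find_if(props.begin(), props.end(),
                               [&name](const MockLocationsProperty& prop) { return prop.name == name; });
  if (it == props.end()) {
    throw std::runtime_error("Input named " + name + " not found");
  }
  return it->locations;
}
```

With a lookup like this, a check such as locationsByName(props, "InputCollection")[i] no longer depends on the position of the input in the argument list.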
Added functions to get by name. Getting by type is not trivial since one still has to get the index to find where the vector with the names is... so maybe if someone asks for it? By index and name should be plenty of options.
I think that the use case for getting this by type is limited(?). I would assume that in that case you don't really care about the collection name, since there is only one in any case. The only reason I could think of would be for log messages, but in that case, one of the overloads here with index or name should also work, right?
Yes, I agree, it's complicated to implement and there are already ways to achieve the same result so I wouldn't bother with it now. I mentioned it mostly for completeness since they have it in Gaudi.
> The only reason I could think of would be for log messages, but in that case, one of the overloads here with index or name, should also work, right?
I could think of something like this, when there are multiple arguments like here or more:
using FloatColl = std::vector<const podio::UserDataCollection<float>*>;
using ParticleColl = std::vector<const edm4hep::MCParticleCollection*>;
using SimTrackerHitColl = std::vector<const edm4hep::SimTrackerHitCollection*>;
using TrackerHitColl = std::vector<const edm4hep::TrackerHit3DCollection*>;
using TrackColl = std::vector<const edm4hep::TrackCollection*>;
using retType = std::tuple<std::vector<podio::UserDataCollection<float>>, std::vector<edm4hep::MCParticleCollection>,
std::vector<edm4hep::MCParticleCollection>, std::vector<edm4hep::SimTrackerHitCollection>,
std::vector<edm4hep::TrackerHit3DCollection>, std::vector<edm4hep::TrackCollection>>;
retType operator()(const FloatColl& floatVec, const ParticleColl& particlesVec,
const SimTrackerHitColl& simTrackerHitVec, const TrackerHitColl& trackerHitVec,
                    const TrackColl& trackVec) const
I think it'd be nice to avoid counting the arguments to write outputLocations(5), or scrolling to the constructor to find the name in KeyValues, and instead just write outputLocations<std::vector<edm4hep::TrackCollection>>() when I want to get the locations of that "vector of track collections" output.
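A by-type lookup along these lines could be sketched as follows. This is only an illustration of the idea, not something k4FWCore provides: MockLocations, countOf, indexOf and the Mock collection types are hypothetical, and the sketch sidesteps the detail::transformType issue mentioned earlier by keeping the original output types as template parameters.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <tuple>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical collection types standing in for the EDM4hep ones.
struct MockParticleCollection {};
struct MockTrackCollection {};

// Number of times T appears in Ts...
template <typename T, typename... Ts>
inline constexpr std::size_t countOf = (std::size_t{0} + ... + (std::is_same_v<T, Ts> ? 1 : 0));

// Position of the first occurrence of T in Ts... (sizeof...(Ts) if absent).
template <typename T, typename... Ts>
constexpr std::size_t indexOf() {
  constexpr std::array<bool, sizeof...(Ts)> matches{std::is_same_v<T, Ts>...};
  for (std::size_t i = 0; i < sizeof...(Ts); ++i) {
    if (matches[i]) return i;
  }
  return sizeof...(Ts);
}

// Locations stored per output slot, in the same order as the output tuple.
template <typename OutputTuple>
class MockLocations;

template <typename... Outs>
class MockLocations<std::tuple<Outs...>> {
public:
  explicit MockLocations(std::array<std::vector<std::string>, sizeof...(Outs)> locations)
      : m_locations(std::move(locations)) {}

  // Lookup by position, as with an index argument.
  const std::vector<std::string>& byIndex(std::size_t i) const { return m_locations.at(i); }

  // Lookup by output type; refuses to compile if the type is missing or repeated.
  template <typename T>
  const std::vector<std::string>& byType() const {
    static_assert((countOf<T, Outs...>) == 1, "output type must appear exactly once");
    return m_locations[indexOf<T, Outs...>()];
  }

private:
  std::array<std::vector<std::string>, sizeof...(Outs)> m_locations;
};
```

The static_assert matters for the signature quoted above: the particle-collection vector appears twice in retType, so only the other outputs could be requested by type.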
I don't have a strong opinion about this, it can be useful. But so far I think we'll have few algorithms with any number of inputs or outputs so the reach will be small.
I'll merge this later today; #209 has to be rebased on top of this. Note to myself: fix formatting,
The current implementation with std::map is not easy to use. The collections get ordered alphabetically in std::map, so the order set in the steering file is lost. This means that if there is some kind of mapping (like one output collection for each input collection), one has to reproduce this mapping in the algorithm, and the way I found to do it is to have another property with the input and output names again. For output, one creates the collections and then puts them in a vector; I think it can't get simpler than that. For input, since there can't be vectors of references, one gets a pointer to each collection, which is not ideal but not so bad. With a custom type it would be possible to have a vector whose operator[] returns a reference, but that introduces a custom type, which I don't like.
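The ordering problem and the pointer-based input can be reproduced in a few lines of standard C++. This is a minimal sketch under stated assumptions: MockCollection, iterationOrderOfMap and asInputs are hypothetical names, and no podio or Gaudi types are involved.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a podio collection.
struct MockCollection {
  std::string name;
};

// Names as they come out when iterating a std::map keyed by collection name:
// always sorted by key, whatever order they were inserted in.
std::vector<std::string> iterationOrderOfMap(const std::vector<std::string>& configured) {
  std::map<std::string, int> byName;
  for (const auto& name : configured) {
    byName.emplace(name, 0);
  }
  std::vector<std::string> order;
  for (const auto& entry : byName) {
    order.push_back(entry.first);
  }
  return order;
}

// Inputs as a vector of const pointers (a std::vector cannot hold references).
// The i-th pointer refers to the i-th element of `store`, so the order is kept.
// The pointers are only valid while `store` is alive and not resized.
std::vector<const MockCollection*> asInputs(const std::vector<MockCollection>& store) {
  std::vector<const MockCollection*> inputs;
  inputs.reserve(store.size());
  for (const auto& coll : store) {
    inputs.push_back(&coll);
  }
  return inputs;
}
```

For names configured as "Tracks", "Hits", "MCParticles", the map iterates in the order "Hits", "MCParticles", "Tracks", whereas the vector keeps the configured order, so the i-th input can be paired with the i-th output directly.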
BEGINRELEASENOTES
inputLocations and outputLocations methods to be able to retrieve the locations set in the steering file.
ENDRELEASENOTES