Skip to content

Make StereoCombiner configurable in ApplyModels #2724

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from
Open

Conversation

Hckjs
Copy link
Contributor

@Hckjs Hckjs commented Mar 24, 2025

Since the loaded joblib-pickled Reconstructors in ApplyModels

r = Reconstructor.read(path, parent=self, subarray=self.loader.subarray)
are already instantiated, one cannot change their traits via the config system, which leads to a not configurable StereoCombiner.

Instantiating a new Reconstructor.from_name(parent=self) and handing over its StereoCombiner would solve this Problem.

Fixes #2720

Hckjs added 2 commits March 21, 2025 17:12
 - Init new reconstructor with parent=self
 - Not working yet with DispReconstructor bc of different arguments
@Hckjs Hckjs requested review from maxnoe, kosack and LukasBeiske March 24, 2025 13:27
@Hckjs Hckjs added the bug label Mar 24, 2025

This comment has been minimized.

1 similar comment
Copy link

Passed

Analysis Details

0 Issues

  • Bug 0 Bugs
  • Vulnerability 0 Vulnerabilities
  • Code Smell 0 Code Smells

Coverage and Duplications

  • Coverage 100.00% Coverage (94.20% Estimated after merge)
  • Duplications 0.00% Duplicated Code (0.70% Estimated after merge)

Project ID: cta-observatory_ctapipe_AY52EYhuvuGcMFidNyUs

View in SonarQube

@@ -149,7 +149,25 @@ def setup(self):

self._reconstructors = []
for path in self.reconstructor_paths:
r = Reconstructor.read(path, parent=self, subarray=self.loader.subarray)
r = Reconstructor.read(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems very hacky... Isn't there a better solution for this?

@maxnoe
Copy link
Member

maxnoe commented Mar 24, 2025

I think we should rather re-evaluate how we store the models on disk / read them back.

Pickling the whole Reconstructor instances is somewhat nasty. We could also think about excluding the StereoCombiner explicitly from the pickle or add an explicit configuration parameter for overriding the StereoCombiner per Reconstructor in apply-models.

@Hckjs
Copy link
Contributor Author

Hckjs commented Mar 25, 2025

I think we should rather re-evaluate how we store the models on disk / read them back.

Pickling the whole Reconstructor instances is somewhat nasty. We could also think about excluding the StereoCombiner explicitly from the pickle or add an explicit configuration parameter for overriding the StereoCombiner per Reconstructor in apply-models.

One could maybe allow_none for the stereo_combiner_cls traits in the Reconstructor subclasses and add a

    def _init_stereo_combiner(self, parent=None, overwrite=False):
        if self.stereo_combiner is None or overwrite:
            self.stereo_combiner = StereoCombiner.from_name(
                self.stereo_combiner_cls,
                prefix=self.prefix,
                property=self.property,
                parent=parent,
            )

which is called on its __init__, making it possible to not store the StereoCombiner but instead init the Combiner with

r._init_stereo_combiner(parent=self, overwrite=self.overwrite_stereo_combiner)

in ApplyModels. This would also allow to overwrite it.

Can't really think of a better solution yet...

@kosack
Copy link
Member

kosack commented Mar 28, 2025

I never liked that the whole reconstructor was pickled in the first place - it's easy, but also makes it very hard to customize how we store the models in the future, i.e. if we want to use some standard format. Decoupling how we store the models from the Reconstructor would be the best way, but that also requires a bit of thought and a lot of refactoring. E.g. how to know which reconstructor to construct before loading the model? How to store the parameter list?

One way would be to change what is actually serialized in Reconstructor.write() to be just the model and class name rather than the whole thing, and then in load() constructing a new class from the name and setting the model. Perhaps that could be done by using Reconstructor.__get_state__(), Reconstructr.__set_state__() (which is the protocol for setting what gets pickled in a class), but I'm not sure.

@LukasBeiske
Copy link
Contributor

LukasBeiske commented Apr 3, 2025

I never liked that the whole reconstructor was pickled in the first place - it's easy, but also makes it very hard to customize how we store the models in the future, i.e. if we want to use some standard format. Decoupling how we store the models from the Reconstructor would be the best way, but that also requires a bit of thought and a lot of refactoring. E.g. how to know which reconstructor to construct before loading the model? How to store the parameter list?

One way would be to change what is actually serialized in Reconstructor.write() to be just the model and class name rather than the whole thing, and then in load() constructing a new class from the name and setting the model. Perhaps that could be done by using Reconstructor.__get_state__(), Reconstructr.__set_state__() (which is the protocol for setting what gets pickled in a class), but I'm not sure.

How about, instead of pickling the whole Reconstructor, we pickle a dictionary containing the class name, the model(s), and all the other necessary configuration options.
Since some of these config options (e.g. log_target for regressors) should not be overwritten when applying the model, we would have to add checks for this when "loading" the model, but otherwise constructing a new class with the models and config options from that dictionary should not be a problem, right?
The other option would be to prevent overwriting any config options by default and add explicit config options to ctapipe-apply-models to change e.g. the StereoCombiner, like Max suggested.

I know, that this is not much different from pickling the whole Reconstructor. This also would not completely decouple the storing of the models from the reconstructor, since the content of this dictionary would differ for different subclasses of Reconstructor, but it would be a start and solve this issue for now.
Maybe it could later be replaced by a class defining the output of all the training tools similar to OptimizationResult for the cut optimization tool.

@maxnoe
Copy link
Member

maxnoe commented Apr 4, 2025

The dict is good I think, like that we can also attach a meta filed where we put in the reference metadata.

@LukasBeiske
Copy link
Contributor

The dict is good I think, like that we can also attach a meta filed where we put in the reference metadata.

Ok, I'll get on this and open a second PR, since this is a bit wider in scope than this PR.

@maxnoe
Copy link
Member

maxnoe commented Apr 4, 2025

Maybe it's also time to give https://onnx.ai/sklearn-onnx/ a shot again and not rely on pickle.

@LukasBeiske LukasBeiske mentioned this pull request Jun 12, 2025
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Config values not applied to loaded Reconstructor models in ApplyModels
4 participants