Conversation
A big question for solving #850 is what kind of language to use to specify custom expressions. I think for many use cases the expressions are simple enough that the language does not matter much. And it might be possible to add the language to the workspace in such a way that in the future new languages could be added?
yeah I think we need to solve 2 issues:
a big question is also in what order these modifications happen, i.e. here we have the custom modifier as an additive modifier, where the total rate in pyhf is computed by applying the multiplicative modifiers to the nominal rate plus the additive shifts
but you could also imagine expressions that expect to operate on yields once all other modifiers have been applied
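For context, the pyhf rate combination (paraphrasing the HistFactory description in the pyhf documentation; the symbols here are schematic) applies all multiplicative modifiers κ to the sum of the nominal rate λ⁰ and the additive shifts Δ, per sample s, channel c, and bin b:

```latex
\nu_{cb}(\phi) = \sum_{s \in \mathrm{samples}}
  \Big( \prod_{\kappa} \kappa_{scb}(\phi) \Big)
  \Big( \lambda^{0}_{scb} + \sum_{\Delta} \Delta_{scb}(\phi) \Big)
```

A custom additive modifier slots into the Δ sum, which is why the ordering question only becomes acute once expressions are allowed to act on the fully modified yields instead.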
I think this is technically still in the scope of normal HistFactory https://root.cern/doc/v620/classRooStats_1_1HistFactory_1_1Measurement.html#ade80531b4588d22ad5e7339ce74e7027. The version implemented there just acts as a normfactor, so this question of order is avoided. Once you allow more custom variations the problem becomes more complicated indeed.
this now supports declaring additional parameters in the modifiers, e.g. here a mean of the Gaussian
kratsg left a comment:
Overall, I think I do like the bulk of the refactoring. One thing that struck me is that we still kept around the op_code stuff -- which I suppose is fine. I guess a future iteration would somehow add a modify_rate() method that would know how to modify the rate it gets passed in... and then at some point we'd have to deal with priority/preference in some way, but this is a rabbit hole anyway.
One thing I appreciate a lot, that was not so clear from the body of the PR, is that you've elided a lot of the logic from the megasamples back down to each of the individual modifier modules -- so that pdf.py isn't as scary as it used to be.
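The modify_rate() idea from the comment above could look roughly like this. This is a hypothetical sketch, not pyhf API: the class names, the priority attribute, and apply_modifiers are all invented here to illustrate how each modifier could own its rate transformation and how an explicit priority would settle the ordering question.

```python
class Modifier:
    # lower priority value means applied earlier in this sketch
    priority = 0

    def modify_rate(self, rate):
        raise NotImplementedError


class Multiplicative(Modifier):
    priority = 0  # multiplicative modifiers applied first

    def __init__(self, factor):
        self.factor = factor

    def modify_rate(self, rate):
        return rate * self.factor


class Additive(Modifier):
    priority = 1  # additive shifts applied after the scaling

    def __init__(self, shift):
        self.shift = shift

    def modify_rate(self, rate):
        return rate + self.shift


def apply_modifiers(nominal, modifiers):
    # chain modify_rate() calls in priority order
    for mod in sorted(modifiers, key=lambda m: m.priority):
        nominal = mod.modify_rate(nominal)
    return nominal


# order is decided by priority, not by list position:
rate = apply_modifiers(10.0, [Additive(2.0), Multiplicative(1.5)])
```

With these priorities, the 10.0 nominal rate is first scaled to 15.0 and then shifted to 17.0, regardless of the order the modifiers were listed in.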
    model (:class:`~pyhf.pdf.Model`): The Model instance.

    """
    modifier_set = modifier_set or pyhfset
    - modifier_set = modifier_set or pyhfset
    + modifier_set = {**pyhfset, **(modifier_set or {})}
I would prefer that we're able to override the default set of modifiers rather than completely replacing it.
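The merge semantics of the suggested change can be seen in a small sketch. Here `pyhfset` stands in for pyhf's built-in modifier registry and the string values are placeholders; the point is that dict unpacking lets user entries override or extend the defaults while any default not mentioned survives.

```python
# stand-in for the built-in modifier registry (values are placeholders)
pyhfset = {"histosys": "histosys_impl", "normsys": "normsys_impl"}


def resolve_modifiers(modifier_set=None):
    # user entries win on key collisions, defaults fill in the rest
    return {**pyhfset, **(modifier_set or {})}


defaults = resolve_modifiers()                      # just the built-ins
custom = resolve_modifiers({"gaussian": "custom"})  # built-ins plus 'gaussian'
patched = resolve_modifiers({"normsys": "patched"}) # 'normsys' overridden
```

Compared with `modifier_set or pyhfset`, passing a single custom modifier no longer silently drops every built-in one.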
    # run jsonschema validation of input specification against the (provided) schema
    log.info(f"Validating spec against schema: {self.schema:s}")
    utils.validate(self.spec, self.schema, version=self.version)
    if validate:
this I'm unsure about. Should we require users to include their own custom schema rather than skipping the validation? but...
    'shapefactor': shapefactor,
    'shapesys': shapesys,
    'staterror': staterror,
    from .histosys import histosys_builder, histosys_combined
are we dropping __all__ from this? Presumably, someone would want to build their own custom modifier based off an existing one and we're not quite allowing it to be imported so easily. But maybe that's the point?
    from .shapesys import shapesys_builder, shapesys_combined
    from .staterror import staterr_builder, staterror_combined

    pyhfset = {
not a huge fan here. We're relying on bookkeeping through the indices (which is scattered throughout the code) and knowing that the builder is [0] and the combined applicator is [1]. We could either use a namedtuple for each, so that modifiers['histosys'].builder and modifiers['histosys'].combined work instead, or split it into pyhf.modifiers.builders['histosys'] and pyhf.modifiers.combined['histosys']. I think that would be a better design.
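The namedtuple variant suggested above could look like this. The `ModifierSet` name and the placeholder callables are hypothetical; real entries would be pairs like `histosys_builder` / `histosys_combined`.

```python
from collections import namedtuple

# attribute access replaces remembering that builder is [0] and combined is [1]
ModifierSet = namedtuple("ModifierSet", ["builder", "combined"])


# placeholder callables standing in for the real builder/combined pair
def histosys_builder(config):
    return f"builder({config})"


def histosys_combined(pars):
    return f"combined({pars})"


pyhfset = {"histosys": ModifierSet(histosys_builder, histosys_combined)}

# call sites now document their intent instead of indexing a bare tuple:
built = pyhfset["histosys"].builder("cfg")
applied = pyhfset["histosys"].combined("pars")
```

Since namedtuples still support indexing, `pyhfset["histosys"][0]` would keep working during a transition.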
The other thing is this is exactly the registry model (without the decoration functionality) which is fine. I guess the question is whether you want the registry to be defined per-model or per-pyhf instantiation. This change allows a different modifier_set per-model which is probably a good thing (less "global" state, and more state-less).
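The registry model with decoration, plus a per-model override, can be sketched in a few lines. Everything here is hypothetical (the registry name, decorator, and `Model` shape are invented for illustration); the point is that registration stays global while each model instance can still be handed its own `modifier_set`, keeping model construction state-less.

```python
MODIFIER_REGISTRY = {}


def register_modifier(name):
    # class decorator that records the modifier in the global registry
    def wrapper(cls):
        MODIFIER_REGISTRY[name] = cls
        return cls
    return wrapper


@register_modifier("gaussian")
class GaussianModifier:
    pass


class Model:
    def __init__(self, spec, modifier_set=None):
        # copy so each model carries its own (possibly custom) registry,
        # leaving the global registry untouched
        source = modifier_set if modifier_set is not None else MODIFIER_REGISTRY
        self.modifiers = dict(source)


default_model = Model(spec={})                              # global registry
custom_model = Model(spec={}, modifier_set={"custom": object})  # per-model set
```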
    'fixed': False,
    'auxdata': (0.0,),
    }
    def required_parset(sample_data, modifier_data):
this should likely be called _required_parset to indicate it's not a public function exposed by this module. Same for all others.
    )
    assert result.shape[1] == 2
    - assert pytest.approx([0.0, 0.26418431]) == pyhf.tensorlib.tolist(result[:, 1])
    + assert pytest.approx([0.26418431, 0.0]) == pyhf.tensorlib.tolist(result[:, 1])
I guess: can we add one more assert here on the order of the parameters in the model, just so we can keep track moving forward as well? This will future-proof things in case the order somehow changes again.
    }
    with pytest.raises(pyhf.exceptions.InvalidModifier):
    -     pyhf.pdf._ModelConfig(spec)
    +     pyhf.pdf.Model(spec, validate=False)  # don't validate to delay exception
    - pyhf.pdf.Model(spec, validate=False)  # don't validate to delay exception
    + pyhf.pdf.Model(spec, validate=False)  # skip validation exception
    model = pyhf.Model(spec, poi_name='mypoi')
    assert model.config.suggested_fixed() == [False, True]
    assert model.config.poi_index == 1
    assert model.config.suggested_fixed()[model.config.par_slice('mypoi')] == [True]
the order changed again, didn't it? Why not just swap it around instead of combining the two? We won't catch a bug if both parameters somehow end up "fixed" moving forward (so we lose coverage).
    assert len(model.config.suggested_fixed()) == 5
    assert model.config.suggested_fixed() == [False, True, True, True, False]
    assert model.config.poi_index == 4
    assert model.config.suggested_fixed()[model.config.par_slice('uncorr')] == [
same comment here about catching the bug.
    ]
    ),
    - rtol=1e-5,
    + rtol=2e-5,
why did this get relaxed (if it's a refactor)?
It looks like it's been quite some time since any progress was made here. It's been bumped back for a few releases; is there some technical limitation, or has it just not been prioritized? There are some analysis use cases this could be really helpful for, and we've been having to use some really inefficient workarounds instead. Is there a way we could help?
Hi, I tried running the simple example with the "gaussian" bump on top of a simple model. The branch was installed as I suspect that the way the Thank you for your help and time. Cheers,
tests pass (2 tiny floating point changes...) will clean up tomorrow
I don't really understand the CI failures - passes on my machine - any clues @matthewfeickert?
closed in favor of #1625


Description
This is a first example of how to, e.g., have a "gaussian" bump on top of a simple model.
This gives a first look into how to solve #850.
cc @alexander-held