
feat: custom modifiers #1188

Closed
lukasheinrich wants to merge 13 commits into master from custom_modifiers

Conversation

@lukasheinrich (Contributor) commented Nov 19, 2020

Description

This is a first example of how to put e.g. a "gaussian" bump on top of a simple model.
It gives a first look at how #850 could be solved.

cc @alexander-held

# %%
import pyhf
import matplotlib.pyplot as plt
import numpy as np
import scipy.stats

# %%
class custom_modifier(object):
    def apply(self, pars):
        # shape is (n_modifiers, n_global_samples, n_alphas, n_global_bins)
        base = np.zeros(
            (1, len(self.config.samples), 1, sum(self.config.channel_nbins.values()))
        )

        # Gaussian bump: integrate a unit normal over the bin edges
        bins = np.linspace(-5, 5, 11)
        yields = 20 * (scipy.stats.norm.cdf(bins[1:]) - scipy.stats.norm.cdf(bins[:-1]))

        base[0, self.config.samples.index('signal'), 0, :] = yields
        return base


m = pyhf.Model(
    {
        'channels': [
            {
                'name': 'singlechannel',
                'samples': [
                    {
                        'name': 'signal',
                        'data': [0] * 10,
                        'modifiers': [
                            {'name': 'mu', 'type': 'normfactor', 'data': None}
                        ],
                    },
                    {
                        'name': 'background',
                        'data': [10] * 10,
                        'modifiers': [
                            {
                                'name': 'uncorr_bkguncrt',
                                'type': 'shapesys',
                                'data': [5] * 10,
                            }
                        ],
                    },
                ],
            }
        ]
    },
    custom_modifiers=[custom_modifier()],
)

bp = pyhf.tensorlib.astensor(m.config.suggested_init())
bp[m.config.poi_index] = 0.2
plt.bar(range(10), m.expected_actualdata(bp))

@matthewfeickert matthewfeickert marked this pull request as draft November 19, 2020 17:17
@alexander-held (Member):
A big question for solving #850 is what kind of language to use to specify custom expressions. I think for many use cases the expressions are simple enough that the language does not matter much. And it might be possible to add the language to the workspace in such a way that in the future new languages could be added?

@lukasheinrich (Contributor, Author):

yeah I think we need to solve 2 issues:

  • just technically allow the specification of new modifiers at the Python level
  • decide how to codify these into JSON; this goes in the direction of HiFa v2

a big question is also in what order these modifications happen. Here the custom modifier is an additive modifier, where the total rate in pyhf is computed as

lambda = (nominal + additive + ...) * multiplicative

but you could also imagine expressions that expect to operate on the yields once all other modifiers have been applied.
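The ordering question can be made concrete with a small NumPy sketch (the yields and modifier values here are made up purely for illustration):

```python
import numpy as np

nominal = np.array([10.0, 10.0, 10.0])  # nominal yields per bin
delta = np.array([0.0, 5.0, 0.0])       # additive bump, e.g. the Gaussian signal
k = 1.2                                 # multiplicative normfactor

# convention discussed above: additive modifiers first, then multiplicative
lam_additive_first = (nominal + delta) * k

# alternative: a modifier that acts on the yields after all other modifiers
lam_multiplicative_first = nominal * k + delta

print(lam_additive_first)        # [12. 18. 12.]
print(lam_multiplicative_first)  # [12. 17. 12.]
```

The two conventions disagree exactly in the bins the additive modifier touches, which is why the order has to be pinned down in the spec.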

@alexander-held (Member):

I think this is technically still in the scope of normal HistFactory https://root.cern/doc/v620/classRooStats_1_1HistFactory_1_1Measurement.html#ade80531b4588d22ad5e7339ce74e7027. The version implemented there just acts as a normfactor, so this question of order is avoided. Once you allow more custom variations the problem becomes more complicated indeed.

@lukasheinrich (Contributor, Author) commented Nov 19, 2020

this now supports declaring additional parameters in the modifiers, here e.g. the mean of the Gaussian:

import numpy as np
import scipy.stats
import pyhf


class custom_modifier(object):
    def __init__(self, name='name'):
        # declare an extra unconstrained parameter 'mean' for this modifier
        self.required_parsets = {
            'mean': {
                'paramset_type': pyhf.parameters.paramsets.unconstrained,
                'n_parameters': 1,
                'modifier': 'normfactor',
                'is_constrained': False,
                'is_shared': True,
                'inits': (0.0,),
                'bounds': ((-5, 5),),
                'fixed': False,
            }
        }

    def apply(self, pars):
        base = np.zeros(
            (1, len(self.config.samples), 1, sum(self.config.channel_nbins.values()))
        )

        bins = np.linspace(-5, 5, 11)
        mean = pars[self.config.par_slice('mean')][0]
        yields = 20 * (
            scipy.stats.norm(loc=mean).cdf(bins[1:])
            - scipy.stats.norm(loc=mean).cdf(bins[:-1])
        )
        base[0, self.config.samples.index('signal'), 0, :] = yields
        return base
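The effect of the extra `mean` parameter can be checked standalone, outside pyhf, using the same binning as the snippet above:

```python
import numpy as np
import scipy.stats

bins = np.linspace(-5, 5, 11)  # 10 unit-width bins, as in the modifier above


def bump(mean):
    # integrate a unit-variance Gaussian centred at `mean` over the bin edges
    norm = scipy.stats.norm(loc=mean)
    return 20 * (norm.cdf(bins[1:]) - norm.cdf(bins[:-1]))


# moving `mean` shifts the peak across the bins
print(np.argmax(bump(1.5)))   # 6, i.e. the bin [1, 2]
print(np.argmax(bump(-1.5)))  # 3, i.e. the bin [-2, -1]
```

Since the bounds on `mean` are (-5, 5), essentially the whole bump (total of ~20 events) stays inside the channel's range.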


@lukasheinrich lukasheinrich force-pushed the custom_modifiers branch 4 times, most recently from 545693d to 763c3a1 Compare November 20, 2020 14:20
@lukasheinrich lukasheinrich changed the title from "first example" to "custom modifiers" Nov 20, 2020
@matthewfeickert matthewfeickert added the feat/enhancement New feature or request label Nov 25, 2020
@lukasheinrich lukasheinrich force-pushed the custom_modifiers branch 15 times, most recently from 9dd6470 to 81a8734 Compare November 26, 2020 18:30
@lukasheinrich lukasheinrich force-pushed the custom_modifiers branch 4 times, most recently from b2a17cf to 30a696a Compare January 2, 2021 15:29
@kratsg (Contributor) left a comment


Overall, I think I do like a bulk of the refactoring. One thing that struck me is that we still kept around the op_code stuff -- which I suppose is fine. I guess a future iteration would somehow add a modify_rate() method that would know how to modify the rate that it gets passed in... and then at some point, we'd have to deal with priority/preference in some way.. but this is a rabbit hole anyway.

One thing I appreciate a lot, that was not so clear from the body of the PR, is that you've elided a lot of the logic from the megasamples back down to each of the individual modifier modules -- so that pdf.py isn't as scary as it used to be.

model (:class:`~pyhf.pdf.Model`): The Model instance.

"""
modifier_set = modifier_set or pyhfset

Suggested change
modifier_set = modifier_set or pyhfset
modifier_set = {**pyhfset, **(modifier_set or {})}

I would prefer that we're able to override the default set of modifiers rather than completely replacing it.
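For reference, the merge semantics of the suggested change in a toy example (the keys and string values here are illustrative stand-ins, not pyhf's actual builder/combined objects):

```python
pyhfset = {'normfactor': 'default_nf', 'shapesys': 'default_ss'}      # defaults
modifier_set = {'normfactor': 'custom_nf', 'mybump': 'custom_bump'}   # user-supplied

merged = {**pyhfset, **(modifier_set or {})}
# user entries override the defaults; unmentioned defaults survive
print(merged)
# {'normfactor': 'custom_nf', 'shapesys': 'default_ss', 'mybump': 'custom_bump'}
```

With `modifier_set = None` the merge falls back to the defaults unchanged, matching the behaviour of the original `or` expression.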

# run jsonschema validation of input specification against the (provided) schema
log.info(f"Validating spec against schema: {self.schema:s}")
utils.validate(self.spec, self.schema, version=self.version)
if validate:

this I'm unsure about. Should we require users to include their own custom schema rather than skipping validation entirely? but...

'shapefactor': shapefactor,
'shapesys': shapesys,
'staterror': staterror,
from .histosys import histosys_builder, histosys_combined

are we dropping __all__ from this? Presumably, someone would want to build their own custom modifier based off an existing one and we're not quite allowing it to be imported so easily. But maybe that's the point?

from .shapesys import shapesys_builder, shapesys_combined
from .staterror import staterr_builder, staterror_combined

pyhfset = {

not a huge fan here. we're relying on basically bookkeeping through the indices (which is scattered throughout the code) and knowing that builder is [0] and combined is [1]. We could either do a namedtuple here for each, so that modifiers['histosys'].builder and modifiers['histosys'].combined work instead, or we do pyhf.modifiers.builders['histosys'] and pyhf.modifiers.combined['histosys']. I think either would be a better design.
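A minimal sketch of the namedtuple variant (with stub functions standing in for the real builder/combined objects):

```python
from collections import namedtuple

# hypothetical container replacing the bare (builder, combined) tuple
ModifierSpec = namedtuple('ModifierSpec', ['builder', 'combined'])


def histosys_builder():
    ...  # stub for the real builder


def histosys_combined():
    ...  # stub for the real combined applicator


pyhfset = {'histosys': ModifierSpec(histosys_builder, histosys_combined)}

# attribute access replaces the [0]/[1] index bookkeeping
builder = pyhfset['histosys'].builder
combined = pyhfset['histosys'].combined
```

Since a namedtuple is still a tuple, this stays backwards compatible with any remaining index-based access during a transition.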


The other thing is this is exactly the registry model (without the decoration functionality) which is fine. I guess the question is whether you want the registry to be defined per-model or per-pyhf instantiation. This change allows a different modifier_set per-model which is probably a good thing (less "global" state, and more state-less).

'fixed': False,
'auxdata': (0.0,),
}
def required_parset(sample_data, modifier_data):

this should likely be called _required_parset to indicate it's not a public method to be exposed in this module. Same for all others.

)
assert result.shape[1] == 2
assert pytest.approx([0.0, 0.26418431]) == pyhf.tensorlib.tolist(result[:, 1])
assert pytest.approx([0.26418431, 0.0]) == pyhf.tensorlib.tolist(result[:, 1])

I guess: can we add one more assert here on the order of the parameters in the model, just so we can keep track moving forward as well? This will future-proof things in case the order somehow changes again.

}
with pytest.raises(pyhf.exceptions.InvalidModifier):
pyhf.pdf._ModelConfig(spec)
pyhf.pdf.Model(spec, validate=False) # don't validate to delay exception

Suggested change
pyhf.pdf.Model(spec, validate=False) # don't validate to delay exception
pyhf.pdf.Model(spec, validate=False) # skip validation exception

model = pyhf.Model(spec, poi_name='mypoi')
assert model.config.suggested_fixed() == [False, True]
assert model.config.poi_index == 1
assert model.config.suggested_fixed()[model.config.par_slice('mypoi')] == [True]

the order changed again, didn't it? why not just swap it around instead of combining the two? We won't catch a bug if both parameters are somehow "fixed" moving forward (so we lose coverage).

assert len(model.config.suggested_fixed()) == 5
assert model.config.suggested_fixed() == [False, True, True, True, False]
assert model.config.poi_index == 4
assert model.config.suggested_fixed()[model.config.par_slice('uncorr')] == [

same comment here about catching the bug.

]
),
rtol=1e-5,
rtol=2e-5,

why did this get relaxed (if it's a refactor)?


@balunas commented Aug 17, 2021

It looks like it's been quite some time since any progress was made here. This has been bumped back for a few releases; is there some technical limitation, or has it just not been prioritized? There are some analysis use cases this could be really helpful for, and we've been having to use some really inefficient workarounds instead. Is there a way we could help?

@IzaV10 commented Aug 17, 2021

Hi,

I tried running the simple example with the "gaussian" bump on top of a simple model. The branch was installed as pip install git+https://github.com/scikit-hep/pyhf.git@custom_modifiers but when running the code above (from the very first entry of the conversation) I get an error.

File "/<PATH TO VIRTUAL ENVIRONMENT>/pyhf-env/lib/python3.7/site-packages/pyhf/pdf.py", line 497, in __init__
    self.config = _ModelConfig(self.spec, **config_kwargs)
  File "/<PATH TO VIRTUAL ENVIRONMENT>/pyhf-env/lib/python3.7/site-packages/pyhf/pdf.py", line 171, in __init__
    f"Unsupported options were passed in: {list(config_kwargs.keys())}."
pyhf.exceptions.Unsupported: Unsupported options were passed in: ['custom_modifiers'].

I suspect that the way the class custom_modifier() is defined needs to be changed. Is there any documentation available on how to use this option?

Thank you for your help and time.

Cheers,
Iza

@lhenkelm lhenkelm mentioned this pull request Sep 23, 2021
1 task
@lukasheinrich (Contributor, Author):

tests pass (2 tiny floating point changes...); will clean up tomorrow

@lukasheinrich (Contributor, Author):

I don't really understand the CI failures (the tests pass on my machine). Any clues, @matthewfeickert?


[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

string-based constraints and more dedupe in builders

pyflakes

[pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci
@lukasheinrich (Contributor, Author):

closed in favor of #1625


Labels

feat/enhancement New feature or request

6 participants