New code to remove unwanted histograms from root file #295
When combining datacards (which happens everywhere, for example when a fit has both signal regions and control regions), a nuisance that is "lnN" in one datacard and "shape" in another is renamed "shape?" in the combined datacard [1]. This naming triggers the procedure described in the combine documentation [2], quoted below.
The issue is that when we run "AsLnN" in mkDatacards.py [3], we calculate the integral effect of the up/down variations and write the value to the datacard txt file, but the histograms with the up/down variations are kept in the root file. Combine therefore finds these histograms and uses them. Even worse, the value in the datacard is then interpreted as the number of sigmas the up and down variations correspond to (e.g. 0.95/1.34 would be read as a 0.95 sigma down variation and a 1.34 sigma up variation), or combine may actually crash.
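To illustrate why the leftover histograms matter, here is a toy model of the "shape?" lookup described in [2]. It is not combine's actual code: the `histo_<process>_<nuisance>Up/Down` naming, the process name `WW` and the nuisance name `scale` are assumptions made only for this sketch.

```python
def resolve_shape_question(process, nuisance, histograms):
    """Toy model of the "shape?" lookup: templates win over the lnN number."""
    # Assumed naming convention for the systematic templates (hypothetical).
    up = f"histo_{process}_{nuisance}Up"
    down = f"histo_{process}_{nuisance}Down"
    if up in histograms and down in histograms:
        return "shape"  # templates found, so the datacard number is not read as lnN
    return "lnN"        # no templates, so the datacard number is read as lnN


# Root file content after the AsLnN conversion, with the variations left behind.
stale = {"histo_WW", "histo_WW_scaleUp", "histo_WW_scaleDown"}
# Same file once the unneeded variations are no longer visible under these names.
cleaned = {"histo_WW"}

print(resolve_shape_question("WW", "scale", stale))
print(resolve_shape_question("WW", "scale", cleaned))
```

With the stale histograms present the nuisance is treated as a shape, which is exactly the case the cleaning is meant to prevent.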
This PR adds simple cleaning code that backs up the existing root file and then renames the unneeded histograms in it, so that this issue does not occur.
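The two steps can be sketched roughly as follows. This is a minimal sketch and not the code in this PR: the function names, the `.bak` suffix, the `_unused` tag and the `histo_<process>_<nuisance>Up/Down` naming are all assumptions, and the actual renaming inside the root file (done with ROOT) is left out.

```python
import shutil


def backup_file(path, suffix=".bak"):
    """Copy the file next to itself before touching it; return the backup path."""
    backup = path + suffix  # hypothetical backup naming
    shutil.copy2(path, backup)
    return backup


def plan_renames(hist_names, lnN_nuisances, tag="_unused"):
    """Map each leftover Up/Down histogram of an lnN nuisance to a new name.

    Matching on the name suffix is a simplification: a nuisance whose name is
    the tail of another nuisance name would need a stricter match.
    """
    renames = {}
    for name in hist_names:
        for nuis in lnN_nuisances:
            for direction in ("Up", "Down"):
                if name.endswith(f"_{nuis}{direction}"):
                    # The new name no longer ends in Up/Down, so it is not
                    # picked up as a systematic template under this convention.
                    renames[name] = name + tag
    return renames


names = ["histo_WW", "histo_WW_scaleUp", "histo_WW_scaleDown", "histo_WW_lumiUp"]
print(plan_renames(names, ["scale"]))
```

Only the variations of the nuisances converted to lnN are selected; nominal histograms and the templates of true shape nuisances (here `lumi`) are left alone.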
[1] https://github.com/cms-analysis/HiggsAnalysis-CombinedLimit/blob/f5e201829ebeab08a1d25e7272a30e4a549a5a8e/scripts/combineCards.py#L249
[2] https://cms-analysis.github.io/HiggsAnalysis-CombinedLimit/latest/part2/settinguptheanalysis/?h=shape%3F#template-shape-uncertainties
"If you have a nuisance parameter that has shape effects on some processes (using shape) and rate effects on other processes (using lnN) you should use a single line for the systematic uncertainty with shape?. This will tell Combine to first look for Up/Down systematic templates for that process and if it doesn't find them, it will interpret the number that you put for the process as a lnN instead."
[3] https://github.com/latinos/LatinoAnalysis/blob/master/ShapeAnalysis/scripts/mkDatacards.py#L361