-
Notifications
You must be signed in to change notification settings - Fork 10
Configuration for 2024, 2023, and 2022 productions #76
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
Hi @BlancoFS, Thank you for preparing this PR. I have a couple of minor comments:
|
setup.cfg
Outdated
| processor = | ||
| #CMSJMECalculators @ git+https://gitlab.cern.ch/cms-analysis/general/CMSJMECalculators.git | ||
| #CMSJMECalculators @ git+https://gitlab.cern.ch/mlizzo/CMSJMECalculators.git | ||
| CMSJMECalculators @ git+https://gitlab.cern.ch/sblancof/CMSJMECalculators.git |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi Sergio, thanks for putting this PR together. As you know, I am a bit involved in how we treat jets, this fork of CMSJMECalculators can handle the extra phi dependence that I believe is present in the new version corrections in 23/23BPix, right? What is the main difference to the officiali tool?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right. The difference with the previous version is the addition of phi. It was not developed in the main branch at the time I did it. If you tell me that this is now implemented in central CMSJMECalculator, then maybe we can test if it works and switch again to the default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am like 99% sure that it does, I think we can test it easily. For this, we can use the 'TTTo2L2Nu_10k_nano' samples, since it is only 10k events and it should be quick to compare the outputs, what do you think?
| } , | ||
| 'wpSF': { | ||
| '1-1' : ["2022Re-recoBCD", "Electron-ID-SF", "passingMVA90", 'data/scale_factor/Full2022v12/electron.json'], ### Correctionlib parameters: [Era, Key, WP, path to json]. | ||
| '1-1' : ["2022Re-recoBCD", "Electron-ID-SF", "wp90iso", '/cvmfs/cms-griddata.cern.ch/cat/metadata/EGM/Run3-22CDSep23-Summer22-NanoAODv12/latest/electron.json.gz'], ### Correctionlib parameters: [Era, Key, WP, path to json]. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we switching to the POG wpSF? I think it makes sense but why weren't we doing this already? Sorry if I don't remember.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, POGs take more care than us on developing their central SFs. They also have more experience, so I believe it may be beneficial for us to use theirs whenever available. In my opinion, this is particularly useful for electrons, where the fits are much trickier than for muons. The scale factors for electrons tend to have more fluctuations. For muons, it's almost the same (We have more expertise there as well, since we work on the official SF).
| #"ROOT::VecOps::abs(Electron_eta) < 2.5", | ||
| #"Electron_mvaIso_WP80", | ||
| #"Electron_convVeto", | ||
| "ROOT::VecOps::abs(Electron_eta) < 2.5", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I while ago I added this POG wp to test the compatibility of mkShapes with other frameworks. Since the test is done on basically any electron with pt>10 GeV (link), I think this can be left without additional cuts.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok, I didn't understand the naming since it said wp80iso_POG, but the cut was not applied. That's why I removed the comments. Do you think we can change the naming if you want to preserve the looser cuts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can change the naming, actually I was planning to start working again on the recipes. I think we can name everything that is used for those tests as Electron/Muon/Jet_recipes, at least this should be strange enough for the other analyzers to not use it without asking what it is. Also,since it is already a bit of work, I can manage it at later stage, as you prefer!
| source `pwd`/myenv/bin/activate | ||
| export STARTPATH=`pwd`/start.sh | ||
| export PYTHONPATH=`pwd`/myenv/lib64/python3.9/site-packages:\$PYTHONPATH | ||
| export PYTHONPATH=`pwd`/myenv/lib64/python3.11/site-packages:\$PYTHONPATH |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if this is still the case but we added a couple of lines here since people had problems getting their voms certificate while inside di env. Maybe is worth checking, you didn't have any problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that was an issue related to an updated software in the recent LCG machines. It should be fixed now without the need to touch anything there. In fact, I can activate voms now without any trouble.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this different from the "standard" EleScare.cc macro? I think also here we have the extra phi dependence? Sorry if I am missing someting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there was a small difference in the inputs of the latest scale and smearing JSONs. At least by now, there is no phi dependence in the electron corrections (link) for 2024. Maybe in the future they develop it, but it's not present now
|
|
||
| Samples['WGtoLNuG-1J_PTG400to600'] = { | ||
| 'nanoAOD' :'/WGtoLNuG-1Jets_PTG-400to600_TuneCP5_13p6TeV_amcatnloFXFX-pythia8/Run3Summer22NanoAODv12-130X_mcRun3_2022_realistic_v5-v3/NANOAODSIM' | ||
| 'nanoAOD' :'/WGtoLNuG-1Jets_PTG-400to600_TuneCP5_13p6TeV_amcatnloFXFX-pythia8/Run3Summer22NanoAODv12-130X_mcRun3_2022_realistic_v5-v1/NANOAODSIM' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why switching to an older version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is most probably an error on my side, good catch!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
| Samples['ZG2JtoG2L2J'] = { | ||
| 'nanoAOD' :'/ZG2JtoG2L2J_EWK_MLL-50_MJJ-120_TuneCP5_withDipoleRecoil_13p6TeV_madgraph-pythia8/Run3Summer22EENanoAODv12-130X_mcRun3_2022_realistic_postEE_v6-v3/NANOAODSIM' | ||
| } | ||
| Samples['ZG'] = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the ZG samples we are interested and that we will treat as our Vg* is DYGto2LG-1Jets*. I don't know if I am correct, maybe the others can help
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't realize that the naming convention also changed here with respect to Run II UL. It would be great if anyone can confirm which one to use
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe @mlizzo can help
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes the DYGto2LG-1Jets* sample is the main one, the EWK process can enter in the 2-jets phase space when targeting same-flavour final states. We are still waiting for these samples to be reproduced from the common background team, as they are currently binned in PTG and some level of disagreement was observed when stitching them together. If we need them for validation studies maybe we can still process them and later update them. Here's the reference for 22EE, similar queries are valid for other eras
| if "MET_pt" not in df.GetColumnNames(): | ||
| df = df.Define("MET_pt", "PuppiMET_pt") | ||
|
|
||
| df = df.Filter("((MET_pt < 20 || PuppiMET_pt < 20) && mtw1 < 20)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I rememeber correctly I think we were discussing with @NTrevisani to drop the mtw1 cut
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can further discuss this, but I do not agree at all. Without the mtw1 cut we would measure the fake rate in a region with a much larger contamination from Wjets. The largest QCD contribution is on the low mtw1, as shown in these slides by Victor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's the motivation behind removing this cut?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the idea is that so we can use the same samples for prompt rate estimation. For sure Nicolò can clarify
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also the cut can be added at configuration level, so maybe it is better to keep the largest amount of events and cut after
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As Sebastiano says, if we remove the mtw1 cut, we can use the ntuples also to measure the prompt rate:
https://github.com/latinos/PlotsConfigurationsRun3/blob/main/FakeRate/Full2018_v9/cuts.py
The mtw1 cut must be applied at histogram production level. For sure we will use some more disk space, but we will process only Data, DY, and WJets to measure fake and prompt rates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understood! I will change it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized that we should switch the order in which we apply the different corrections so that we do JES->JER->Type1MET instead of Type1MET->JES->JER (reccomendation)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thank you for the suggestion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as the ther JMECalculator module
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have just noticed that this file was there. Do we really need to keep this?
| if "POG" in wp: | ||
|
|
||
| if pathToJson.startswith("/cvmfs"): | ||
| if int(self.year) == 2023 : |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice, this has the phi dependence for 2023, I am happy :)
| # We use the EtDependent corrections, as recommended here: https://twiki.cern.ch/twiki/bin/viewauth/CMS/EgammSFandSSRun3 | ||
| self.elecorrection_file = self.elescale_path + "/" + year + "_" + self.prodTime + key + "/electronSS_EtDependent.json.gz" | ||
|
|
||
| # We use the EtDependent corrections, as recommended here: https://twiki.cern.ch/twiki/bin/viewauth/CMS/EgammSFandSSRun3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much much cleaner and readable, thank you Sergio!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you!!!! :)
|
Hi @NTrevisani @squinto5, thank you very much for all the comments. I will try to go over them one by one, I might not address all of them at the same time :) |
|
For @NTrevisani comment:
My idea with the install was to avoid having two environments installed. In any case, from the conversation we had this morning about Muons, the issue might be fixed now with a recent version, and the old
Yes, when doing the post-processing, I faced several times an error because of the reading of thousands of files at the same time. This is particular to the 2024 campaign, since the number of files for MC is much larger than in 2022 and 2023. Because of this, I naively coded that function to try to compute de |
Ok, that would be great, thank you!
Understood, thanks! |
| correctionlib.register_pyroot_binding() | ||
|
|
||
| class JetSelMask(Module): | ||
| def __init__(self, jetId, puJetId, minPt, maxEta, UL2016fix=False, year="",eventMask=False): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Set the eventMask as True as default, so that we follow recommendations by default https://cms-jerc.web.cern.ch/Recommendations/#jet-veto-maps
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done, thank you!
|
Thanks everybody for the review. I understand that all the comments have been addressed. Please confirm it is the case, then I will merge the PR. |
|
Hi, I think everything looks good on my side. Just a few thing that can be tested after and that I put here as a reminder:
Thanks again Sergio for putting this PR together! |
|
Hi @NTrevisani, @squinto5 Do you think it makes sense to wait until Monday's discussion to see what strategy we want to follow for jets and introduce the outcome as the default in the framework? |
|
For me it's ok to wait until Monday even if I think for the time being it is better to stick to the recommended approach (which is what we are currently doing) and if someone is willing to go on with Andrea's proposal can work in parallel and go through all the POG approval pipeline |
|
Hi @NTrevisani, the last commit fixes the Muon PNet working point according to what was presented yesterday. The scale factors are updated as well. |
|
Summary of the last commit (@NTrevisani @squinto5 ):
|
|
Hi @BlancoFS @NTrevisani , sorry from jumping between Mattermost and here. For the second to last point, sorry, I did't realize that the jetvetomaps where applied through another function. As per the CMSJMECalcutor, I am now not super sure that they are doing the matching in the horns condition by default, since by I quick look I don't see anything similar to this in the official repository. Additionally, this cut should be modified to be
|
|
Hi @squinto5, thank you for raising this point cause I believe it's important and I missed it. Would it be possible to do the same, but in the |
|
Hi @BlancoFS , yes you're right and I didn't think of the possibility of changing the modulo instead of the tool. In theory it should be easy to what you proposed. Manu thanks! |
We are preparing a message to JETMET to clarify. Thanks for spotting this! |
We have just received a reply from Jindrich:
@BlancoFS or @squinto5 could you please include the central JMECalculator? |
|
Hi @NTrevisani @squinto5, sorry that I didn't see your last message on time. In the meantime, I just added an option to the Please take a look and let me know. I checked, and it seems to work well. I think it's also fine to have the option in the module. |
| const auto dr2 = get_dr2(phi, eta, GenJet_phi[Jet_genIdx[i]], GenJet_eta[Jet_genIdx[i]]); | ||
| if ( ( dr2 < dr2Min ) && ( dr2 < genMatch_maxDR*genMatch_maxDR ) ) { | ||
| if (check_resolution(pt, GenJet_pt[Jet_genIdx[i]], genMatch_maxDPT)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I am missing something, shouldn't the indeces here be Jet_genJetIdx[j]? Probably I am missing something, sorry if that's the case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mmm I would say it's fine, right? The i index is the one from the loop over CleanJets and the j is the index from the loop to GenJets
| "CleanJet_bestGenMatchIdx", | ||
| f"findGenMatched(CleanJet_pt, CleanJet_eta, CleanJet_phi, Take(Jet_genJetIdx, CleanJet_jetIdx), GenJet_pt, GenJet_eta, GenJet_phi, {maxDR}, {maxDPT})" | ||
| ) | ||
| df = df.Define("CleanJet_isNotGenInHorns", "abs(CleanJet_eta)>2.6 && abs(CleanJet_eta)<3.1 && CleanJet_bestGenMatchIdx == -1") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The horns region is 2.5 < |eta| < 3.0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed! Thank you
| for j, tag in enumerate(["up", "down"]): | ||
| variation_pt = f"jetVars.pt({2*i+1+j})" | ||
| variation_mass = f"jetVars.mass({2*i+1+j})" | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok that I didn't know and is probably a bit trickier and might require touching the CMSJMECalculator
| ) | ||
| cols.append("-1.0") | ||
| # gen jet coll | ||
| cols.append("GenJet_pt") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realized, shouldn't this be something like
cols.append(f"Take(GenJet_pt, Take(Jet_genJetIdx, {JetColl}_jetIdx))") cols.append(f"Take(GenJet_eta, Take(Jet_genJetIdx, {JetColl}_jetIdx))") cols.append(f"Take(GenJet_phi, Take(Jet_genJetIdx, {JetColl}_jetIdx))") cols.append(f"Take(GenJet_mass, Take(Jet_genJetIdx, {JetColl}_jetIdx))")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind JMECalculators does it by default, please correct me if I am wrong https://gitlab.cern.ch/cms-analysis/general/CMSJMECalculators/-/blob/main/src/JetMETVariationsCalculatorBase.cc?ref_type=heads#L153. Just need to set the flag to True, I believe
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @NTrevisani and @BlancoFS looking at the modified smear file we were given by Jindrich I noticed that it does the assumption that when we call the tool if we a jet is not matched the corresponding genjet_pt==-1. Luckily, I believe all the ingredients are inside the official JME tool, since it has a flag that we can set to True to do the gen matching (https://gitlab.cern.ch/cms-analysis/general/CMSJMECalculators/-/blob/main/src/JetMETVariationsCalculatorBase.cc?ref_type=heads#L153) and if it doesn't find it sets the genpt to -1 (https://gitlab.cern.ch/cms-analysis/general/CMSJMECalculators/-/blob/main/src/JetMETVariationsCalculatorBase.cc?ref_type=heads#L127)
Hi all (and @NTrevisani),
This PR includes the latest calibrations and software needed to process Run-III campaigns. It includes some of the changes already targeted in #70. In general, changes included:
/cvmfs/latest version of scale factors in most of the post-processing modulesSome important changes that affect users:
There is a change about bin ranges introduced in correctionlib in the latest versions that is not backward compatible. Therefore, Summer24 corrections need the latest version of correctionlib and cannot be run with previous versions. In contrast, to run 2022 and 2023 there are corrections still using the old convention, and the code crashes when the latest version is used. As a quick solution, the installation is duplicated in two files:
install.sh(2024 and above)install_v12.sh(2022 and 2023)sourcing two different LCG software. The mentioned scripts create environments with different names, so one can have two installations and source
start.shif running over 2024 andstart_v12.shif running over 2022 or 2023 datasets.Please feel free to review the code and proposed changes.
Thank you,
Sergio Blanco