docs: updated version documentation (#173)
* distinguish between analysis task and implementation versioning
* add details about versioning approach

---------

Co-authored-by: Enrico Guiraud <[email protected]>
ekauffma and eguiraud authored Jul 7, 2023
1 parent c33e6a0 commit ca06789
Showing 2 changed files with 60 additions and 25 deletions.
4 changes: 2 additions & 2 deletions docs/taskbackground.rst
@@ -213,7 +213,7 @@ This section explains how to calculate the various systematic uncertainties we i

6. Statistical Model
---------------------------------------------------------------
-The following description details building a statistical model in the ``HistFactory`` format. This format is documented `here <https://root.cern/doc/master/group__HistFactory.html>`_ (``ROOT``) and `here <https://pyhf.readthedocs.io/en/latest/intro.html#histfactory>`_ (``pyhf``).
+The following description details building a statistical model in the ``HistFactory`` format. This format is documented in the `ROOT documentation <https://root.cern/doc/master/group__HistFactory.html>`_ and the `pyhf documentation <https://pyhf.readthedocs.io/en/latest/intro.html#histfactory>`_.

We want to develop a statistical model, parameterized by some physical parameters :math:`\vec{\alpha}`.
We have one parameter of interest, the :math:`t\bar{t}` cross-section, and a handful of *nuisance parameters*, which are physics parameters that are not of interest in this analysis.
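The shape of that likelihood can be sketched independently of ``HistFactory``. Below is a minimal, illustrative binned Poisson counting model in plain Python; the function name and the yields are hypothetical, and the real analysis builds the full model (with nuisance parameters) via ``pyhf``/``ROOT`` rather than by hand:

```python
import math

def nll(mu, signal, bkg, observed):
    """Negative log-likelihood of a binned counting model where the
    expected yield in bin i is mu * s_i + b_i. Here mu plays the role
    of the parameter of interest (a signal-strength scale on the
    ttbar cross-section); the full model would add nuisance
    parameters from the vector alpha as well."""
    total = 0.0
    for s, b, n in zip(signal, bkg, observed):
        lam = mu * s + b  # expected yield in this bin
        # Poisson log-probability of observing n counts given lam
        total -= n * math.log(lam) - lam - math.lgamma(n + 1)
    return total

# Hypothetical two-bin yields; observed data matches the mu = 1
# expectation, so the NLL is minimized at mu = 1.
signal, bkg = [10.0, 5.0], [50.0, 60.0]
observed = [60.0, 65.0]
```

Fitting then amounts to minimizing ``nll`` over ``mu`` (and, in the real model, over all nuisance parameters simultaneously).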
@@ -313,7 +313,7 @@ It is a future goal to move onto a more sophisticated architecture, as the BDT m

BDT Performance
---------------------------------------------------------------
-The results in this section are specific to the models provided in the `repository /<https://github.com/iris-hep/analysis-grand-challenge/tree/main/analyses/cms-open-data-ttbar/models>`_. The results will differ some with each model-training.
+The results in this section are specific to the models provided in the `repository <https://github.com/iris-hep/analysis-grand-challenge/tree/main/analyses/cms-open-data-ttbar/models>`_. The results will differ somewhat with each model training.

We can first qualitatively compare the top mass reconstruction by the trijet combination method and the BDT method by comparing their distributions to the truth top mass reconstruction distribution.
The events considered here are those in which it is possible to correctly reconstruct all jet-parton assignments in the leading four jets.
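As a rough sketch of a trijet combination baseline like the one mentioned above (the exact selection in the repository may differ — treat this as illustrative): take the three-jet combination with the largest combined transverse momentum and use its invariant mass as the top candidate mass.

```python
import itertools
import math

def trijet_top_mass(jets):
    """Given jets as (pt, eta, phi, mass) tuples, return the invariant
    mass of the three-jet combination whose summed four-momentum has
    the largest transverse momentum -- an illustrative stand-in for
    the trijet reconstruction discussed in the text."""
    def p4(pt, eta, phi, m):
        # Convert (pt, eta, phi, m) to Cartesian four-momentum (E, px, py, pz)
        px = pt * math.cos(phi)
        py = pt * math.sin(phi)
        pz = pt * math.sinh(eta)
        e = math.sqrt(px * px + py * py + pz * pz + m * m)
        return (e, px, py, pz)

    best = None
    for trio in itertools.combinations(jets, 3):
        e, px, py, pz = (sum(c) for c in zip(*(p4(*j) for j in trio)))
        combined_pt = math.hypot(px, py)
        if best is None or combined_pt > best[0]:
            # guard against tiny negative m^2 from floating-point error
            mass = math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))
            best = (combined_pt, mass)
    return best[1]
```

The BDT approach replaces this purely kinematic choice with a learned ranking of jet-parton assignments.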
81 changes: 58 additions & 23 deletions docs/versionsdescription.rst
@@ -1,44 +1,79 @@
.. _versions-description:

-AGC Versions
+AGC Analysis Task Versions
================================

-The below table gives a brief overview of all AGC versions.
+The below table gives a brief overview of the AGC versions. Each version here corresponds to a slightly altered task.

-.. list-table:: AGC Versions
-   :widths: 15 16 18 18 15 18
+.. list-table:: Versions
+   :widths: 12 22 22 22 22
   :header-rows: 1

   * - Version
     - Datasets
-     - Available Pipelines
     - Cuts
-     - Machine Learning
     - Systematics
-   * - 0.1.0
-     - CMS 2015 Open Data (POET)
-     - Pure ``coffea``; ``coffea`` with ``ServiceX`` processors; ``ServiceX`` followed by ``coffea``
-     - Exactly one lepton with :math:`p_T>25` GeV; at least four jets with :math:`p_T>25` GeV; at least one jet with :math:`b`-tag > 0.5
-     - None
-     -
-   * - 0.2.0
+     - Machine Learning
+   * - 0
     - CMS 2015 Open Data (POET)
-     - Pure ``coffea``; ``ServiceX`` followed by ``coffea``
     - Exactly one lepton with :math:`p_T>25` GeV; at least four jets with :math:`p_T>25` GeV; at least one jet with :math:`b`-tag > 0.5
+     - :math:`t\bar{t}` sample variations, ``pt_scale`` variations, ``pt_res`` variations, ``btag`` variations, **W + jets** scale variations, luminosity
     - None
-     -
-   * - 1.0.0
+   * - 1
     - CMS 2015 Open Data (NanoAOD)
-     - Pure ``coffea``; ``ServiceX`` followed by ``coffea``
     - Exactly one lepton with :math:`p_T>25` GeV; at least four jets with :math:`p_T>25` GeV; at least one jet with :math:`b`-tag > 0.5
+     - :math:`t\bar{t}` sample variations, ``pt_scale`` variations, ``pt_res`` variations, ``btag`` variations, **W + jets** scale variations, luminosity
     - None
-     -
-   * - 2.0.0
+   * - 2 (WIP)
     - CMS 2015 Open Data (NanoAOD)
-     -
-     -
+     - Exactly one lepton with :math:`p_T>30` GeV; at least four jets with :math:`p_T>30` GeV; at least one jet with :math:`b`-tag > 0.5 (see :ref:`versions-cuts` for additional cuts)
+     - :math:`t\bar{t}` sample variations, ``pt_scale`` variations, ``pt_res`` variations, ``btag`` variations, **W + jets** scale variations, luminosity
     - BDT to predict jet-parton assignment in :math:`t\bar{t}` events
-     -


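The per-event cuts listed in the table can be sketched as a simple selection function. This is a plain-Python stand-in with hypothetical field names — the actual implementations apply these cuts with columnar ``awkward`` operations:

```python
def passes_v2_selection(event):
    """Version-2 task cuts from the table above: exactly one lepton
    with pT > 30 GeV, at least four jets with pT > 30 GeV, and at
    least one selected jet with b-tag score > 0.5 (the additional
    version-2 cuts are omitted here)."""
    leptons = [pt for pt in event["lepton_pt"] if pt > 30.0]
    jets = [
        (pt, tag)
        for pt, tag in zip(event["jet_pt"], event["jet_btag"])
        if pt > 30.0
    ]
    return (
        len(leptons) == 1
        and len(jets) >= 4
        and any(tag > 0.5 for _, tag in jets)
    )
```

For example, an event with two selected leptons fails the selection regardless of its jets.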
+Reference Implementation Versions
+=================================
+This section is specific to the implementation in the `main repository <https://github.com/iris-hep/analysis-grand-challenge>`_.
+
+The below table gives a brief overview of the different tags of the reference implementation, including descriptions of minor versions and patches which are implementation-specific.
+Note that the major versions (0, 1, and 2) correspond to differences in the analysis task (described above), while minor versions are reserved for individual implementations to assign for small changes and patches.
+Our reference implementation for each major task (0, 1, 2) will always be the latest tag within that series.
+
+.. list-table:: Tags
+   :widths: 8 5 29 29 29
+   :header-rows: 1
+
+   * - Tag
+     - Version
+     - Available Pipelines
+     - Systematics
+     - Dependency Management
+   * - 0.1.0
+     - 0
+     - Pure ``coffea``; ``coffea`` with ``ServiceX`` processor
+     - Systematic variations within ``coffea`` processor are manually calculated using ``awkward`` array logic (jet :math:`p_T` variations are not propagated through signal region observable calculation)
+     - Functions used in ``coffea`` processor are defined in the notebook
+   * - 0.2.0
+     - 0
+     - Pure ``coffea``; create cached files using ``ServiceX`` queries followed by standalone ``coffea`` processing
+     - Systematic variations within ``coffea`` processor are manually calculated using ``awkward`` array logic (jet :math:`p_T` variations are not propagated through signal region observable calculation)
+     - Functions used in ``coffea`` processor are defined in the notebook
+   * - 1.0.0
+     - 1
+     - Pure ``coffea``; create cached files using ``ServiceX`` queries followed by standalone ``coffea`` processing
+     - Systematic variations within ``coffea`` processor are manually calculated using ``awkward`` array logic (jet :math:`p_T` variations are not propagated through signal region observable calculation)
+     - Functions used in ``coffea`` processor are defined in the notebook
+   * - 1.1.0
+     - 1
+     - Pure ``coffea``; create cached files using ``ServiceX`` queries followed by standalone ``coffea`` processing
+     - Systematic variations within ``coffea`` processor are manually calculated using ``awkward`` array logic (jet :math:`p_T` variations corrected)
+     - Functions used in ``coffea`` processor are defined in the notebook
+   * - 2.0.0 (WIP)
+     - 2
+     - Pure ``coffea``; create cached files using ``ServiceX`` queries followed by standalone ``coffea`` processing; optional machine learning component (with additional option to use ``NVIDIA Triton`` inference server)
+     - Systematic variations within ``coffea`` processor handled by ``correctionlib``
+     - Modules are shipped to ``dask`` workers using ``cloudpickle``


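The difference in systematics handling between the 0.x/1.x tags (manual ``awkward`` array logic) and tag 2.0.0 (``correctionlib``) mostly amounts to where the variation factors come from. A minimal hand-rolled variation looks like the sketch below, with a hypothetical 3% scale factor and plain lists standing in for awkward arrays:

```python
def pt_scale_variations(jet_pts, scale=0.03):
    """Return (up, down) varied copies of the jet pT values, mimicking
    a manual pt_scale systematic. In tag 2.0.0 the scale factors would
    be looked up through correctionlib rather than hard-coded."""
    up = [pt * (1.0 + scale) for pt in jet_pts]
    down = [pt * (1.0 - scale) for pt in jet_pts]
    return up, down
```

Each varied copy is then propagated through the event selection and histogramming to produce the systematic templates used in the statistical model.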
Datasets
================================
