Dan Maher edited this page Jul 18, 2014 · 9 revisions

Compliance Checker internals

Portions of the compliance checker are modeled after standard Python unit tests.

Checkers (or CheckSuites) are similar to a unit test class: each one comprises all the checks pertaining to a compliance standard.

Checkers consist of check methods: methods of the class whose names begin with the string check_. Each check method is passed a DSPair object and is expected to return one of:

  • None (essentially a skip)
  • A Result object
  • An iterable (list) of Result objects
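
As a rough illustration of the three return shapes, here is a minimal sketch. It does not use the compliance checker itself: Result, the HIGH/MEDIUM/LOW constants, ExampleCheck, and run_checks are hypothetical stand-ins, and a plain dict takes the place of the DSPair.

```python
from collections import namedtuple

# Hypothetical stand-ins so the sketch runs without the compliance checker installed.
Result = namedtuple("Result", ["weight", "value", "name", "msgs"])
HIGH, MEDIUM, LOW = 3, 2, 1  # assumed numeric priorities


class ExampleCheck:
    """Toy checker: every method named check_* is treated as a check."""

    def check_title(self, ds):
        # A single Result: passes when the attribute is present.
        return Result(HIGH, "title" in ds, "title", [])

    def check_optional_summary(self, ds):
        # A conditional check: return None (a skip) when it does not apply.
        if "summary" not in ds:
            return None
        return Result(LOW, len(ds["summary"]) > 0, "summary", [])

    def check_variables(self, ds):
        # An iterable of Results: one per variable.
        return [
            Result(MEDIUM, "standard_name" in attrs,
                   ("variable", name, "has_standard_name_attr"), [])
            for name, attrs in sorted(ds.get("variables", {}).items())
        ]


def run_checks(checker, ds):
    """Call each check_* method and flatten the three allowed return shapes."""
    results = []
    for attr in sorted(dir(checker)):
        if not attr.startswith("check_"):
            continue
        out = getattr(checker, attr)(ds)
        if out is None:
            continue  # skipped check
        if isinstance(out, Result):
            results.append(out)
        else:
            results.extend(out)
    return results
```

With a dataset that has a title and two variables but no summary, run_checks returns three Results: the summary check is skipped, and the variables check contributes one Result per variable.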

A DSPair is an internal object. Depending on the type of file being inspected, it contains either a NetCDF Dataset object or an XML document loaded via etree, along with a Wicken Dogma object whose attributes inspect that other object.
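
The pairing can be pictured with a small sketch. Everything here is an assumption made for illustration: the field names dataset and dogma, the FakeDogma class, and the use of a dict in place of a real NetCDF Dataset or XML document. It only shows the idea of a second object whose attributes read from the first.

```python
from collections import namedtuple

# Hypothetical shape; the field names are assumptions for this sketch.
DSPair = namedtuple("DSPair", ["dataset", "dogma"])


class FakeDogma:
    """Stand-in for a Wicken Dogma: attributes are looked up in the wrapped object."""

    def __init__(self, dataset):
        self._dataset = dataset

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        if name.startswith("_"):
            raise AttributeError(name)
        return self._dataset.get(name)


raw = {"title": "Test dataset"}  # a dict standing in for the real file object
pair = DSPair(dataset=raw, dogma=FakeDogma(raw))
```

A check method could then read pair.dogma.title for a convenient view, or fall back to pair.dataset when it needs the underlying object.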

A Result object is a container for data indicating success or failure of a check. It contains:

  • A priority (BaseCheck.HIGH, BaseCheck.MEDIUM, BaseCheck.LOW)
  • A value
    • A 2-tuple of (passed, total)
    • True (equivalent to (1, 1))
    • False (equivalent to (0, 1))
  • A computer-readable name for the check performed. This may be a single string, or a tuple of strings that lets the Result be grouped with other Results at each level, e.g. ('variable', 'temperature', 'has_standard_name_attr')
  • An optional list of messages, typically used to indicate why something failed
  • An optional list of child Result objects (advanced usage)
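
The value forms and the tuple names can be made concrete with a short sketch. The helpers normalize_value and group_scores are hypothetical, not part of the compliance checker, and the grouping is simplified to the first element of a tuple name only.

```python
def normalize_value(value):
    """Return a (passed, total) tuple for any of the documented value forms."""
    if value is True:
        return (1, 1)
    if value is False:
        return (0, 1)
    passed, total = value
    return (int(passed), int(total))


def group_scores(results):
    """Sum (passed, total) per top-level name; results are (name, value) pairs."""
    totals = {}
    for name, value in results:
        # A tuple name is grouped under its first element in this sketch.
        key = name[0] if isinstance(name, tuple) else name
        passed, total = normalize_value(value)
        old_passed, old_total = totals.get(key, (0, 0))
        totals[key] = (old_passed + passed, old_total + total)
    return totals
```

For example, two Results named ('variable', 'temperature', ...) and ('variable', 'salinity', ...) would both be counted under 'variable', so one pass and one failure there add up to (1, 2).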

Contributing to the Compliance Checker

Contributing To Existing Checkers

Suppose you notice an issue in the CF checker: your dataset is failing a check and you don't think it should. Open the cf.py file and search for the name of the failing check to find the method producing that Result. You can then adjust the method to correct the issue: fork, commit, and send a PR for discussion.

Implementing Your Own Checker

A new checker can be added to the check-suite by adding the NewCheckerBaseCheck method to the compliance_checker/runner.py file. The format for the checker can be copied from existing checkers.

Best Practices For Check Methods

  • If your check method is optional/conditional, don't return a failure - just return None
  • Use messages for failures, and be explicit - put values, variable names, etc in them.
    • Good: "Variable temperature does not have a standard_name attribute"
    • Bad: "VARIABLE IS WRONG"
  • Don't attach messages to passing checks/Results; they may confuse the end user.
  • There's a lot of flexibility in how you structure your Result objects. If you have a method that checks that a global attribute is present AND has a certain value, you can either return a list of two Result objects (likely using True/False for their values), OR return one Result object with a value of (1, 2) and a message saying something like "Attribute (attrname) present but value (the_value) was not (expected value)".
  • There are a number of helper/utility methods for CF-related checks (to classify variables, etc.); look around cf/cf.py and cf/util.py for something that may suit your needs.
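
The two ways of structuring a "present AND has the right value" check can be sketched side by side. This is an illustration under assumptions, not the checker's own code: Result, MEDIUM, and both function names are hypothetical, and attrs is a plain dict of global attributes.

```python
from collections import namedtuple

# Hypothetical stand-ins for the real Result container and priority constant.
Result = namedtuple("Result", ["weight", "value", "name", "msgs"])
MEDIUM = 2


def check_attr_as_two_results(attrs, attrname, expected):
    """Option 1: one Result for presence, one for the value."""
    present = attrname in attrs
    matches = present and attrs[attrname] == expected
    present_msgs = [] if present else [f"Attribute {attrname} is missing"]
    value_msgs = []
    if not present:
        value_msgs.append(f"Attribute {attrname} is missing, expected {expected!r}")
    elif not matches:
        value_msgs.append(
            f"Attribute {attrname} present but value {attrs[attrname]!r} was not {expected!r}"
        )
    return [
        Result(MEDIUM, present, (attrname, "present"), present_msgs),
        Result(MEDIUM, matches, (attrname, "value"), value_msgs),
    ]


def check_attr_as_one_result(attrs, attrname, expected):
    """Option 2: a single Result scored out of 2."""
    passed, msgs = 0, []
    if attrname not in attrs:
        msgs.append(f"Attribute {attrname} is missing")
    else:
        passed += 1
        if attrs[attrname] == expected:
            passed += 1
        else:
            msgs.append(
                f"Attribute {attrname} present but value {attrs[attrname]!r} was not {expected!r}"
            )
    return Result(MEDIUM, (passed, 2), attrname, msgs)
```

Both versions only add messages on failure and include the attribute name and offending value, in line with the practices listed above.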