Swarm-DISC/SwarmPAL-test-data

SwarmPAL test data

A repository to generate and distribute datasets used in unit tests of SwarmPAL. This repository uses git-lfs to efficiently store NetCDF files generated by SwarmPAL, whilst SwarmPAL uses Pooch to cache files locally for testing.

Each NetCDF file is generated from a corresponding YAML configuration file with the SwarmPAL CLI.

Install SwarmPAL

Install the latest version of SwarmPAL in a virtual environment. The most portable way to do this is by running the following commands in a terminal:

python -m venv venv
. venv/bin/activate
pip install swarmpal

The swarmpal CLI should now be available in the terminal.

Usage

The download.sh script can be used to generate the NetCDF files and update the registry.txt files with filenames and hashes.
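The contents of download.sh are not reproduced here, so the following is only a sketch of the kind of bookkeeping it describes: computing a hash per data file and writing `<filename> <hash>` lines, which is the plain-text layout Pooch registry files use. The helper names `sha256_of` and `registry_lines`, the choice of SHA-256, and the `*.nc4` glob are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large NetCDF files are not loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def registry_lines(data_dir: Path, pattern: str = "*.nc4") -> list[str]:
    """Build one "<filename> <sha256>" line per matching file, sorted by name."""
    return [f"{p.name} {sha256_of(p)}" for p in sorted(data_dir.glob(pattern))]


if __name__ == "__main__":
    # Hypothetical usage: print registry lines for files in the current directory.
    print("\n".join(registry_lines(Path("."))))
```

The same `sha256_of` helper could be applied to registry.txt itself to double-check the hash value the script prints, assuming the script also reports a SHA-256 digest.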

Adding new datasets

Special care has to be taken when adding new datasets and unit tests in SwarmPAL to ensure that unit tests will use their corresponding datasets.

The following has to be done when a new dataset is needed for unit tests in SwarmPAL:

  • Create a new configuration file for the dataset in config/
  • Generate the NetCDF file with the download.sh script by running
    ./download.sh config/<dataset>.yaml
    Keep note of the new hash value of registry.txt printed when the script finishes.
  • Add the new .yaml, .nc4 and updated registry.txt files to this git repository's main branch and push the changes to GitHub.
  • While developing unit tests, it is useful to have Pooch download datasets from the main branch of this repository. To do this, in the SwarmPAL repository, change SWARMPAL_TEST_DATA_VERSION to a PEP 440 local version by appending +dev to the string. For example, if the current value of SWARMPAL_TEST_DATA_VERSION is vA.B.C, change it to vA.B.C+dev.
  • Update the hash of registry.txt in the SwarmPAL repository to the value printed by download.sh.
  • Add unit tests to SwarmPAL using the new dataset.
  • If you are happy with the new tests, create a tag in this repository and push it to GitHub:
    git tag <new_tag>
    git push origin <new_tag>
  • In the SwarmPAL repository, update SWARMPAL_TEST_DATA_VERSION to the value of the new tag. Commit the new unit tests and the change to SWARMPAL_TEST_DATA_VERSION together. This ensures that Pooch always fetches the dataset that corresponds to the new unit tests, even if tests or datasets are updated in the future.
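
The +dev convention in the steps above can be illustrated with a small sketch of how a version string maps to the git ref that data is fetched from. Pooch supports this kind of fallback through the `version_dev` argument of `pooch.create`; how SwarmPAL wires that up is not shown here, so the function below is a standard-library imitation of the idea, and the name `resolve_ref` and the default branch `main` are assumptions.

```python
def resolve_ref(version: str, dev_branch: str = "main") -> str:
    """Map a test-data version string to the git ref to download from.

    A version with a PEP 440 local part (anything after "+") is treated as a
    development version and resolved to the branch; otherwise the version is
    taken to be a tag name and returned unchanged.
    """
    _, plus, local = version.partition("+")
    if plus and local:
        return dev_branch
    return version


if __name__ == "__main__":
    # Tagged release: data comes from the tag itself.
    print(resolve_ref("vA.B.C"))
    # Local development version: data comes from the branch.
    print(resolve_ref("vA.B.C+dev"))
```

Pinning the final version to a tag rather than a branch is what keeps older test suites reproducible: a tag keeps pointing at the same files, while the branch moves as datasets are added.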

About

Hosting datasets for unit tests of SwarmPAL.
