This repository contains common corpora used for lossless compression testing and benchmarking.
Detailed descriptions of the files in each corpus can be found below.

Corpus | URL | Notes |
---|---|---|
Canterbury | https://corpus.canterbury.ac.nz/ | Includes the artificial, calgary, canterbury, large, and miscellaneous corpora. |
Silesia | http://sun.aei.polsl.pl/~sdeor/index.php?page=silesia | |
Snappy | https://github.com/google/snappy | Test data, with some files removed that duplicate those in other corpora. |
Neuro | https://github.com/neurolabusc/zlib-bench | NIfTI format brain images. |

All files are the works of their respective authors. Please see the sources above for any licensing information.
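
As a rough illustration of how these files might be used for benchmarking, the sketch below walks a corpus directory and reports a compression ratio per file using Python's standard `zlib` module. The function names (`compression_ratio`, `benchmark_directory`) and the directory name `canterbury` are assumptions made for this example; they are not part of this repository, and the actual directory layout may differ.

```python
import zlib
from pathlib import Path


def compression_ratio(data: bytes, level: int = 6) -> float:
    """Return original size divided by zlib-compressed size (0.0 for empty input)."""
    if not data:
        return 0.0
    return len(data) / len(zlib.compress(data, level))


def benchmark_directory(corpus_dir: str, level: int = 6) -> dict[str, float]:
    """Map each regular file under corpus_dir to its compression ratio."""
    root = Path(corpus_dir)
    results = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            name = path.relative_to(root).as_posix()
            results[name] = compression_ratio(path.read_bytes(), level)
    return results


if __name__ == "__main__":
    # "canterbury" is a hypothetical directory name; adjust to the local layout.
    for name, ratio in benchmark_directory("canterbury").items():
        print(f"{name}: {ratio:.2f}x")
```

A ratio above 1 means the file shrank. Swapping `zlib.compress` for another codec with the same bytes-in, bytes-out shape would let the same loop compare compressors.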