Releases: bacpop/pp-sketchlib
Sketchlib 1.5.1
Important bugfix (see bacpop/PopPUNK#95):
- Inputs to the CPU query API were being sorted by name, so distances were returned in an unexpected order. This worked with `poppunk_sketch`, but not through other software such as PopPUNK unless the input was already in alphabetical order.
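The failure mode described here can be illustrated with a small sketch. This is not pp-sketchlib code: `distances_in_input_order` and `compute_sorted` are hypothetical names, and the example only assumes a routine that sorts its inputs by name internally, so its results have to be mapped back to the order the caller supplied.

```python
def distances_in_input_order(names, compute_sorted):
    """Call a routine that works on name-sorted inputs, then restore input order.

    `compute_sorted` is a hypothetical stand-in for any function that returns
    one result per name, in the same (sorted) order it received them.
    """
    # Indices of the inputs, ordered by name.
    order = sorted(range(len(names)), key=lambda i: names[i])
    sorted_names = [names[i] for i in order]
    sorted_results = compute_sorted(sorted_names)

    # Put each result back at the position the caller originally used.
    restored = [None] * len(names)
    for position, original_index in enumerate(order):
        restored[original_index] = sorted_results[position]
    return restored
```

If the input names are already alphabetical, `order` is the identity permutation, which is why the bug was invisible for sorted input.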
Sketchlib v1.5.0
New features:
- Allow turning off random match computation (#28), which is useful for querying.
- Replace threads with OpenMP (#31). Much more efficient CPU parallelisation of distances.
- Add GPU/CPU hybrid sketching algorithm for read data (#32).
- Add codon phased seeds, `--codon-phased` (#35). This replaces randomly spaced seeds.
Bug fixes:
- Update to work with HighFive v2.2.2
- Check for zero Jaccard distances (#36)
Random match correction
New features:
- Add parallelised sparse and dense matrix operations (#21, #23)
- Use spaced seeds in hash function; store sketching version in HDF5 file (#22)
- Chunk up distance calculation on the GPU, so any size is supported (#24)
- Calculate random match chances via Monte Carlo simulation (#27)
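As a rough illustration of estimating random match chances by Monte Carlo simulation, the sketch below draws pairs of random sequences and averages the Jaccard index of their k-mer sets. It is a toy model, not the pp-sketchlib implementation: the function names, the default base frequencies, and the number of trials are all assumptions, and reverse complements are ignored.

```python
import random


def kmer_set(seq, k):
    """All distinct k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def random_sequence(length, base_freqs, rng):
    """Random DNA sequence with the given A, C, G, T frequencies."""
    return "".join(rng.choices("ACGT", weights=base_freqs, k=length))


def random_match_chance(k, length, base_freqs=(0.25, 0.25, 0.25, 0.25),
                        trials=20, seed=1):
    """Mean Jaccard index between k-mer sets of unrelated random sequences."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        a = kmer_set(random_sequence(length, base_freqs, rng), k)
        b = kmer_set(random_sequence(length, base_freqs, rng), k)
        total += len(a & b) / len(a | b)
    return total / trials
```

Short k-mers give a high chance of matching at random, while longer k-mers give a chance near zero, which is why a correction matters most at small k.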
Added CI testing:
- Add new tests for matrix functions
- Add test for distances compared to reference values
Test to be manually run:
- Compare GPU and CPU distances
K-mer length enhancements
Adds the following features:
- Extract Jaccard distances
- Calculate and save genome length using minimum hash
- Calculate and save base composition
- Adjust comparisons for base composition, and expected random matches
- Spaced seeds for short k-mer lengths
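For readers unfamiliar with how a Jaccard index can be estimated from minimum-hash sketches, here is a minimal sketch using a generic bottom-s MinHash. It is an assumption-laden illustration: pp-sketchlib's own sketching scheme and hash function may differ, and the names `bottom_sketch` and `jaccard_estimate` are hypothetical.

```python
import hashlib


def kmer_hashes(seq, k):
    """64-bit hashes of all distinct k-mers in a sequence."""
    hashes = set()
    for i in range(len(seq) - k + 1):
        digest = hashlib.blake2b(seq[i:i + k].encode(), digest_size=8).digest()
        hashes.add(int.from_bytes(digest, "big"))
    return hashes


def bottom_sketch(seq, k, size):
    """Keep the `size` smallest k-mer hashes as the sketch."""
    return sorted(kmer_hashes(seq, k))[:size]


def jaccard_estimate(sketch_a, sketch_b, size):
    """Estimate the Jaccard index from two bottom-s sketches.

    Take the `size` smallest hashes of the merged sketches and count how
    many of them occur in both inputs.
    """
    merged = sorted(set(sketch_a) | set(sketch_b))[:size]
    shared = set(sketch_a) & set(sketch_b)
    matches = sum(1 for h in merged if h in shared)
    return matches / len(merged)
```

A Jaccard distance is then one minus this estimate; any adjustment for base composition or random matches would be applied on top and is not shown here.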
Bug fixes:
- Error when calculating distances on a GPU (#18)
GPU accelerated distances
New features:
- Distance calculations on CUDA compatible GPUs.
- Use armadillo linear solver rather than dlib for regressions (on CPU).
Added some benchmarks to the readme
Improved speed and memory
New features:
- Implement countmin correctly and pick sensible default parameters for sketching read datasets (#11)
- More reliable and faster code for optimiser used in linear regression
- Add testing and CI
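The count-min idea mentioned above can be sketched as follows: k-mers from reads are counted approximately in a fixed-size table, and only k-mers seen at least a minimum number of times are kept, which filters out most sequencing errors. The table width, depth, hash function, and the names `CountMin` and `solid_kmers` are illustrative assumptions, not the defaults or API used by pp-sketchlib.

```python
import hashlib


class CountMin:
    """A small count-min sketch: counts may be overestimated, never underestimated."""

    def __init__(self, width=1024, depth=3):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _columns(self, item):
        # One independent hash per row, obtained by salting with the row index.
        for row in range(self.depth):
            digest = hashlib.blake2b(item.encode(), digest_size=8,
                                     salt=bytes([row])).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def add(self, item):
        for row, col in self._columns(item):
            self.table[row][col] += 1

    def count(self, item):
        # The minimum over rows is the least-inflated estimate.
        return min(self.table[row][col] for row, col in self._columns(item))


def solid_kmers(reads, k, min_count):
    """Return k-mers whose approximate count reaches `min_count`."""
    counter = CountMin()
    kept = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            counter.add(kmer)
            if counter.count(kmer) >= min_count:
                kept.add(kmer)
    return kept
```

Because collisions can only inflate counts, a k-mer that truly occurs `min_count` times is always kept; the cost is that a rare k-mer may occasionally pass the filter.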
Bug fixes:
- Made CMakeLists.txt and Makefile more general (removed specific paths)
Patched release
Small fixes:
- #14: Use the `mkl_rt` library to link, which works better with conda (see https://groups.google.com/forum/#!msg/kaldi-help/m3nyQke0HS0/4fj8gkSWAgAJ, https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/748309, ContinuumIO/anaconda-issues#720)
- #14: As the sequential library cannot be specified, export this as an environment variable in the `api.cpp` functions which handle their own threading.
- Correct version of gcc used to compile
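When linking against the single dynamic library `mkl_rt`, the threading layer is chosen at run time rather than at link time. A minimal sketch of requesting sequential MKL from a calling process is shown below. It assumes the `MKL_THREADING_LAYER` environment variable documented by Intel is the one meant here; the release note does not name the variable, and the helper `request_sequential_mkl` is hypothetical.

```python
import os


def request_sequential_mkl(env=os.environ):
    """Ask MKL (via mkl_rt) to use its sequential layer, unless already set.

    Assumption: MKL_THREADING_LAYER is the relevant variable. It must be set
    before the MKL library is first loaded to have any effect.
    """
    env.setdefault("MKL_THREADING_LAYER", "SEQUENTIAL")
    return env["MKL_THREADING_LAYER"]
```

This leaves the code's own threading in charge of parallelism, avoiding nested threads inside each linear algebra call.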
Initial release
Features tested and working on OS X and Linux. Installation appears to work locally.