Releases: tlk00/BitMagic
BitMagic release v6.0.0
Release Notes: BitMagic 6.0.0
- This is a major version release; it changes the serialization format for
bit-vectors. The new serialization format uses a more compact
variable-byte encoding for internal technical information.
The new release is backward compatible: old BLOBs do not need to be
re-encoded. Old programs need to be recompiled to read newly
serialized BLOBs. Compression rate improvements should be within 1%,
but for large collections of bit-vectors and sparse vectors this makes
a positive difference.
- The new serialization format supports bookmarks for efficient
range deserialization.
Bookmarks increase the serialized size, so they are not enabled by default.
If you are not using range deserialization, you do not need bookmarks.
serializer::set_bookmarks(bool enable, unsigned bm_interval);
If you use range deserialization (and want it to work faster), you need to set up
bookmarking so that the deserializer can rewind through the BLOB when a range
deserialization request is processed.
The bm_interval parameter defines how often bookmarks are added between
blocks of encoding. The block size is 64K elements, so bm_interval = 64
means the serializer injects a skip reference every 64 blocks.
Increasing this number keeps the BLOB smaller at the expense of rewind
precision. This affects performance and size, not the result of deserialization.
Range deserialization works even without bookmarks.
- Bookmarks and range deserialization work for all bit-transposed sparse vectors.
Code snippet to illustrate setting bookmarks:
// serialize sparse vector
bm::sparse_vector_serial_layout<svector_u32> sv_lay;
{
    BM_DECLARE_TEMP_BLOCK(tb)
    sv1.optimize(tb);
    bm::sparse_vector_serializer<svector_u32> sv_serializer;
    sv_serializer.set_bookmarks(true, 64); // add a bookmark every 64 blocks
    sv_serializer.serialize(sv1, sv_lay);
}
const unsigned char* buf = sv_lay.buf();
// deserialize only the requested range of elements into sv2
bm::sparse_vector_deserializer<svector_u32> sv_deserial;
sv_deserial.deserialize_range(sv2, buf, 1, 4);
https://github.com/tlk00/BitMagic/blob/bm-6.0.0/samples/svsample08/svsample08.cpp
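The size versus rewind precision trade-off of bm_interval can be illustrated with a little arithmetic. The sketch below is a simplified model, not BitMagic code: it assumes bookmarks sit at block numbers that are multiples of bm_interval and that the deserializer jumps to the nearest bookmark at or before the target block, then scans forward. The names kBlockSize, RewindCost and rewind_cost are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Block size stated in the notes: 64K elements per block.
const std::uint64_t kBlockSize = 65536;

// Hypothetical helper type: where a range request starts and how far
// the deserializer would still have to scan after using a bookmark.
struct RewindCost
{
    std::uint64_t target_block;   // block that holds the first requested element
    std::uint64_t bookmark_block; // nearest bookmarked block at or before it (assumed model)
    std::uint64_t blocks_to_scan; // blocks walked sequentially after the jump
};

// Simplified model (an assumption, not the library's actual logic):
// bookmarks are placed at every bm_interval-th block.
RewindCost rewind_cost(std::uint64_t from_idx, unsigned bm_interval)
{
    RewindCost c;
    c.target_block   = from_idx / kBlockSize;
    c.bookmark_block = (c.target_block / bm_interval) * bm_interval;
    c.blocks_to_scan = c.target_block - c.bookmark_block;
    return c;
}
```

Under this model, a smaller bm_interval means fewer blocks to walk after the jump, at the cost of more skip references stored in the BLOB.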
- Build environment fixes for CMake; pthreads enabled.
Credits to Joshua Marshall (jrmarsha at mtu.edu).
- Bit-transposed sparse vector: added copy_range() method for efficient range slicing.
str_sparse_vector<>::copy_range(...)
- Bit-transposed rank-select compressed succinct vector: added copy_range() method
for range slicing. rsc_sparse_vector<>::copy_range(...)
- Fixed serialization performance regression: unnecessary initialization
of temporary buffers (introduced in v.5.0.0).
- Fixed minor corner case crash in copying of empty bit-vectors.
Technical notes:
http://bitmagic.io/bm-6.0.0.html
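The 6.0.0 notes mention a more compact variable byte encoding for internal technical information. The exact byte layout BitMagic uses is not described here, so the following is only a generic sketch of the technique (7 payload bits per byte, high bit as a continuation flag). The function names varbyte_encode and varbyte_decode are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Append `value` to `out`, 7 bits at a time, least significant group first.
// A set high bit means "more bytes follow".
void varbyte_encode(std::uint64_t value, std::vector<unsigned char>& out)
{
    while (value >= 0x80)
    {
        out.push_back(static_cast<unsigned char>((value & 0x7F) | 0x80));
        value >>= 7;
    }
    out.push_back(static_cast<unsigned char>(value));
}

// Read one value starting at `pos` and advance `pos` past it.
// Assumes the input is well-formed (no bounds checking in this sketch).
std::uint64_t varbyte_decode(const std::vector<unsigned char>& in, std::size_t& pos)
{
    std::uint64_t value = 0;
    unsigned shift = 0;
    for (;;)
    {
        const unsigned char b = in[pos++];
        value |= static_cast<std::uint64_t>(b & 0x7F) << shift;
        if ((b & 0x80) == 0)
            break;
        shift += 7;
    }
    return value;
}
```

Small values, which dominate typical block headers and counters, take a single byte instead of a fixed 4 or 8, which is where the modest size gain comes from.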
BitMagic release v5.3.0
Release Notes: BitMagic 5.3.0
- New method to find the first mismatch between two bit-vectors: bm::bvector<>::find_first_mismatch(..)
Previously it was possible to use the XOR operation to identify all mismatches; the new method works faster for cases
when only the first mismatch is important. The new method is optimized for SSE4.2 and AVX2.
If no mismatch is found, it returns false, which indicates that the two vectors are identical.
- New method to check bit-vectors for equality: bool bm::bvector<>::equal(const bvector& bvect) const
- bm::operation_deserializer<> class: simplified deserialization method signatures so they no longer require passing an
external scratch memory (temp block). The Algebra of Sets sample was simplified to reflect this change.
https://github.com/tlk00/BitMagic/tree/master/samples/bvsetalgebra
- Fixed compilation defects in C language mappings.
Application notes:
http://bitmagic.io/bm-5.3.0.html
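To show why a dedicated first-mismatch search can beat a full XOR, here is a minimal scalar sketch over plain 64-bit words. It is not the library's SSE4.2/AVX2 implementation and does not use bm::bvector<>; the function name first_mismatch_words and the equal-length assumption are illustrative only.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scan two equally sized word arrays and stop at the first differing bit.
// Returns false when the arrays are identical, mirroring the return
// convention described in the notes.
bool first_mismatch_words(const std::vector<std::uint64_t>& a,
                          const std::vector<std::uint64_t>& b,
                          std::size_t& pos)
{
    assert(a.size() == b.size()); // simplification for this sketch
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        std::uint64_t diff = a[i] ^ b[i];
        if (diff == 0)
            continue;
        // Locate the lowest set bit of the XOR (portable loop, no intrinsics).
        std::size_t bit = 0;
        while ((diff & 1u) == 0)
        {
            diff >>= 1;
            ++bit;
        }
        pos = i * 64 + bit;
        return true; // early exit: the remaining words are never touched
    }
    return false;
}
```

The early return is the whole point: a full XOR materializes every difference, while this loop stops as soon as one is seen.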
BitMagic release v5.2.0
Implements selective (gather or range) deserialization for all types of bit-transposed sparse vectors.
The new deserialization allows selective extraction out of compressed BLOBs in one operation, which is faster, simpler, and allocates less memory. The new API is based on the logical AND operation on bit-vectors.
Notes:
http://bitmagic.io/bm-5.2.0.html
BitMagic release v5.1.1
- Fixed corner case crash on OR-ing multiple empty bit-vectors using overloaded operator syntax:
bv_t = bv0 | bv1 | bv2;
- Sign of life with WebAssembly.
The BitMagic library traces its legacy to 32-bit systems and still maintains compatibility there.
This helped it run in the browser environment. v.5.1.1 includes some tweaks to better enable platform features like built-in popcount, leading zero count, etc.
There is still work to do to better understand how to handle exceptions and error conditions, but it is already clear
that the library passes the core tests in all major browsers supporting WebAssembly.
- Important fixes to clean up warnings on case fall-through and shadowed parameters; better support for some C++17 features like [[fallthrough]].
- rsc_sparse_vector<> container (rank-select compressed vector for scalar ints): implemented a proper back-insert iterator.
- Succinct containers: back-insert iterators now do on-the-fly memory optimization. The new feature reduces the main memory footprint of ETL processes in environments with memory pressure.
BitMagic release v5.0.0
- Fixed crash related to aggressive -O3 optimizations on GCC.
- Implemented new algorithm: lower_bound search for an integer in a
bit-transposed container: bm::sparse_vector_scanner<>::lower_bound()
Documented as an API sample, svsample07.cpp
https://github.com/tlk00/BitMagic/blob/master/samples/svsample07/svsample07.cpp
- New compressed serialization using Binary Interpolated Encoding.
Tested on the Gov2 collection, it gives approximately 25% improvement in disk footprint compared to the
previous version, which used the Delta Elias Gamma encoder.
The new serialization is backward compatible; BitMagic will read old BLOBs.
The new serialization default is level 5. If you would like to keep using Elias Gamma, use level 4.
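For readers unfamiliar with the technique named above (often called binary interpolative coding), the following is a textbook-style sketch, not BitMagic's encoder. It assumes a strictly increasing list of integers within a known range [lo, hi] and uses a plain fixed-width binary code for each offset; the names bits_for, bic_encode and bic_decode are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bits needed to write any offset in 0..span (0 bits when span == 0).
unsigned bits_for(std::uint32_t span)
{
    unsigned b = 0;
    while (span) { ++b; span >>= 1; }
    return b;
}

// Encode v[first, first+count): write the middle element relative to the
// narrowest range its neighbours allow, then recurse into both halves.
void bic_encode(const std::vector<std::uint32_t>& v, std::size_t first,
                std::size_t count, std::uint32_t lo, std::uint32_t hi,
                std::vector<bool>& out)
{
    if (count == 0) return;
    const std::size_t half = count / 2;
    const std::uint32_t mid = v[first + half];
    // `half` smaller values precede mid, count-half-1 larger ones follow it.
    const std::uint32_t a = lo + static_cast<std::uint32_t>(half);
    const std::uint32_t b = hi - static_cast<std::uint32_t>(count - half - 1);
    const std::uint32_t off = mid - a;
    for (unsigned i = bits_for(b - a); i-- > 0;)
        out.push_back(((off >> i) & 1u) != 0);
    bic_encode(v, first, half, lo, mid - 1, out);
    bic_encode(v, first + half + 1, count - half - 1, mid + 1, hi, out);
}

// Mirror of bic_encode: the decoder must know count, lo and hi up front.
void bic_decode(const std::vector<bool>& in, std::size_t& pos,
                std::vector<std::uint32_t>& v, std::size_t first,
                std::size_t count, std::uint32_t lo, std::uint32_t hi)
{
    if (count == 0) return;
    const std::size_t half = count / 2;
    const std::uint32_t a = lo + static_cast<std::uint32_t>(half);
    const std::uint32_t b = hi - static_cast<std::uint32_t>(count - half - 1);
    std::uint32_t off = 0;
    for (unsigned i = bits_for(b - a); i-- > 0;)
        off = (off << 1) | (in[pos++] ? 1u : 0u);
    const std::uint32_t mid = a + off;
    v[first + half] = mid;
    bic_decode(in, pos, v, first, half, lo, mid - 1);
    bic_decode(in, pos, v, first + half + 1, count - half - 1, mid + 1, hi);
}
```

Dense runs cost very few bits because the allowed range for each element shrinks as its neighbours become known; an element squeezed between two adjacent known values costs zero bits.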
BitMagic release v3.4.0
Release Notes: BitMagic 4.0.0
Implemented 64-bit address mode for indexing problems with more than 4 billion elements.
#define BM64ADDR to enable the new mode or use
#include "bm64.h"
Known limitations: it only supports 48-bit (2^48-1) elements, and you cannot use both 32-bit and
64-bit address modes in one compile unit (the implementation is based on the preprocessor).
Added new example to explain how to enable 64-bit mode.
https://github.com/tlk00/BitMagic/blob/master/samples/bvsample01_64/bvsample01_64.cpp
Added bm::bvector<>::erase() method to delete a bit in a bit-vector
Added bm::bvector<>::shift_left()
BitMagic release v5.4.0
Release Notes: BitMagic 5.4.0
- Fixed minor corner case bug in bm::bvector<>::invert().
The bug was in the inversion of a vector of size() == 0.
- Minor performance improvements for
bm::bvector<>::find_first_mismatch(..) - find first mismatch between two bit-vectors.
- New bit-transposed sparse vector algorithm.
template<class SV>
bool sparse_vector_find_first_mismatch(const SV& sv1,
                                       const SV& sv2,
                                       typename SV::size_type& midx);
The algorithm helps find the first mismatch between two bit-transposed vectors
without reverse de-transposition, using operations on bit planes.
- New example for bm::sparse_vector_find_first_mismatch(...)
https://github.com/tlk00/BitMagic/tree/master/samples/svsample09
The use case explains how to implement DNA compression using a bit-transposed sparse vector and
construct a comparison function without decoding.
http://bitmagic.io/dna-compare.html
Technical notes:
http://bitmagic.io/bm-5.4.0.html
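The idea of comparing bit-transposed vectors plane by plane, without rebuilding the original integers, can be sketched with plain word arrays. This is an illustration under simplifying assumptions (uncompressed planes, 32-bit values, both inputs of equal length), not the bm::sparse_vector_find_first_mismatch implementation; the names BitPlane, Planes, transpose and planes_first_mismatch are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using BitPlane = std::vector<std::uint64_t>; // one bit per element index
using Planes   = std::vector<BitPlane>;      // plane k holds bit k of every element

// Build 32 bit planes from a vector of 32-bit values.
Planes transpose(const std::vector<std::uint32_t>& values)
{
    Planes planes(32, BitPlane((values.size() + 63) / 64, 0));
    for (std::size_t i = 0; i < values.size(); ++i)
        for (unsigned k = 0; k < 32; ++k)
            if ((values[i] >> k) & 1u)
                planes[k][i / 64] |= std::uint64_t(1) << (i % 64);
    return planes;
}

// The first mismatching element is the smallest index at which any pair
// of corresponding planes differs; no value is ever reassembled.
bool planes_first_mismatch(const Planes& a, const Planes& b, std::size_t& midx)
{
    assert(a.size() == b.size()); // simplification for this sketch
    bool found = false;
    for (std::size_t k = 0; k < a.size(); ++k)
    {
        assert(a[k].size() == b[k].size());
        for (std::size_t w = 0; w < a[k].size(); ++w)
        {
            std::uint64_t diff = a[k][w] ^ b[k][w];
            if (diff == 0)
                continue;
            std::size_t bit = 0;
            while ((diff & 1u) == 0) { diff >>= 1; ++bit; }
            const std::size_t idx = w * 64 + bit;
            if (!found || idx < midx) { midx = idx; found = true; }
            break; // only the first difference in this plane matters
        }
    }
    return found;
}
```

Each plane is an ordinary bit-vector, so the per-plane step is the same first-mismatch search that works on two bit-vectors; the result is the minimum over all planes.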
BitMagic release v3.20.0
Release Notes: BitMagic 3.20.0
- bm::str_sparse_vector<> - bit-transposed succinct container for strings
now implements const_iterator and back_insert_iterator.
Both iterators implement transparent transpositions of the container
elements and work significantly faster than random element access methods.
- Added new examples to illustrate usage of bm::str_sparse_vector<>
- bm::bvector<> - implemented new, improved algorithms for bit-shifting of
compressed blocks
- Fixed a number of warnings and compile regressions for 32-bit systems
- Fixed logical bug in SSE4.2, AVX2 implementations of fused SHIFT-RIGHT-AND.
The bug affected sparse vector search algorithms.
BitMagic release v3.18.0
Release Notes
- This release was about optimizations of search algorithms in bit-transposed sparse strings
(memory efficient dictionaries), see str_sparse_vector<> in bmstrsparsevec.h
The current release implements re-mapping of dictionary characters based on their presence in the transposed planes.
This trick makes the bit-matrix more succinct and facilitates faster search.
Optimizations resulted in a 2x improvement in both linear dictionary scan (unsorted column)
and binary search (sorted index). Some extra large cases (hundreds of millions of strings)
showed performance parity with STL map with up to a 20x (sic!) memory footprint advantage.
- Updated example for memory efficient dictionaries (it uses the NASA NED extragalactic database)
to provide new performance metrics and explain optimization methods.
See xsample05.cpp
Tech.Note: http://bitmagic.io/star-search.html
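The character re-mapping idea from the 3.18.0 notes can be illustrated with a deliberately simplified sketch: assign small consecutive codes to the characters that actually occur, so fewer bit planes are needed per character. This is an assumption-laden model (one global table, codes assigned in ASCII order); the library's actual scheme may differ, for example by working per character position. The names CharRemap, build_remap and planes_needed are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical remap table: code 0 is reserved (e.g. for "no character").
struct CharRemap
{
    std::array<unsigned char, 256> to_code{}; // original char -> dense code, 0 if unused
    std::vector<unsigned char> to_char;       // dense code -> original char
};

// Give every character that occurs in the dictionary a dense code 1..N.
CharRemap build_remap(const std::vector<std::string>& dict)
{
    std::array<bool, 256> present{};
    for (const std::string& s : dict)
        for (unsigned char c : s)
            present[c] = true;

    CharRemap r;
    r.to_char.push_back(0); // slot for the reserved code 0
    for (unsigned c = 1; c < 256; ++c)
    {
        if (!present[c])
            continue;
        r.to_code[c] = static_cast<unsigned char>(r.to_char.size());
        r.to_char.push_back(static_cast<unsigned char>(c));
    }
    return r;
}

// Number of bit planes needed to store codes up to max_code.
unsigned planes_needed(unsigned max_code)
{
    unsigned b = 0;
    while (max_code) { ++b; max_code >>= 1; }
    return b;
}
```

For a DNA-like alphabet of four letters, the dense codes fit in 3 planes, while the raw ASCII value of 'T' alone needs 7; fewer planes means a smaller bit-matrix and fewer planes to visit during search.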
BitMagic release v3.17.0
- Optimizations of memory compression for SSE4.2 and AVX2.
Faster bvector<>::optimize(), faster serialization for SIMD builds.
- New experimental container for bit-transposed sparse
strings (for memory efficient dictionaries),
see str_sparse_vector<> in bmstrsparsevec.h
- New example/benchmark for memory efficient dictionaries (xsample05.cpp)
Tech.Note: http://bitmagic.io/star-search.html