Further optimizations #25

I need a place to put down some ideas for further optimizing this crate:

## Decoder

1. [ ] We only need to call reserve once for each block of sequences. We can calculate how many bytes a list of sequences will add to the decode buffer. This might save some re-allocations.
2. [ ] The way the zstd_streaming binary works is not optimal. It should use the `drain_to_writer()` functions directly instead of reading into an intermediate buffer. That is what these functions are for.
3. [x] Read https://fgiesen.wordpress.com/2018/02/19/reading-bits-in-far-too-many-ways-part-1/ and https://fgiesen.wordpress.com/2018/02/20/reading-bits-in-far-too-many-ways-part-2/ again carefully and optimize the bitreaders further
4. [ ] The `ReversedBitreader` can be made considerably faster by making it less general. Simply returning wrong values for requests of more than 56 bits removes the need for error handling on calls to `get_bits` and `get_bits_triple`. Started in #58.
5. [x] `RingBuffer::extend_from_within` does a lot of small memcpy calls. These can be sped up considerably by not caring about precise copying of the values past the end of the range we want to copy. Copying one or more `u128` values at a time (where possible) is much faster.
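
A minimal sketch of item 1, assuming a simplified model: the `Sequence` struct, `decoded_size` and `execute_sequences` are hypothetical names for illustration and not the crate's actual types, the output is a plain `Vec<u8>` rather than the ring buffer, and `offset` is taken to be the already-resolved match distance. Each sequence adds its literal length plus its match length, so the total for a block can be summed up front and reserved once.

```rust
/// Hypothetical stand-in for a decoded sequence; the crate's own type may differ.
#[derive(Clone, Copy)]
pub struct Sequence {
    pub literal_length: u32,
    pub match_length: u32,
    /// Assumed to be the resolved distance back into the output (repeat offsets already handled).
    pub offset: u32,
}

/// Total number of bytes a block of sequences appends to the decode buffer.
pub fn decoded_size(sequences: &[Sequence]) -> usize {
    sequences
        .iter()
        .map(|s| s.literal_length as usize + s.match_length as usize)
        .sum()
}

/// Executes all sequences after a single up-front reserve.
/// Panics on invalid input (offset of zero, offset larger than the output, too few literals).
pub fn execute_sequences(out: &mut Vec<u8>, literals: &[u8], sequences: &[Sequence]) {
    // One reserve for the whole block instead of one per sequence.
    out.reserve(decoded_size(sequences));

    let mut lit_pos = 0;
    for s in sequences {
        // Copy the literals belonging to this sequence.
        let lit_end = lit_pos + s.literal_length as usize;
        out.extend_from_slice(&literals[lit_pos..lit_end]);
        lit_pos = lit_end;

        // Copy the match byte by byte so that overlapping matches
        // (offset < match_length) repeat the pattern correctly.
        let start = out.len() - s.offset as usize;
        for i in 0..s.match_length as usize {
            let byte = out[start + i];
            out.push(byte);
        }
    }
}
```

Literals left over after the last sequence are not handled here; a real decoder would add their length to the reserved size as well.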

## Encoder

The main thing taking time in the encoder is the match-finding algorithm.

1. [ ] Different matcher algorithms
2. [ ] Faster hashing for the hash-table-based matcher algorithm
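
One common option for item 2 is a multiplicative hash over the next four input bytes, as used in several LZ-style compressors. The sketch below illustrates that general technique and is not the crate's current hasher; `hash4`, `PRIME_4_BYTES` and the `hash_bits` parameter are names chosen for this example.

```rust
/// Multiplier commonly used for 4-byte multiplicative hashing in LZ-style compressors.
const PRIME_4_BYTES: u32 = 2_654_435_761;

/// Hashes the first four bytes of `data` into a table index of `hash_bits` bits.
/// Panics if `data` is shorter than four bytes.
pub fn hash4(data: &[u8], hash_bits: u32) -> usize {
    debug_assert!((1..=32).contains(&hash_bits));
    // Read four bytes as one little-endian integer.
    let value = u32::from_le_bytes([data[0], data[1], data[2], data[3]]);
    // One wrapping multiply, then keep the top `hash_bits` bits,
    // which are the best-mixed bits of the product.
    (value.wrapping_mul(PRIME_4_BYTES) >> (32 - hash_bits)) as usize
}
```

The cost per position is one load, one multiply and one shift, and the result can index a table of `1 << hash_bits` entries directly. Whether this beats the existing hasher would need to be benchmarked.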
