Open
Description
#8803 opens the door for a unique opportunity: re-chunking while doing a borg2 transfer
(which will be required anyway for transferring archives from borg1 repos to borg2 repos).
So, if borg2 gets a new chunker before it is released, we could use it there and convert relatively painlessly.
Usually one cannot easily switch to a new chunker within an existing repo:
- new-chunked chunks of identical files do not deduplicate with old-chunked ones
- thus, space usage doubles as long as old-chunked archives are present (== for a very long time in usual pruning scenarios)
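
To illustrate the two points above, here is a minimal sketch (not borg code). It assumes a toy repository modelled as a dict keyed by SHA-256 chunk id, and it uses fixed-size chunking with two different block sizes as a stand-in for an "old" and a "new" chunker. The names `chunk_fixed`, `store` and `repo_size` are hypothetical helpers made up for this example.

```python
import hashlib
import os


def chunk_fixed(data: bytes, size: int) -> list:
    """Split data into fixed-size blocks (stand-in for any chunker)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def store(repo: dict, chunks) -> None:
    """Store chunks keyed by their SHA-256 id; identical chunks are stored once."""
    for chunk in chunks:
        repo.setdefault(hashlib.sha256(chunk).digest(), chunk)


def repo_size(repo: dict) -> int:
    """Total number of bytes held by the toy repository."""
    return sum(len(chunk) for chunk in repo.values())


data = os.urandom(1 << 20)  # 1 MiB of random "file" content

repo = {}
store(repo, chunk_fixed(data, 4096))  # archive made with the "old" chunker
size_old_only = repo_size(repo)

store(repo, chunk_fixed(data, 4096))  # same file, same chunker: fully deduplicated
size_same_again = repo_size(repo)

store(repo, chunk_fixed(data, 6000))  # same file, "new" chunker: different boundaries
size_mixed = repo_size(repo)

print(size_old_only, size_same_again, size_mixed)
```

In this sketch, the second archive made with the same chunker adds nothing, while the archive made with the other chunker stores the whole file a second time, because none of its chunks match an existing chunk id.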
Requirements for new chunker:
- little to no C code, rather Cython, Python. (*)
- better security properties than buzhash, see https://github.com/borgbackup/borg/wiki/CDC-issues-reported-2025
- not too slow, preferably as fast as or faster than buzhash
- more maintainable code (buzhash is too much C)
- could be a separate project (like borghash, borgstore, now borgchunk(er)?)
(*) in the borg codebase. nothing against a well-maintained chunker library with more low-level code that is external.
Existing chunkers in borg:
- fixed (fixed block size, relatively simple, fast, Python/Cython, can support sparse files efficiently)
- buzhash (variable block size, CDC, complex, hard to maintain C code, no sparse files support)
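
For readers unfamiliar with the difference between fixed and content-defined chunking (CDC), the following is a rough sketch only. It is neither borg's `fixed` chunker nor buzhash: `chunk_cdc` is a toy gear-style rolling hash written for this example, with a made-up `GEAR` table, no minimum or maximum chunk size, and none of the security properties asked for above.

```python
import hashlib

# Hypothetical gear table: one pseudo-random 32-bit value per possible byte value.
GEAR = [int.from_bytes(hashlib.sha256(bytes([b])).digest()[:4], "big") for b in range(256)]


def chunk_fixed(data: bytes, size: int = 1024) -> list:
    """Fixed-size chunking: cut every `size` bytes, regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def chunk_cdc(data: bytes, bits: int = 10) -> list:
    """Toy CDC: cut after a byte where the top `bits` bits of the rolling hash are zero."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # The 32-bit mask makes bytes older than 32 positions drop out of the hash,
        # so a cut point depends only on the most recent bytes.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        if h >> (32 - bits) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # remaining tail without a cut point
    return chunks


# Deterministic pseudo-random test data (64 KiB) and a copy with one byte prepended.
data = b"".join(hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(2048))
shifted = b"X" + data

shared_fixed = len(set(chunk_fixed(data)) & set(chunk_fixed(shifted)))
shared_cdc = len(set(chunk_cdc(data)) & set(chunk_cdc(shifted)))
print("chunks shared after a 1-byte insert: fixed =", shared_fixed, "cdc =", shared_cdc)
```

With fixed blocks, a single inserted byte shifts every later block boundary, so the chunks no longer match. With content-defined boundaries, the cut points move together with the content and chunking resynchronizes shortly after the insertion.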
Chunker tickets: