Deduplication on .tar files #1606
I am planning to use rustic with Proxmox containers. I back up LXC containers with compression disabled to a ZFS volume with compression enabled, and then use rustic to back up the huge .tar files. Is there any way to improve deduplication of files inside .tar files?
Replies: 3 comments 2 replies
Nothing to do. If your tar files contain the same content, it will be deduplicated automatically. The best you can do is try it. I have no experience with LXC, but for VMs, Proxmox backup files hardly deduplicate at all. PVE uses the VMA format for its backups. In a nutshell, it uses internal "chunks" of 65536 bytes, written out of order, each with its own header, which contains, for example, a header number. This means that the slightest change in a VM most likely generates a completely different file, and such granular changes make deduplication impossible. For me, PBS was the only option. BTW, they support S3 as a destination nowadays.
@enboig is there a reason why you use tar files instead of just backing up the original files? Creating an uncompressed tar before processing the files with rustic does not have much benefit (if any at all). Generally, avoid backing up compressed files. Besides this, it depends on the format of the LXC files, as @kapitainsky correctly mentioned.
When doing an LXC backup, Proxmox generates a .tar file. My question was whether the chunker is smart enough to split a .tar file by its content, or whether there is a way to help it do so (similar to --rsyncable for gzip and rsync). I have already disabled compression for the .tar file, but wanted to know if there was anything else I could do. I would also prefer not to add and configure rustic for each LXC, and let Proxmox do the job instead.
Yes, it is.
Here is an overview of the algorithm:
https://restic.net/blog/2015-09-12/restic-foundation1-cdc/
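To give a feel for why content-defined chunking copes with data that shifts inside a large file such as an uncompressed .tar, here is a toy sketch in Python. It is not rustic's or restic's actual chunker: the simple polynomial rolling hash, the constants (`WINDOW`, `MASK`, `MIN_SIZE`, `MAX_SIZE`) and all function names are assumptions chosen for illustration, and the chunk sizes are far smaller than a real backup tool would use. It compares how many chunks survive a 10-byte insertion at the start of a file with fixed-size blocks versus content-defined cuts.

```python
import hashlib
import random

# Illustrative parameters only (assumptions, not rustic's real settings).
WINDOW = 48                # bytes covered by the rolling hash
MASK = (1 << 11) - 1       # cut when the low 11 bits of the hash are zero
MIN_SIZE = 512             # never cut before this many bytes
MAX_SIZE = 8192            # always cut at this many bytes
BASE = 257
MOD = (1 << 61) - 1
TOP = pow(BASE, WINDOW - 1, MOD)  # weight of the byte leaving the window


def cdc_chunks(data: bytes):
    """Yield chunks whose boundaries depend on the last WINDOW bytes of content."""
    start = 0
    h = 0
    for i, byte in enumerate(data):
        if i - start >= WINDOW:
            # Window is full: drop the oldest byte before adding the new one.
            h = (h - data[i - WINDOW] * TOP) % MOD
        h = (h * BASE + byte) % MOD
        size = i - start + 1
        if (size >= MIN_SIZE and (h & MASK) == 0) or size >= MAX_SIZE:
            yield data[start:i + 1]
            start = i + 1
            h = 0
    if start < len(data):
        yield data[start:]


def fixed_chunks(data: bytes, size: int = 2048):
    """Yield fixed-size blocks, for comparison."""
    for i in range(0, len(data), size):
        yield data[i:i + size]


def shared_fraction(old: bytes, new: bytes, chunker) -> float:
    """Fraction of the chunks of `new` that already exist among the chunks of `old`."""
    seen = {hashlib.sha256(c).digest() for c in chunker(old)}
    new_chunks = list(chunker(new))
    hits = sum(hashlib.sha256(c).digest() in seen for c in new_chunks)
    return hits / len(new_chunks)


# Stand-in for an archive: pseudo-random bytes, then the same bytes shifted by 10.
rng = random.Random(0)
archive = bytes(rng.getrandbits(8) for _ in range(200_000))
shifted = b"0123456789" + archive

print("fixed-size blocks reused:     ", shared_fraction(archive, shifted, fixed_chunks))
print("content-defined chunks reused:", shared_fraction(archive, shifted, cdc_chunks))
```

With fixed-size blocks, every block after the insertion is shifted and no longer matches. With content-defined cuts, the boundaries realign shortly after the inserted bytes, so most of the later chunks are identical to ones already stored. The same reasoning is why a format that scatters small, individually headered pieces in a different order on every run leaves the chunker little to reuse.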