gitlab-backup: Set GZIP_RSYNCABLE=yes so borg can dedup
gitlab-backup produces gzipped tarballs that cannot be meaningfully deduplicated by borg. This can be mitigated by passing --rsyncable to gzip.
I verified this by creating two new borg repositories and adding
the two most recent gitlab.archlinux.org archives to both. The only
difference was that the tarballs were re-compressed with
gzip -1 --rsyncable before being added to the second repository.
In the first case, the 215.97 GB backup archive gets compressed and deduplicated down to 176.24 GB. With --rsyncable it gets reduced to just 12.79 GB. These numbers are for /srv/gitlab/data/backups only, but the other non-tarballed files get sufficiently deduped already.
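The comparison can be sketched roughly as follows. This is an illustration, not the exact procedure used for the measurement: the `recompress_rsyncable` helper, the repository paths, and the archive names are hypothetical, and the sketch assumes GNU gzip 1.7 or newer (which provides `--rsyncable`) and borg 1.x command syntax.

```shell
#!/bin/sh
# Hypothetical helper: re-compress an existing gzip file with --rsyncable.
# --rsyncable makes gzip reset its compression state periodically, so a
# local change in the input only perturbs a limited region of the output,
# which gives a chunk-based deduplicator such as borg matching chunks.
# Assumes GNU gzip >= 1.7.
recompress_rsyncable() {
    src=$1
    dst=$2
    gzip -dc "$src" | gzip -1 --rsyncable > "$dst"
}

# Outline of the two-repository comparison (placeholder paths and names;
# commented out because it needs borg and real backup data):
#
#   borg init --encryption=none /tmp/repo-plain
#   borg init --encryption=none /tmp/repo-rsyncable
#
#   # Repository 1: tarballs exactly as gitlab-backup produced them.
#   borg create --stats /tmp/repo-plain::backup-1 /path/to/backup-1/
#   borg create --stats /tmp/repo-plain::backup-2 /path/to/backup-2/
#
#   # Repository 2: the same data, with each tarball passed through
#   # recompress_rsyncable first.
#   borg create --stats /tmp/repo-rsyncable::backup-1 /path/to/recompressed-1/
#   borg create --stats /tmp/repo-rsyncable::backup-2 /path/to/recompressed-2/
```

The "Deduplicated size" column printed by `borg create --stats` is the figure to compare between the two repositories.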
Based on the above, I am hoping to see the borg repository for gitlab shrink over time from the current 3 TB to around 600 GB, which is more manageable.
Activity
requested review from @svenstaro and @jelle
Should help with #360 (closed).
added 1 commit
- 2222767c - gitlab-backup: Set GZIP_RSYNCABLE=yes so borg can dedup
enabled an automatic merge when the pipeline for 2222767c succeeds
mentioned in commit 90a51d33
mentioned in issue #360 (closed)
@svenstaro I think I've found the --rsyncable caveat you were thinking of when we were discussing this on IRC. Compression takes about 35% longer with the flag, and the overall borg-backup.service duration was 6.3 hours during the last run (compared to 5-5.5 hours usually). The smaller archive size might make up for the increased compression time; we'll find out tonight.