Le mercredi 02 septembre 2009 à 12:10 +0200, Pieter Wuille a écrit :
> Hello everyone,
>
> while trying to come up with a way to efficiently synchronise a BackupPC
> archive on one server with a remote, encrypted offsite backup, I ran into
> the following problems:
> * As often pointed out on this list, filesystem-level synchronisation is
> extremely CPU- and memory-intensive. It is not impossible, but depending
> on the scale of your backups it may not be practical. In our case of a
> 350GiB pool containing 4 million directories and 20 million inodes, simply
> copying the whole pool locally using cp/rsync/xfsdump/whatever thrashes,
> gets killed by the OOM killer, or at best takes days, longer than I find
> reasonable for a remote synchronisation run.
> * Furthermore, we want our offsite backup to be encrypted, ideally using a
> secret key that is never, at any moment, known at the remote location:
> only encrypted files should be sent there and stored.
> Doing this encryption at the file level, given such a massive number of
> small files, adds very serious overhead.
> * The alternative to file-level synchronisation is (block)device-level
> synchronisation. Many possibilities exist here, including ZFS send/receive
> (if you use ZFS), using snapshots (e.g. LVM), or temporarily stopping
> backups and doing a full copy of the pool to the remote side (if you have
> enough bandwidth), etc. Not everyone is willing to use these, or is
> prepared to convert to such a system.
> * We would like to use rsync for this, since it skips identical parts yet
> guarantees that the whole file ends up byte-for-byte identical to the
> original. Unfortunately, as far as I know, rsync doesn't support syncing
> the data on a block device, only the device node itself. In addition,
> rsync needs to read and process the whole file on the receiver side,
> calculate checksums, send them all to the sender side, wait for the sender
> to reconstruct the data using those checksums, send this reconstruction,
> and apply it on the receiver side. For a single file this takes at least
> the sum of the times needed to read through the whole data on both sides
> (correct me if I'm wrong; I don't know rsync's internals). Data hardly
> moves on-disk in a BackupPC pool, so we would like to disable, or at least
> limit, the range in which rsync searches for matching data.
>
> To overcome this, I wrote a Perl/FUSE filesystem that lets you "mount" a
> block device (or regular file) as a directory containing files
> part0001.img, part0002.img, ..., each representing 1 GiB of data of the
> original device:
>
> https://svn.ulyssis.org/repos/sipa/backuppc-fuse/devfiles.pl
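The part-file mapping devfiles.pl implements can be approximated with dd: part N of the directory is simply the Nth fixed-size slice of the device. A small sketch, using a sparse temp file as a stand-in "device" and a 4 MiB chunk size so it runs quickly (devfiles.pl uses 1 GiB; all paths here are illustrative):

```shell
# Stand-in "device": a 10 MiB file of zeros
dd if=/dev/zero of=/tmp/testdev.img bs=1M count=10 2>/dev/null

chunk=2                       # part files are numbered from 1
bs=$((4 * 1024 * 1024))       # chunk size in bytes (4 MiB for the demo)

# Chunk N covers bytes (N-1)*bs .. N*bs of the device
dd if=/tmp/testdev.img of=/tmp/part0002.img \
   bs=$bs skip=$((chunk - 1)) count=1 2>/dev/null

stat -c %s /tmp/part0002.img  # prints 4194304: chunk 2 spans bytes 4-8 MiB
```

With 1 GiB chunks the arithmetic is the same; devfiles.pl just serves these slices through FUSE instead of materialising them on disk.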
>
> This directory can be rsynced in a normal way with an "ordinary" directory
> on an offsite backup. In case a restore is necessary, doing
> 'ssh remote "cat /backup/part*.img" >/dev/sdXY' (or equivalent) suffices.
> Although devfiles.pl has (limited) write support, rsync'ing to the
> resulting directory is not yet possible; I may try to get this working if
> people have a need for it. It would allow restoration by simply rsync'ing
> in the opposite direction.
> Doing the synchronisation in chunks of 1 GiB prevents rsync from searching
> too far for matches, and splitting the device into multiple files allows
> some parallelism (the sender transmitting data to the receiver while the
> receiver already checksums the next file; this is heavily limited by disk
> I/O, however).
>
> In our case, the BackupPC pool is stored on an XFS filesystem on an LVM
> volume, allowing an xfs_freeze/sync/snapshot/xfs_freeze -u sequence, then
> running devfiles.pl on the snapshot. Instead of freezing, a backuppc
> stop/umount followed by mount/backuppc start is also possible. If no
> snapshot mechanism is available, you would need to suspend backuppc for
> the whole synchronisation.
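The freeze/snapshot window could be scripted roughly as below. This is only a sketch: the mount point, volume group, LV names and snapshot size are assumptions, and the real sequence needs root. The script is written to a file and syntax-checked so the sketch itself can run unprivileged:

```shell
# Sketch of the snapshot window; vg0/pool and the paths are made-up names.
cat > /tmp/snap-sync.sh <<'EOF'
#!/bin/sh
set -e
xfs_freeze -f /var/lib/backuppc   # quiesce the pool filesystem (assumed mount point)
lvcreate --snapshot --name pool-snap --size 5G /dev/vg0/pool
xfs_freeze -u /var/lib/backuppc   # backups can resume immediately
# ... expose /dev/vg0/pool-snap through devfiles.pl and rsync the part files ...
lvremove -f /dev/vg0/pool-snap    # drop the snapshot once synchronised
EOF
sh -n /tmp/snap-sync.sh && echo "syntax OK"
```

The point of the freeze is only to get a crash-consistent snapshot; backups are paused for seconds rather than for the whole 12-15 hour synchronisation.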
> In fact, the BackupPC volume is already encrypted on our backup server
> itself, allowing very cheap encrypted offsite backups (simply not sending
> the keyfile to the remote side is enough).
>
> The result: an offsite copy of our 400GiB pool, containing 350GiB of data
> of which about 2GiB changes daily, is synchronised five times a week in
> 12-15 hours, using almost no bandwidth. The run seems mostly limited by
> slow disk I/O on the receiver side (25MiB/s).
>
> Hope you find this interesting/useful,
Hi.
This seems to be an interesting approach to solving the offsite backup
problem. I'll try to test it when I have some time.
Thanks
>
> --
> Pieter
>
> ------------------------------------------------------------------------------
> Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day
> trial. Simplify your report design, integration and deployment - and focus on
> what you do best, core application coding. Discover what's new with
> Crystal Reports now. http://p.sf.net/sfu/bobj-july
> _______________________________________________
> BackupPC-users mailing list
> BackupPC-users AT lists.sourceforge DOT net
> List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
> Wiki: http://backuppc.wiki.sourceforge.net
> Project: http://backuppc.sourceforge.net/
--
Daniel Berteaud
FIREWALL-SERVICES SARL.
Société de Services en Logiciels Libres
Technopôle Montesquieu
33650 MARTILLAC
Tel : 05 56 64 15 32
Fax : 05 56 64 15 32
Mail: daniel AT firewall-services DOT com
Web : http://www.firewall-services.com