Le mercredi 02 septembre 2009 à 12:10 +0200, Pieter Wuille a écrit :
> Hello everyone,
>
> while trying to come up with a way to efficiently synchronise a BackupPC
> archive on one server with a remote, encrypted offsite backup, I ran into
> the following problems:
> * As often pointed out on this list, filesystem-level synchronisation is
> extremely CPU- and memory-intensive. It is not impossible, but depending
> on the scale of your backups it may not be practical. In our case of a
> 350GiB pool containing 4 million directories and 20 million inodes, simply
> copying the whole pool locally using cp/rsync/xfsdump/whatever thrashes,
> gets killed by the OOM killer, or at best takes days, longer than I find
> reasonable for a remote synchronisation run.
> * Furthermore, we want our offsite backup to be encrypted, ideally using a
> secret key that is never, at any moment, known at the remote location:
> only encrypted files should be sent there and stored.
> Doing this encryption at the file level, given such a massive number of
> small files, adds very serious overhead.
> * The alternative to file-level synchronisation is (block)device-level
> synchronisation. Many possibilities exist here, including ZFS send/receive
> (if you use ZFS), using snapshots (e.g. LVM), or temporarily stopping
> backups and doing a full copy of the pool to the remote side (if you have
> enough bandwidth), etc. Not everyone is willing to use these, or is
> prepared to convert to such a system.
> * We would like to use rsync for this, since it skips identical parts yet
> guarantees that the whole file ends up byte-for-byte identical to the
> original. Unfortunately, as far as I know, rsync doesn't support syncing
> the data on a block device, only the device node itself. In addition,
> rsync needs to read and process the whole file on the receiver side,
> calculate checksums, send them all to the sender side, wait for the sender
> to reconstruct the data using those checksums, send this reconstruction,
> and apply it on the receiver side. For a single file this takes at least
> the sum of the times needed to read through the whole data on both sides
> (correct me if I'm wrong; I don't know rsync's internals). Data hardly
> moves on-disk in a BackupPC pool, so we would like to disable, or at least
> limit, the range in which rsync searches for matching data.
>
> To overcome this, I wrote a Perl/FUSE filesystem that lets you "mount" a
> block device (or regular file) as a directory containing files
> part0001.img, part0002.img, ..., each representing 1 GiB of data of the
> original device:
>
> https://svn.ulyssis.org/repos/sipa/backuppc-fuse/devfiles.pl
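The part-file mapping devfiles.pl implements can be approximated with dd: part N of the directory is simply the Nth fixed-size slice of the device. A small sketch, using a sparse temp file as a stand-in "device" and a 4 MiB chunk size so it runs quickly (devfiles.pl uses 1 GiB; all paths here are illustrative):

```shell
# Stand-in "device": a 10 MiB file of zeros
dd if=/dev/zero of=/tmp/testdev.img bs=1M count=10 2>/dev/null

chunk=2                       # part files are numbered from 1
bs=$((4 * 1024 * 1024))       # chunk size in bytes (4 MiB for the demo)

# Chunk N covers bytes (N-1)*bs .. N*bs of the device
dd if=/tmp/testdev.img of=/tmp/part0002.img \
   bs=$bs skip=$((chunk - 1)) count=1 2>/dev/null

stat -c %s /tmp/part0002.img  # prints 4194304: chunk 2 spans bytes 4-8 MiB
```

With 1 GiB chunks the arithmetic is the same; devfiles.pl just serves these slices through FUSE instead of materialising them on disk.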
>
> This directory can be rsynced in a normal way with an "ordinary" directory
> on an offsite backup. In case a restore is necessary, doing
> 'ssh remote "cat /backup/part*.img" >/dev/sdXY' (or equivalent) suffices.
> Although devfiles.pl has (limited) write support, rsync'ing to the
> resulting directory is not yet possible; I may try to get this working if
> people have a need for it. It would allow restoration by simply rsync'ing
> in the opposite direction.
> Doing the synchronisation in chunks of 1 GiB prevents rsync from searching
> too far for matches, and splitting the device into multiple files allows
> some parallelism (the sender transmitting data to the receiver while the
> receiver already checksums the next file; this is heavily limited by disk
> I/O, however).
>
> In our case, the BackupPC pool is stored on an XFS filesystem on an LVM
> volume, allowing an xfs_freeze/sync/snapshot/xfs_freeze -u sequence, then
> running devfiles.pl on the snapshot. Instead of freezing, a backuppc
> stop/umount followed by mount/backuppc start is also possible. If no
> snapshot mechanism is available, you would need to suspend backuppc for
> the whole synchronisation.
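The freeze/snapshot window could be scripted roughly as below. This is only a sketch: the mount point, volume group, LV names and snapshot size are assumptions, and the real sequence needs root. The script is written to a file and syntax-checked so the sketch itself can run unprivileged:

```shell
# Sketch of the snapshot window; vg0/pool and the paths are made-up names.
cat > /tmp/snap-sync.sh <<'EOF'
#!/bin/sh
set -e
xfs_freeze -f /var/lib/backuppc   # quiesce the pool filesystem (assumed mount point)
lvcreate --snapshot --name pool-snap --size 5G /dev/vg0/pool
xfs_freeze -u /var/lib/backuppc   # backups can resume immediately
# ... expose /dev/vg0/pool-snap through devfiles.pl and rsync the part files ...
lvremove -f /dev/vg0/pool-snap    # drop the snapshot once synchronised
EOF
sh -n /tmp/snap-sync.sh && echo "syntax OK"
```

The point of the freeze is only to get a crash-consistent snapshot; backups are paused for seconds rather than for the whole 12-15 hour synchronisation.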
> In fact, the BackupPC volume is already encrypted on our backup server
> itself, allowing very cheap encrypted offsite backups (simply not sending
> the keyfile to the remote side is enough).
>
> The result: an offsite copy of our 400GiB pool, containing 350GiB of data
> of which about 2GiB changes daily, is synchronised five times a week in
> 12-15 hours, using almost no bandwidth. The run seems mostly limited by
> slow disk I/O on the receiver side (25MiB/s).
>
> Hope you find this interesting/useful,
Hi.
This seems to be an interesting approach to solving the offsite backup
problem. I'll try to test it when I have some time.
Thanks
>
> --
> Pieter
>
> ------------------------------------------------------------------------------
> Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day
> trial. Simplify your report design, integration and deployment - and focus on
> what you do best, core application coding. Discover what's new with
> Crystal Reports now. http://p.sf.net/sfu/bobj-july
> _______________________________________________
> BackupPC-users mailing list
> BackupPC-users AT lists.sourceforge DOT net
> List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
> Wiki: http://backuppc.wiki.sourceforge.net
> Project: http://backuppc.sourceforge.net/
--
Daniel Berteaud
FIREWALL-SERVICES SARL.
Société de Services en Logiciels Libres
Technopôle Montesquieu
33650 MARTILLAC
Tel : 05 56 64 15 32
Fax : 05 56 64 15 32
Mail: daniel AT firewall-services DOT com
Web : http://www.firewall-services.com