BackupPC-users

Re: [BackupPC-users] RsyncP problem

2009-12-07 14:12:18
From: Les Mikesell <lesmikesell AT gmail DOT com>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Mon, 07 Dec 2009 13:08:52 -0600
Harald Amtmann wrote:
> So, for anyone who cares (doesn't seem to be anyone on this list who 
> noticed), I found this post from 2006 stating and analyzing my exact problem:
> 
> http://www.topology.org/linux/backuppc.html
> On this site, search for "Design flaw: Avoidable re-transmission of massive 
> amounts of data."

It's documented behavior, so not a surprise.

>    5. Now I make a second incremental back-up of home and home1. Since I have 
> already backed up these two modules, I expect them both to be very quick. But 
> this does not happen. In fact, all of home1 is sent in full over the LAN, 
> which in my case takes about 10 hours. This is a real nuisance. This problem 
> occurs even if I have this in the config.pl file on server1:
>       $Conf{IncrFill} = 1;

You have the wrong expectations. Do you have a reasonably current 
version, and did you read the section on $Conf{IncrLevels} in 
http://backuppc.sourceforge.net/faq/BackupPC.html?  You can also just do 
full runs instead of incrementals - they take a long time because the 
target has to read every file to verify the block checksums, but they 
use very little bandwidth.
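For example, multi-level incrementals can be enabled in config.pl so that 
each incremental is taken against the most recent backup of a lower 
level instead of always against the last full (a sketch only - the exact 
levels are a tuning choice, not a recommendation from this thread):

```perl
# Hypothetical config.pl fragment: successive incrementals run at
# levels 1..6, so incremental N only transfers what changed since the
# nearest lower-level backup, not since the last full.
$Conf{IncrLevels} = [1, 2, 3, 4, 5, 6];
```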

> The cure for this design flaw is very easy indeed, and it would save me 
> several days of saturated LAN bandwidth when I make back-ups. It's very sad 
> that the authors did not design the software correctly. Here is how the 
> software design flaw can be fixed.
> 
>    1. When an rsync file-system module module1 is to be transmitted from 
> client1 to server1, first transmit the hash (e.g. MD5) of each file from 
> client1 to server1. This can be done (a) on a file by file basis, (b) for all 
> the files in module1 at the same time, or (c) in bundles of say, a few 
> hundred or thousand hashes at a time.

The rsync binary on the target isn't going to do that.

>    2. The BackupPC server server1 matches the received file hashes with the 
> global hash table of all files on server1, both full back-up files and 
> incremental back-up files.

Aside from not matching how rsync actually works, file hashes have 
expected collisions that can only be resolved by a full data comparison. 
And there's no reason to expect that all of the files in the pool were 
collected with an rsync transfer method in the first place.
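The collision point can be illustrated with a small sketch (this is a 
simplified illustration, not BackupPC's actual pooling code): a pool 
keyed by hash must keep a chain of candidates per digest and compare 
full file contents before declaring two files identical.

```python
import hashlib

# Simplified pool: digest -> list of stored file contents.
# Two different files can share a digest, so a hash match alone
# never proves identity; the full bytes must be compared.
pool = {}

def pool_store(data: bytes) -> tuple[str, int]:
    """Store data in the pool; return (digest, index in collision chain)."""
    digest = hashlib.md5(data).hexdigest()
    chain = pool.setdefault(digest, [])
    for i, existing in enumerate(chain):
        if existing == data:      # full comparison resolves collisions
            return digest, i      # already pooled, nothing stored
    chain.append(data)            # genuinely new content under this digest
    return digest, len(chain) - 1
```

Storing the same content twice lands on the same chain entry, while 
different content gets its own slot - which is why a server-side hash 
lookup alone cannot safely skip the transfer.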

-- 
   Les Mikesell
    lesmikesell AT gmail DOT com


_______________________________________________
BackupPC-users mailing list
BackupPC-users AT lists.sourceforge DOT net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/
