Subject: Re: [BackupPC-users] An idea to fix both SIGPIPE and memory issues with rsync
From: Les Mikesell <lesmikesell AT gmail DOT com>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Mon, 14 Dec 2009 20:20:07 -0600
Robin Lee Powell wrote:
> 
> Asking rsync, and ssh, and a pair of firewalls and load balancers
> (it's complicated) to stay perfectly fine for almost a full day is
> really asking a whole hell of a lot.

I don't think that should be true.  There's no reason for a program to quit just 
because it has been running for a day, and there's no particular limit to what 
ssh can transfer.   And TCP can deal with quite a lot of loss and trouble - 
unless your load balancers are NATing to different sources or tossing RSTs when 
they fail over.

> For large data sets like this,
> rsync simply isn't robust enough by itself.  Losing 15 hours' worth
> of (BackupPC's) work because the ssh connection goes down is
> *really* frustrating.

I don't think it is rsync or ssh's problem, although you are correct that rsync 
could be better about handling huge sets of files.  Both should be as reliable 
as the underlying hardware.

> In both cases, the client-side rsync uses more than 300MiB of RAM,
> with --hard-links *removed* from the rsync option list.  Not
> devastating, but not trivial either.
> 
>> So, I think the best way for improvement that would be consistent
>> with BackupPC design would be to store partial file transfers so
>> that they could be resumed on interruption. Also, people have
>> suggested tweaks to the algorithm for storing partial backups. 
> 
> Partial transfers won't help in the slightest: the cost is the time
> it takes to walk the file tree, which is what my idea was designed
> to avoid: re-walking the tree on resumption.
> 
> Having said that, if incrementals could be resumed instead of just
> thrown away, that would at least be marginally less frustrating when
> a minor network glitch loses a 15+ hour transfer.

I've always thought that activating the --ignore-times option should be 
controlled separately instead of hard coded into the full runs.  If you didn't 
activate that, fulls could be almost as fast as incrementals so you could just 
do all fulls (with the server side hit of rebuilding the tree for each run). 
Maybe you could try knocking it out of lib/BackupPC/Xfer/Rsync.pm to see if it 
makes fulls fast enough to run all the time.
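For reference, the speed difference comes from rsync's "quick check": by 
default, a file whose size and modification time match the remote copy is 
skipped outright, while --ignore-times disables that shortcut and forces a 
block-checksum comparison of every file even when it is unchanged.  A minimal 
sketch of that decision (illustrative only - this is not BackupPC or rsync 
code):

```python
from dataclasses import dataclass

@dataclass
class FileStat:
    size: int
    mtime: int  # modification time, seconds since the epoch

def needs_transfer(src: FileStat, dst: FileStat, ignore_times: bool) -> bool:
    """Model rsync's quick check.  Without --ignore-times, a file whose
    size and mtime match the destination copy is skipped entirely.  With
    --ignore-times, every file falls through to the (expensive) block
    checksum comparison, which is why fulls take so much longer."""
    if not ignore_times and src.size == dst.size and src.mtime == dst.mtime:
        return False  # quick check passes: skip this file
    return True       # fall through to the delta/checksum transfer

# An unchanged file: same size and mtime on both sides.
local = FileStat(size=1024, mtime=1_260_000_000)
remote = FileStat(size=1024, mtime=1_260_000_000)

print(needs_transfer(local, remote, ignore_times=False))  # False: skipped
print(needs_transfer(local, remote, ignore_times=True))   # True: checksummed anyway
```

So if the server-side tree is intact, a full without --ignore-times touches 
only files whose size or mtime changed - essentially the same work as an 
incremental, plus the tree rebuild.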

-- 
   Les Mikesell
    lesmikesell AT gmail DOT com


_______________________________________________
BackupPC-users mailing list
BackupPC-users AT lists.sourceforge DOT net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/
