Subject: Re: [BackupPC-users] An idea to fix both SIGPIPE and memory issues with rsync
From: Shawn Perry <redmopml AT comcast DOT net>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Sun, 13 Dec 2009 23:42:33 -0700
You can always run some sort of disk de-duplicator after you copy without -H.
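
To make that concrete, here is a minimal Python sketch of that kind of post-copy de-duplication (hash every file, hard-link byte-identical copies). It is only an illustration, not BackupPC code, and real tools such as hardlink or rdfind handle edge cases this skips:

    # Post-copy de-duplication sketch: hash every file under a tree and
    # replace byte-identical copies with hard links.  Illustration only.
    import hashlib
    import os
    import sys

    def file_digest(path, bufsize=1 << 20):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            while chunk := f.read(bufsize):
                h.update(chunk)
        return h.hexdigest()

    def dedup(root):
        seen = {}  # digest -> first path seen with that content
        for dirpath, _dirs, files in os.walk(root):
            for name in files:
                path = os.path.join(dirpath, name)
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                digest = file_digest(path)
                first = seen.setdefault(digest, path)
                if first != path and not os.path.samefile(first, path):
                    os.unlink(path)       # drop the duplicate copy...
                    os.link(first, path)  # ...and hard-link it to the first one

    if __name__ == "__main__":
        dedup(sys.argv[1])

Running something like this over the copy destination would recover most of the space -H would have saved, at the cost of re-hashing the whole tree.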

On Sun, Dec 13, 2009 at 9:56 PM, Jeffrey J. Kosowsky
<backuppc AT kosowsky DOT org> wrote:
> Robin Lee Powell wrote at about 20:18:55 -0800 on Sunday, December 13, 2009:
>  >
>  > I've only looked at the code briefly, but I believe this *should* be
>  > possible.  I don't know if I'll be implementing it, at least not
>  > right away, but it shouldn't actually be that hard, so I wanted to
>  > throw it out so someone else could run with it if ey wants.
>  >
>  > It's an idea I had about rsync resumption:
>  >
>  > Keep an array of all the things you haven't backed up yet, starting
>  > with the initial arguments; let's say we're transferring "/a" and
>  > "/b" from the remote machine.
>  >
>  > Start by putting "/a" and "/b" in the array.  Then get the directory
>  > listing for /a, and replace "/a" in the array with "/a/d", "/a/e", ...
>  > for all files and directories in there.  When each file is
>  > transferred, it gets removed.  Directories are replaced with their
>  > contents.
>  >
>  > If the transfer breaks, you can resume with that list of
>  > things-what-still-need-transferring/recursing-through without having
>  > to walk the parts of the tree you've already walked.
>  >
>  > This should solve the SIGPIPE problem.  In fact, it could even deal
>  > with incrementals from things like laptops: if you have settings for
>  > NumRetries and RetryDelay, you could, say, retry every 60 seconds
>  > for a week if you wanted.
>  >
>  > On top of that, you could use the same retry system to
>  > *significantly* limit the memory usage: stop rsyncing every N files
>  > (where N is a config value).  If you only do, say, 1000 files at a
>  > time, the memory usage will be very low indeed.
>  >
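
To make the idea above concrete, here is a minimal Python sketch of the work queue. It is not BackupPC or File::RsyncP code; transfer_file() and the queue file name are stand-ins. The point is only that the pending list is persisted, so a broken run resumes without re-walking finished parts, and a run can stop itself after N files:

    # Work-queue sketch: a persistent list of paths still to transfer.
    # Directories are replaced by their contents; finished files are dropped.
    import json
    import os

    QUEUE_FILE = "pending.json"   # survives a dropped connection / SIGPIPE

    def load_queue(initial):
        if os.path.exists(QUEUE_FILE):
            with open(QUEUE_FILE) as f:
                return json.load(f)       # resume where the last run stopped
        return list(initial)              # fresh run: start from the arguments

    def save_queue(queue):
        with open(QUEUE_FILE, "w") as f:
            json.dump(queue, f)

    def run(initial, transfer_file, max_files=1000):
        queue = load_queue(initial)
        done = 0
        while queue and done < max_files:
            item = queue.pop(0)
            if os.path.isdir(item):
                # Replace the directory with its contents instead of holding
                # a full recursive listing in memory.
                queue[0:0] = [os.path.join(item, e) for e in sorted(os.listdir(item))]
            else:
                transfer_file(item)       # hypothetical per-file transfer
                done += 1
            save_queue(queue)             # checkpoint after every step
        return queue                      # empty list means the backup is done

A retry loop around run() with NumRetries/RetryDelay-style settings would then just re-invoke it until the returned queue is empty.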
>
> Unfortunately, I don't think it is that simple. If it were, rsync
> would have been written that way back in version 0.001. There is
> a reason that rsync memory usage increases as the number of files
> increases (even in 3.0), and it is not memory leaks or ignorant
> programmers. After all, your proposed fix is not exactly obscure.
>
> At least one reason is the need to keep track of inodes so that hard
> links can be copied properly. In fact, I believe that without the -H
> flag, rsync memory usage scales much better. Obviously, if you break up
> backups into smaller chunks, or allow resumes without keeping track of
> past inodes, then you have no way of tracking hard links across the
> filesystem. Maybe you don't care, but in that case you could probably do
> just about as well by dropping the --hard-links argument from RsyncArgs.
>
> I don't believe there is any easy way to get something for free here...
>
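
As a rough illustration of Jeffrey's point (not rsync's actual implementation): preserving hard links means remembering every multiply-linked (device, inode) pair seen so far, for the whole run, so that a later path with the same inode can be linked rather than copied again. That map is exactly what a chunked or freshly-restarted run throws away:

    # Why --hard-links costs memory that grows with the number of linked files.
    import os

    def plan_transfers(paths):
        seen = {}   # (st_dev, st_ino) -> destination of the first copy seen
        plan = []   # ("copy", path) or ("link", path, original_path)
        for path in paths:
            st = os.lstat(path)
            key = (st.st_dev, st.st_ino)
            if st.st_nlink > 1 and key in seen:
                plan.append(("link", path, seen[key]))  # preserve the hard link
            else:
                if st.st_nlink > 1:
                    seen[key] = path    # must be remembered until the run ends
                plan.append(("copy", path))
        return plan

    # Restart "fresh" every N files and `seen` starts empty each time, so a
    # hard link whose names fall in different chunks is stored as two copies.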

_______________________________________________
BackupPC-users mailing list
BackupPC-users AT lists.sourceforge DOT net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/