You can always run some sort of disk de-duplicator after you copy without -H
On Sun, Dec 13, 2009 at 9:56 PM, Jeffrey J. Kosowsky
<backuppc AT kosowsky DOT org> wrote:
> Robin Lee Powell wrote at about 20:18:55 -0800 on Sunday, December 13, 2009:
> >
> > I've only looked at the code briefly, but I believe this *should* be
> > possible. I don't know if I'll be implementing it, at least not
> > right away, but it shouldn't actually be that hard, so I wanted to
> > throw it out so someone else could run with it if ey wants.
> >
> > It's an idea I had about rsync resumption:
> >
> > Keep an array of all the things you haven't backed up yet, starting
> > with the initial arguments; let's say we're transferring "/a" and
> > "/b" from the remote machine.
> >
> > Start by putting "a/" and "b/" in the array. Then get the directory
> > listing for a/, and replace "a/" in the array with "a/d", "a/e", ...
> > for all files and directories in there. When each file is
> > transferred, it gets removed. Directories are replaced with their
> > contents.
> >
> > If the transfer breaks, you can resume with that list of
> > things-what-still-need-transferring/recursing-through without having
> > to walk the parts of the tree you've already walked.
> >
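[Editorial aside: the queue-based traversal Robin describes can be sketched in a few lines. This is a minimal illustration, not rsync's or BackupPC's actual code; the function names and the JSON state file are invented for the example.]

```python
# Hypothetical sketch of the resumable work-queue traversal described above.
# The persistence format (a JSON list of pending paths) is illustrative only.
import json
import os
from collections import deque

def backup_with_queue(roots, statefile, transfer):
    """Walk `roots`, transferring files via `transfer(path)`.

    The pending queue is checkpointed to `statefile` after every step, so
    an interrupted run can resume without re-walking finished subtrees.
    """
    if os.path.exists(statefile):
        with open(statefile) as f:
            queue = deque(json.load(f))   # resume a broken transfer
    else:
        queue = deque(roots)              # fresh start

    while queue:
        path = queue.popleft()
        if os.path.isdir(path):
            # Replace the directory with its contents, as proposed above.
            queue.extend(os.path.join(path, e) for e in sorted(os.listdir(path)))
        else:
            transfer(path)                # once transferred, it stays off the queue
        with open(statefile, "w") as f:
            json.dump(list(queue), f)     # checkpoint the remaining work

    os.remove(statefile)                  # clean finish: nothing left to resume
```

Note the queue only ever holds the frontier of the walk, not the whole tree, which is where the memory saving would come from.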
> > This should solve the SIGPIPE problem. In fact, it could even deal
> > with incrementals from things like laptops: if you have settings for
> > NumRetries and RetryDelay, you could, say, retry every 60 seconds
> > for a week if you wanted.
> >
> > On top of that, you could use the same retry system to
> > *significantly* limit the memory usage: stop rsyncing every N files
> > (where N is a config value). If you only do, say, 1000 files at a
> > time, the memory usage will be very low indeed.
> >
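[Editorial aside: the NumRetries/RetryDelay idea above amounts to a simple retry wrapper. The names mirror the proposal but are hypothetical, not real BackupPC settings.]

```python
# Illustrative retry loop for the NumRetries / RetryDelay proposal.
import time

def run_with_retries(step, num_retries, retry_delay, sleep=time.sleep):
    """Call `step()` until it succeeds or retries are exhausted.

    Retrying every 60 seconds for a week would be
    num_retries=7*24*60, retry_delay=60.
    """
    for attempt in range(num_retries + 1):
        try:
            return step()
        except OSError:                 # e.g. the SIGPIPE / dropped-connection case
            if attempt == num_retries:
                raise                   # give up after the last allowed retry
            sleep(retry_delay)
```

The "stop every N files" chunking would then just make `step()` transfer at most N queue entries per call.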
>
> Unfortunately, I don't think it is that simple. If it were, then rsync
> would have been written that way back in version .001. I mean there is
> a reason that rsync memory usage increases as the number of files
> increases (even in 3.0) and it is not due to memory leaks or ignorant
> programmers. After all, your proposed fix is not exactly obscure.
>
> At least one reason is the need to keep track of inodes so that hard
> links can be copied properly. In fact, I believe that without the -H
> flag, rsync memory usage scales much better. Obviously if you break up
> backups into smaller chunks or allow resumes without keeping track of
> past inodes then you have no way of tracking hard links across the
> filesystem. Maybe you don't care but if so, you could probably do just
> about as well by dropping the --hard-links argument from RsyncArgs.
>
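[Editorial aside: the inode bookkeeping Jeffrey refers to can be sketched to show why -H needs whole-run state. This is a simplified illustration, not rsync's implementation.]

```python
# Minimal sketch of why -H style hard-link handling resists chunking:
# every multiply-linked inode seen so far must be remembered for the
# rest of the run, so later paths can be linked rather than copied.
import os

def plan_transfer(paths):
    """Yield (path, link_target_or_None) for each path.

    link_target is an earlier path sharing the same inode; the receiver
    would hard-link it instead of copying the data again.
    """
    seen = {}                                  # (st_dev, st_ino) -> first path seen
    for path in paths:
        st = os.lstat(path)
        if st.st_nlink > 1:                    # only multiply-linked files matter
            key = (st.st_dev, st.st_ino)
            if key in seen:
                yield path, seen[key]          # hard link to the earlier copy
                continue
            seen[key] = path                   # must persist across the whole run
        yield path, None                       # ordinary copy
```

Resuming or chunking throws `seen` away between runs, which is exactly the information loss Jeffrey describes.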
> I don't believe there is any easy way to get something for free here...
>
------------------------------------------------------------------------------
Return on Information:
Google Enterprise Search pays you back
Get the facts.
http://p.sf.net/sfu/google-dev2dev
_______________________________________________
BackupPC-users mailing list
BackupPC-users AT lists.sourceforge DOT net
List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki: http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/