Re: [BackupPC-users] An idea to fix both SIGPIPE and memory issues with rsync

From: "Jeffrey J. Kosowsky" <backuppc AT kosowsky DOT org>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Mon, 14 Dec 2009 14:17:01 -0500
Robin Lee Powell wrote at about 10:12:28 -0800 on Monday, December 14, 2009:
 > On Mon, Dec 14, 2009 at 07:57:10AM -0600, Les Mikesell wrote:
 > > Robin Lee Powell wrote:
 > > > I've only looked at the code briefly, but I believe this
 > > > *should* be possible.  I don't know if I'll be implementing it,
 > > > at least not right away, but it shouldn't actually be that hard,
 > > > so I wanted to throw it out so someone else could run with it if
 > > > ey wants.
 > > > 
 > > > It's an idea I had about rsync resumption:
 > > > 
 > > > Keep an array of all the things you haven't backed up yet,
 > > > starting with the initial arguments; let's say we're transferring
 > > > "/a" and "/b" from the remote machine.
 > > > 
 > > > Start by putting "a/" and "b/" in the array.  Then get the
 > > > directory listing for a/, and replace "a/" in the array with
 > > > "a/d", "a/e", ... for all files and directories in there.  When
 > > > each file is transferred, it gets removed.  Directories are
 > > > replaced with their contents.
 > > > 
 > > > If the transfer breaks, you can resume with that list of
 > > > things-what-still-need-transferring/recursing-through without
 > > > having to walk the parts of the tree you've already walked.
 > > 
 > > Directories aren't static things.  If you don't complete a run,
 > > you would still need to re-walk the whole tree comparing for
 > > changes.
 > 
To be fair, unless you are using filesystem snapshots, the directories
aren't static during an uninterrupted rsync either...

 > Why?  The point here would be to explicitly declare "I don't care
 > about directories that changed since I passed them on this
 > particular backup run; they'll get caught on the next backup run".
 > 

Again, I think this goes against the grain for many users, whose
number-one priority is typically reliability and consistency of
backups rather than speed. If anything, people are moving toward
filesystem snapshots precisely to ensure that consistency.
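For concreteness, the worklist traversal Robin describes above could be
sketched roughly like this (a Python illustration of the idea, not
BackupPC code; the persistence step that would actually enable
resumption is only indicated by a comment):

```python
import os
from collections import deque

def walk_resumable(worklist):
    """Yield files from a worklist of paths.

    Directories are replaced by their contents; files are yielded
    (i.e. "transferred") and dropped. Whatever remains in the worklist
    at any moment is exactly what a resumed run would still need to
    visit, with no re-walk of finished subtrees.
    """
    pending = deque(worklist)          # start with the initial arguments
    while pending:
        path = pending.popleft()
        if os.path.isdir(path):
            # Replace the directory entry with its contents.
            for name in sorted(os.listdir(path)):
                pending.append(os.path.join(path, name))
        else:
            yield path
        # Persisting list(pending) to disk here would let an
        # interrupted run pick up where it left off.
```

Note that a directory which changes after it has been expanded is
simply missed until the next run, which is exactly the consistency
trade-off being debated here.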


 > > You can, however, explicitly break the runs at top-level directory
 > > boundaries and mount points if you have a problem with the size.
 > 
 > That doesn't always work; it certainly doesn't work in my case.
 > Millions of files scattered unevenly around a single file system; I
 > don't even know where the concentrations are because it takes so
 > long to run du/find on this filesystem, and it degrades performance
 > in a way that makes the client upset.
 > 

I wonder how common your use case is where the files are scattered so
unevenly, so unpredictably, and in such a dynamically changing manner
that you can't make a dent in the complexity by subdividing the share
into smaller pieces. If the system is so dynamic and unpredictable,
then perhaps the more robust solution is to see whether the data
storage can be organized better...

_______________________________________________
BackupPC-users mailing list
BackupPC-users AT lists.sourceforge DOT net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/