Re: [BackupPC-users] Problems with hardlink-based backups...

2009-09-01 01:21:43
From: "Jeffrey J. Kosowsky" <backuppc AT kosowsky DOT org>
To: mstowe AT chicago.us.mensa DOT org, "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Tue, 01 Sep 2009 01:18:28 -0400
Michael Stowe wrote at about 23:15:15 -0500 on Monday, August 31, 2009:
 > 
 > > I don't see the issue here.
 > > - New files are created only when a new file is added to the
 > >   pool. Since this happens coincident with the need for a new database
 > >   entry, these two operations can be synchronized
 > 
 > Unless there's a database problem.  Or the executable crashes.  Or a
 > programming bug.  Or a database bug.  Or the database runs out of space. 
 > Or the database crashes.  Or somebody runs any number of processes on the
 > database that can interfere with inserts.  (I can be more specific with
 > any given database.)

And such issues can and do occur with the current situation. Witness
the instances of people with corrupted attrib files or unlinked pc
tree files.  The point is that the current situation does not have an
"atomic" connection between the pool files and the attrib files. Nor
does it have an "atomic" connection even between the pool files and
the pc tree hard links. Again, I have had to write specific programs
to identify and try to correct such errors due to crashes or
filesystem issues.
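
A check of this kind can be sketched as below. This is only an
illustration, not the programs mentioned above: the function name
find_unlinked is hypothetical, and it assumes that every non-empty
regular file under the pc tree is supposed to be a hard link to a pool
file, so a link count of 1 marks a file that lost (or never got) its
pool link. Files that are legitimately unpooled, such as logs, would
need to be excluded in practice.

```python
import os
import stat


def find_unlinked(pc_root):
    """Yield non-empty regular files under pc_root whose link count is 1.

    Assumption: a properly pooled file has at least two links (one in the
    pool, one in the pc tree), so nlink == 1 means there is no pool link.
    """
    for dirpath, _dirnames, filenames in os.walk(pc_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)  # lstat: do not follow symlinks
            if not stat.S_ISREG(st.st_mode):
                continue
            if st.st_size > 0 and st.st_nlink == 1:
                yield path
```

Calling list(find_unlinked("/path/to/pc")) returns the suspect paths;
what to do with them (relink, re-pool, or delete) is a separate
decision.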

 > 
 > > - Files are deleted or moved (i.e. renamed) only as part of
 > >   BackupPC_Nightly. Since this happens only once a day the database
 > >   can be locked appropriately when this process is running to make
 > >   sure that no files are deleted or renamed without being checked
 > >   against or synchronized with the database
 > 
 > And this works, until the database is out of sync, and now work needs to
 > be done to recover database orphans before files can be deleted...

The same thing happens when, for example, BackupPC_link fails to
create a hard link between the pool and the pc tree.
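
For the hybrid case, recovering from such a desynchronization amounts
to comparing the two sides. The sketch below is a hypothetical
illustration: it assumes an SQLite table named pool with a digest
column and a flat pool directory whose file names are the digests.
None of these names come from BackupPC, and a real pool layout would
need a recursive walk.

```python
import os
import sqlite3


def reconcile(conn, pool_dir):
    """Return (missing_files, orphan_files) for a hypothetical pool index.

    missing_files: digests recorded in the database with no file on disk.
    orphan_files:  files on disk with no database row.
    """
    in_db = {row[0] for row in conn.execute("SELECT digest FROM pool")}
    on_disk = {
        name
        for name in os.listdir(pool_dir)
        if os.path.isfile(os.path.join(pool_dir, name))
    }
    return sorted(in_db - on_disk), sorted(on_disk - in_db)
```

Both lists being empty means the two sides agree; anything else has to
be repaired before files are deleted or renamed.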

 > > - Potential race conditions might exist if multiple copies of
 > >   BackupPC_dump are running but these would be caught first at the
 > >   database level where collisions can be prevented within the database
 > >   itself assuming that file creation is made to follow database entry.
 > 
 > This is kind of in left field, a database really isn't necessary to
 > prevent what I guess you're calling "race conditions," and since separate
 > processes can insert simultaneously in virtually every database, this
 > doesn't actually solve the problem without explicit locking.

Which is why you would need to be careful about locking. Similarly,
various versions of BackupPC have had to make sure that certain
processes don't run concurrently, by using careful sequencing or
locking.
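
The idea of catching collisions at the database level, with file
creation following the database entry, can be sketched as below. This
is a minimal illustration under stated assumptions, not a proposed
BackupPC design: the table layout, the SHA-256 digest, the flat pool
directory and the function names are all hypothetical. The uniqueness
constraint on digest is what rejects the second of two concurrent
inserts.

```python
import hashlib
import os
import sqlite3


def open_index(db_path):
    """Open (or create) a hypothetical pool index with a unique digest key."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pool ("
        "digest TEXT PRIMARY KEY, path TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def add_to_pool(conn, pool_dir, data):
    """Insert the row first, then create the file; return (path, created)."""
    digest = hashlib.sha256(data).hexdigest()
    path = os.path.join(pool_dir, digest)
    try:
        with conn:  # commits on success, rolls back if anything below raises
            conn.execute(
                "INSERT INTO pool (digest, path) VALUES (?, ?)", (digest, path)
            )
            # Write to a temporary name and rename, so a partial write is
            # never visible under the final name.
            tmp = path + ".tmp"
            with open(tmp, "wb") as fh:
                fh.write(data)
            os.rename(tmp, path)
    except sqlite3.IntegrityError:
        # Another writer already claimed this digest; nothing was created.
        return path, False
    return path, True
```

If the file write fails, the row is rolled back with it. A crash
between the rename and the commit would still leave a file without a
row, which is the kind of case a periodic consistency check has to
catch.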

 > > - Otherwise, I don't see how BackupPC could possibly change files out
 > >   of synch with the database except for external events such as
 > >   crashes, disk errors, or malicious intervention - but all of these
 > >   apply also to the existing BackupPC implementation.
 > 
 > There's a yawning gap between the ability to envision what can go wrong
 > and what can actually go wrong.  Database/file hybrid systems aren't
 > exactly untrodden ground, and the phenomenon of potential desynchronization
 > requires substantial code and effort to overcome, and even then it's
 > problematic if the data isn't wholly redundant between the two systems.

Exactly. They are not untrodden ground and the problems are
solvable. In fact, all the specific issues you mentioned have actually
occurred with the existing BackupPC implementation.

I am willing to admit that there may be more issues to worry about in
the hybrid database/file system but to think that such issues are
exclusive to such systems is just plain wrong. And until one analyzes
the different cases and finds some that are truly intractable, there
is no reason to throw out this approach wholesale.

But I get the point - you are happy with the current implementation
and see no need for other approaches... Others have different needs
and priorities. Hopefully, you can accept that rather than trying to
"prove" that your way is the only possible way.
