Re: [BackupPC-users] Multiple backuppc server

2009-07-08 20:51:15
From: Holger Parplies <wbppc AT parplies DOT de>
To: Les Mikesell <lesmikesell AT gmail DOT com>
Date: Thu, 9 Jul 2009 02:25:07 +0200

Hi,

Les Mikesell wrote on 2009-07-08 10:32:50 -0500 [Re: [BackupPC-users] Multiple 
backuppc server]:
> [...]

Les, you are missing the important part, so I'll begin with it and repeat it a
few times throughout the mail:

> > Tino Schwarze wrote on 2009-07-08 10:11:43 +0200 [this thread]:
> > > BackupPC is not designed to support multiple instances accessing the same
> > > storage.

> Holger Parplies wrote:
> >Tino wrote:
> >>Andy Brown wrote:
> >>> We've started to setup a large multiple server backuppc environment
> >>> [...]
> >>> Can anyone see any pitfalls with this?
> >> You will run into lots of troubles [...]
> > 
> > actually, I'm not sure you will. I'd expect subtle corruption which you
> > won't notice until it's too late. [...]
> > You might even be lucky and simply get away with it. Race conditions are
> > things waiting to happen, although they may turn out not to.
> 
> I thought someone had reported doing this successfully over NFS - using 
> a high capacity commercial NAS.

If you know there are race conditions, how much faith do you put in a report
saying it has been done "successfully" (presuming you remember correctly)?
Granted, it *might* mean that status.pl confusion will not happen. They
*might* even have figured out that the race conditions do not exist or how to
avoid them, but I wouldn't believe it without examining the reasoning behind
that, because

> > > BackupPC is not designed to support multiple instances accessing the same
> > > storage.

so it doesn't take any expensive measures to avoid race conditions resulting
from doing so anyway.

> > there's no sane way to prevent more than one instance of BackupPC_link
> > from running.
> 
> That shouldn't matter - and in fact probably happens with multiple 
> processes on a single server.

No, it probably doesn't. I checked that before writing what I wrote. Did you
check before contradicting me?

> link() should be an atomic operation so 
> creation of a hash collision should be detected even if it is simultaneous.

Detecting it is trivial. Please provide a correct implementation of *handling*
it. It's not necessarily a "hash collision" as BackupPC uses the term, by the
way. There is currently no need to handle this, because

> > > BackupPC is not designed to support multiple instances accessing the same
> > > storage.

and it avoids it happening within a single server instance, because that is
*much* easier than handling it.
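
To illustrate the difference between detecting and handling: the following is
a minimal Python sketch, not BackupPC's actual code. The function name
try_pool_link and the file names are made up for the example. It only shows
that link() reports an existing target atomically; what to do afterwards is
exactly the part that is left open.

```python
import os
import tempfile


def try_pool_link(src, pool_path):
    """Hypothetical helper: try to create pool_path as a hard link to src.

    Returns True if this call created the pool entry, False if another
    writer got there first. Only *detection* is shown here.
    """
    try:
        os.link(src, pool_path)  # atomic: creates the name or fails
        return True
    except FileExistsError:
        return False


with tempfile.TemporaryDirectory() as d:
    a = os.path.join(d, "file_from_server_a")
    b = os.path.join(d, "file_from_server_b")
    pool = os.path.join(d, "pool_entry")
    for path in (a, b):
        with open(path, "w") as f:
            f.write("same content\n")

    first = try_pool_link(a, pool)   # wins the race
    second = try_pool_link(b, pool)  # loses: the name already exists

    links_a = os.stat(a).st_nlink    # a is now linked into the pool
    links_b = os.stat(b).st_nlink    # b is still a separate, unpooled copy
```

After the second call, the loser knows it lost, but its file is still a
separate copy. Deciding whether the existing pool entry really has the same
contents, and replacing the copy with a link to it without racing against yet
another process, is the handling part that the sketch does not attempt.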

> [...]
> The BackupPC_nightly run is the more dangerous part.  There you have the 
> possibility that it might delete a pool link at the same time another 
> process just re-used it.

You are correct in that this is something we want to avoid. If I were so
inclined, it would be trivial to contradict you with your own arguments,
though. Something like "link() should return a failure code if the source file
does not exist so this should be easily detected", along with the comment
that it's not "dangerous" to have no pool link for a file, just wasteful,
because you won't be able to reuse it for other copies.
But I won't do that. You are right. There should not be more than one instance
of BackupPC_nightly running on a pool, and BackupPC_nightly and BackupPC_link
should not run concurrently.

> In the current version there 
> is some sort of locking around the operations that might collide so this 
> might or might not also work on a network filesystem.

This sounds like an urban myth. Did you check how this "locking operation"
works? What version of BackupPC introduced it? I went *part of the way* through
the diffs. What I found wasn't "locking", it was design, and it will, in fact,
extend to several BackupPC servers accessing one pool. But that is only part
of the mechanism. The rest, I believe, is in fact really a form of "locking":
the provisions a BackupPC server takes to prevent two jobs from running
concurrently that shouldn't - BackupPC_nightly and BackupPC_link, including
more than one instance of either (i.e. *only one* BackupPC_nightly(*) or
BackupPC_link job may be running at one point in time). This part will
obviously *not* extend to several independent server instances accessing the
pool. In other words,

> > > BackupPC is not designed to support multiple instances accessing the same
> > > storage.
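
A toy model of why such provisions stop at the server boundary, written in
Python purely for illustration. The Server class and its methods are
hypothetical and do not mirror how BackupPC schedules jobs internally; the
only assumption is that the "only one pool job at a time" rule is state kept
inside each server process.

```python
import threading


class Server:
    """Hypothetical stand-in for one independent BackupPC server process."""

    def __init__(self, name):
        self.name = name
        self._pool_job_running = False  # private to this instance
        self._guard = threading.Lock()

    def try_start_pool_job(self, job):
        """Start a pool job unless this server already runs one."""
        with self._guard:
            if self._pool_job_running:
                return False  # would have to wait its turn
            self._pool_job_running = True
            return True

    def finish_pool_job(self):
        with self._guard:
            self._pool_job_running = False


a = Server("a")
b = Server("b")

a_nightly = a.try_start_pool_job("BackupPC_nightly")  # allowed
a_link = a.try_start_pool_job("BackupPC_link")        # refused by server a
b_link = b.try_start_pool_job("BackupPC_link")        # allowed: b cannot see a
```

Server a correctly refuses to run its own BackupPC_link next to its
BackupPC_nightly, but server b has no idea that a's nightly job is running,
so both end up working on the shared pool at the same time.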

> But in any case 
> you would probably only want one nightly run and keep it outside the 
> backup window.

I have nothing to add to that.


So, do you insist on making the original poster believe that running several
instances of BackupPC on the same pool is a good idea, or can we maybe find
some other topic to disagree on?

Regards,
Holger

(*) With $Conf{MaxBackupPCNightlyJobs} you can split one BackupPC_nightly
    job into 2, 4, 8 ... processes which will run concurrently and each process
    a distinct part of the pool. In the sense of the above definition, they
    comprise one logical BackupPC_nightly entity. What you can't have is
    more than one BackupPC_nightly processing the *same* part of the pool.
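
    As a rough sketch of what "each process a distinct part of the pool"
    means (Python, for illustration only; split_pool and the bucket count of
    256 are assumptions made for this example, not taken from the BackupPC
    source):

```python
def split_pool(num_jobs, num_buckets=256):
    """Split bucket numbers 0..num_buckets-1 into num_jobs disjoint ranges.

    num_buckets is an assumed value for illustration only.
    """
    if num_jobs < 1 or num_buckets % num_jobs != 0:
        raise ValueError("num_jobs must evenly divide num_buckets")
    size = num_buckets // num_jobs
    return [range(i * size, (i + 1) * size) for i in range(num_jobs)]


# One server, four nightly processes: disjoint parts, full coverage.
parts = split_pool(4)
covered = sorted(bucket for part in parts for bucket in part)

# Two independent servers each splitting the same pool in two:
# their first processes walk exactly the same buckets.
server_a = split_pool(2)
server_b = split_pool(2)
overlap = set(server_a[0]) & set(server_b[0])
```

    Within one server the parts never overlap. Two independent servers would
    each compute the same split, so the overlap is complete, which is the
    situation the last sentence above rules out.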
