Amanda-Users

Re: Question to: Friday tape question - Top 10

2007-08-01 15:42:22
Subject: Re: Question to: Friday tape question - Top 10
From: "Dustin J. Mitchell" <dustin AT zmanda DOT com>
To: Ralf Auer <Ralf.Auer AT physik.uni-erlangen DOT de>
Date: Wed, 1 Aug 2007 14:33:27 -0500
On Wed, Aug 01, 2007 at 08:55:59PM +0200, Ralf Auer wrote:
> > Not sure I know what part you feel is not true.  You hypothesised
> > a situation where normally every day's dump fit on one tape.  Then
> > on one particular day a single DLE grew enough to cause the day's
> > run to require 3 tapes.  That would only happen if that one DLE
> > were now larger than a single tape (even for an incremental as the
> > scenario is constructed).  With runtapes == 1, amanda would not
> > even start the dump of that DLE.
> 
> Maybe I am completely wrong, but I thought, if the big dump is
> distributed over several partitions, each one with a single entry in the
> disklist and each partition size smaller than the tape capacity, the
> backup should work.
> For instance, my tape capacity is 400GB, the specific client has three
> partitions of 350GB each and each partition has its own entry in the
> disklist. So I assumed that one partition backup would go to the first
> tape, another one to the second and the third one onto the last tape.
> Since 'tape spanning' is not necessary in this case, I thought that the
> backup would be run by Amanda. But, as I said, I am not sure about that
> and probably you're right...

If I can try to summarize, you're discussing situations where Amanda is
fairly massively oversubscribed; that is, Amanda has very little room to
deal with unexpected circumstances, including an overlarge incremental,
an unavailable client, etc.

In the specific situation, under "normal circumstances", you expect
Amanda to balance dumps into about 1 tape per run.  You've set runtapes
to a larger number, to allow Amanda to use more tapes if necessary, but
you don't really have enough tapes to support your full retention period
at more than one tape per run.

The "correct" calculation is:
 tapecycle = redundancy_factor * runtapes * runspercycle + epsilon
where epsilon is 1 or 2 -- "spare" tapes to allow slack for damaged
tapes, etc.  The redundancy_factor is the number of full backups you'd
like to have around at any time -- 1 is OK, 2 or more is recommended.
Anything less than 1 is asking for trouble.
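
As a rough sketch of that arithmetic in Python: the function name, the
rounding up to whole tapes, and the default epsilon of 2 are my own
choices for illustration, not anything Amanda itself provides.

```python
import math


def required_tapecycle(redundancy_factor, runtapes, runspercycle, epsilon=2):
    """Estimate tapecycle from the formula above.

    Hypothetical helper for illustration only. epsilon defaults to 2
    spare tapes, which is an assumption taken from the "1 or 2" range.
    """
    if redundancy_factor <= 0 or runtapes < 1 or runspercycle < 1 or epsilon < 0:
        raise ValueError("all inputs must be positive (epsilon may be 0)")
    # Round up: a fractional tape still means buying a whole tape.
    return math.ceil(redundancy_factor * runtapes * runspercycle) + epsilon


# Two full backups on hand, 3 tapes per run, 5 runs per cycle, 2 spares.
print(required_tapecycle(2, 3, 5))             # prints 32
# The bare minimum: one full backup on hand, 1 spare.
print(required_tapecycle(1, 3, 5, epsilon=1))  # prints 16
```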

In your case, if I remember the numbers correctly, you had:
 tapecycle 5
 runspercycle 5
 runtapes 3
 epsilon 0
solving for redundancy_factor gives
 5 = redundancy_factor * 3 * 5 + 0
 redundancy_factor = 5 / 15 = 0.33 (approximately)

which is clearly suboptimal.
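
The same rearrangement as a small Python check, using the numbers as I
remembered them above. The function name is hypothetical, and the
runtapes 1 comparison is just the formula applied to a different input,
not a statement about what Amanda will schedule.

```python
def redundancy_factor(tapecycle, runtapes, runspercycle, epsilon=0):
    """Solve tapecycle = rf * runtapes * runspercycle + epsilon for rf.

    Hypothetical helper for illustration; not part of Amanda.
    """
    return (tapecycle - epsilon) / (runtapes * runspercycle)


# Numbers from this thread: 5 tapes, 5 runs per cycle, up to 3 tapes per run.
rf = redundancy_factor(tapecycle=5, runtapes=3, runspercycle=5, epsilon=0)
print(f"redundancy_factor = {rf:.2f}")  # prints "redundancy_factor = 0.33"

# For comparison, the same 5 tapes with runtapes 1 just reach a factor of 1.
rf_one_tape = redundancy_factor(tapecycle=5, runtapes=1, runspercycle=5)
print(f"redundancy_factor = {rf_one_tape:.2f}")  # prints "redundancy_factor = 1.00"
```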

This is not to say that this kind of configuration won't work -- Amanda
will do her level best -- but it should not be a surprise that "level
best" is not always good enough, especially when unexpected things
happen.

I think the bottom line is: this is your Wednesday email telling you to
buy more tapes ;)

Dustin

-- 
        Dustin J. Mitchell
        Storage Software Engineer, Zmanda, Inc.
        http://www.zmanda.com/