Amanda-Users

Re: Multi-tape span failure

2007-10-31 11:44:40
Subject: Re: Multi-tape span failure
From: Jon LaBadie <jon AT jgcomp DOT com>
To: amanda-users AT amanda DOT org
Date: Wed, 31 Oct 2007 11:38:01 -0400
On Tue, Oct 30, 2007 at 11:31:53PM -0500, Tom Hansen wrote:
> 
> BACKGROUND INFO: I have Amanda 2.5.2p1 running on Ubuntu Linux 6.10, 
> configured to back up several large (300GB+) filesystems spanning 
> several tapes.  I have a robot changer, LTO-1 tapes (100GB capacity) and 
> I used:
> 
>    tape_splitsize 3Gb
>    fallback_splitsize 256m
> 
> (An unrelated issue: I couldn't get split_diskbuffer to have any 
> effect, so the chunks were all 256MB.  No big deal; it was not a 
> bottleneck.)
> 
> After much time configuring, everything seems to be working properly, 
> and on my first big run, it successfully spanned six tapes and was 
> nearly finished.  Then it grabbed tape 7, which I had inadvertently left 
> in "write protect" mode.  Unfortunately, at this point Amanda completely 
> aborted the entire 800+ Gb backup and left nothing in the index, thus 
> completely wasting 7+ hours of backup time.
> 
> This behavior is unexpected and bad.  What if a tape simply goes bad 
> during a run? If I'm running 7 or 8 tapes each backup, I don't want to 
> lose the whole thing if there's an error on the last tape!
> 
> I _thought_ that Amanda was programmed to simply go to the next tape 
> when a tape error occurs.  In this case, if Amanda _had_ gone to the 
> next tape, it could have completed the job, since tape 8 was a good tape.
> 
> MY QUESTION:  Is there any way to configure Amanda such that such a tape 
> error would simply go to the next tape, instead of the worst possible 
> action, which is to abort the whole job?
> 
> Short of that, is there any way Amanda could start up from where it left 
> off?
> 

Short answer: no.  If the backups are still in a holding disk they can
be flushed to tape later (amflush), but an interrupted backup cannot
be resumed.


Something in your report is amiss.  If amanda had successfully
used 6 tapes, it would have finished backing up and taping
one or more of your 300GB DLEs.  There is no reason a failed
tape after that would invalidate those backups, and your
report (emailed or available with amreport) would show that.

Also, IIRC, an LTO-1 tape at full speed takes about 1.5-2 hrs
to tape completely.  I would expect 6 successful tapes to take
longer than 7 hours, more like 10-15, not counting the estimate
phases.  Might the estimate phase have taken 7 hours and then
amanda rejected all the tapes as inappropriate, never writing
to them?
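
To put rough numbers on that, here is a back-of-the-envelope sketch.
It uses only figures from this thread: the 100GB tape capacity from
the original post and my from-memory 1.5-2 hrs per full tape.  The
variable and function names are just for illustration, and decimal
units (1 GB = 1000 MB) are an assumption.

```python
# Sanity check of the "6 tapes in 7 hours" claim, using thread figures only.
TAPE_CAPACITY_GB = 100        # LTO-1 native capacity, per the original post
HOURS_PER_TAPE = (1.5, 2.0)   # rough full-speed fill time (recalled, not measured)
TAPES_WRITTEN = 6             # tapes reported as successfully written
REPORTED_HOURS = 7            # elapsed time reported before the failure


def total_hours(tapes, hours_per_tape):
    """Pure streaming time for a number of full tapes (no tape changes)."""
    return tapes * hours_per_tape


def throughput_mb_per_s(gigabytes, hours):
    """Average rate needed to move `gigabytes` in `hours` (1 GB = 1000 MB)."""
    return gigabytes * 1000 / (hours * 3600)


best_case = total_hours(TAPES_WRITTEN, HOURS_PER_TAPE[0])
worst_case = total_hours(TAPES_WRITTEN, HOURS_PER_TAPE[1])

# Rate needed to fill 6 tapes in the reported 7 hours...
required = throughput_mb_per_s(TAPES_WRITTEN * TAPE_CAPACITY_GB, REPORTED_HOURS)
# ...versus the fastest rate implied by the 1.5 hrs/tape figure.
fastest_assumed = throughput_mb_per_s(TAPE_CAPACITY_GB, HOURS_PER_TAPE[0])

print(f"streaming time for 6 tapes: {best_case:.0f}-{worst_case:.0f} hours")
print(f"rate needed for 6 tapes in 7 hours: {required:.1f} MB/s")
print(f"fastest rate implied by 1.5 hrs/tape: {fastest_assumed:.1f} MB/s")
```

Streaming alone comes to 9-12 hours; tape changes and any time the
drive is not running at full speed only add to that.  Six full tapes
in 7 hours would need a sustained rate above what even the optimistic
per-tape figure implies.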

-- 
Jon H. LaBadie                  jon AT jgcomp DOT com
 JG Computing
 4455 Province Line Road        (609) 252-0159
 Princeton, NJ  08540-4322      (609) 683-7220 (fax)
