Amanda-Users

Re: strange errors with 2.5.0p2

2006-07-28 10:48:28
Subject: Re: strange errors with 2.5.0p2
From: Jon LaBadie <jon AT jgcomp DOT com>
To: amanda-users AT amanda DOT org
Date: Fri, 28 Jul 2006 10:42:28 -0400
On Fri, Jul 28, 2006 at 12:01:16PM +0100, Rodrigo Ventura wrote:
> 
> Hello all,
> 
> I'm having trouble with amdump. It has failed for the second time. The facts:
> 
> The relevant parts of the mail report:
> [includes my comments starting with a (*) ]
> 
> --------------------------------------------------
> These dumps were to tape ISR006.
> *** A TAPE ERROR OCCURRED: [No more writable valid tape found].
> Some dumps may have been left in the holding disk.
> Run amflush to flush them to tape.
> The next tape Amanda expects to use is: ISR004.
> 
> (*) NOTE: this is wrong, since there are no dumps for amflush to flush! Is it 
> because the DLE is at the same server as the amanda server?

No, it is not "wrong".  "Some ... may" also covers the case where zero
dumps are left.  Taper, which reports this message, has no control over
what the other parts of amanda are doing at the time of the error, nor
does it know whether other dumps continue to fill the holding disk after
taper fails.  So it suggests you check and take manual action.


> 
> FAILURE AND STRANGE DUMP SUMMARY:
> [...]
>   omni    /home/mn  lev 0  FAILED [out of tape]

That seems pretty straightforward given the other results below.


>   omni    /home/mn  lev 0  FAILED [data write: Connection reset by peer]
>   omni    /home/mn  lev 0  FAILED [dump to tape failed]

I'd guess your DLE "omni:/home/mn" was dumping direct to tape,
bypassing the holding disk.

> 
> [...]
> 
> USAGE BY TAPE:
>   Label        Time      Size      %    Nb    Nc
>   ISR006       2:28 26803680k   73.9    42     0
> 
> (*) 26803680kb/1024=26175.5mb is the size of the successful dumps on tape, 
> right?

Yes, and 42 pieces (the Nb column) were taped.


> 
> [...]
> 
> NOTES:
> [...]
>   taper: tape ISR006 kb 36546272 fm 43 writing file: No space left on device
> 
> (*) What is the 36546272? The amount of data actually written on tape?
> --------------------------------------------------

Yes, "kb" is the total amount, successful plus failed, that had been
written when the error occurred (after the 43rd piece; fm = filemarks).
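
For anyone who wants to pull these numbers out of a report automatically, here is a minimal Python sketch. The line layout is an assumption inferred from the single NOTES line quoted above, not a documented format, and `parse_taper_note` is a hypothetical helper name.

```python
import re

# Assumed layout, inferred only from the one NOTES line in this report:
#   taper: tape <label> kb <kbytes> fm <filemarks> <message>
TAPER_NOTE = re.compile(
    r"taper: tape (?P<label>\S+) kb (?P<kb>\d+) fm (?P<fm>\d+) (?P<msg>.*)"
)

def parse_taper_note(line):
    """Return label, kbytes written, filemark count and message, or None."""
    m = TAPER_NOTE.search(line)
    if m is None:
        return None
    return {
        "label": m.group("label"),
        "kb": int(m.group("kb")),   # total written, successful plus failed
        "fm": int(m.group("fm")),   # filemarks written before the error
        "msg": m.group("msg").strip(),
    }

note = parse_taper_note(
    "  taper: tape ISR006 kb 36546272 fm 43 writing file: No space left on device"
)
print(note)
```

Lines that do not match the assumed layout return None, so the helper can be run over a whole NOTES section without raising.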


> 
> Now, the tape entry on amanda.conf contains:
> 
> --------------------------------------------------
> define tapetype HP-DAT-72x6-nc {
>     comment "HP autoloader DAT 72x6 (compression off)"
>     # data provided by Rodrigo Ventura <yoda AT isr.ist.utl DOT pt>
>     length 35400 mbytes
>     filemark 0 kbytes
>     speed 3002 kps
> }
> --------------------------------------------------

> 
> where 35400 is a bit lower than what amtapetype actually measured. So, 
> 36546272kb/1024=35689.7mb makes sense here.
> 

Right.
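
The conversion can be checked with a few lines of Python. This is only a sketch of the arithmetic in this thread; `kb_to_mb` is a hypothetical helper, and all figures are copied from the report and the tapetype quoted above.

```python
def kb_to_mb(kb):
    """Convert kbytes to mbytes using 1 mbyte = 1024 kbytes, as in the thread."""
    return kb / 1024

configured_mb = 35400               # "length" in the tapetype
written_mb = kb_to_mb(36546272)     # "kb" from the taper NOTES line
taped_ok_mb = kb_to_mb(26803680)    # Size column in USAGE BY TAPE

# How far the tape actually went beyond the configured length
overrun_mb = written_mb - configured_mb

print(round(written_mb, 1), round(taped_ok_mb, 1), round(overrun_mb, 1))
```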


> Looking at the sendsize report for /home/mn we get:
> 
> --------------------------------------------------
> sendsize[2154]: estimate time for /home/mn level 0: 45.958
> sendsize[2154]: estimate size for /home/mn level 0: 11853650 KB
> --------------------------------------------------
> 
> so this means amanda estimates /home/mn to take 11853650kb/1024=11575.8mb
> 
> My big question is why on earth amanda tries to dump /home/mn on a tape of 
> size 35400mb with 26175.5mb already taken up, since 26175.5 + 11575.8 = 
> 37751.3 > 35400 !!!
> 

You noted in your tapetype

  "where 35400 is a bit lower than what amtapetype actually measured."

Amanda has no way of knowing "how much lower" (or higher) the real
capacity of this tape will be.  So, rightly or not, amanda uses that
number during planning to decide what to back up and at what levels,
but once dumping starts it follows that plan even if it seems, after
the fact, that perhaps it shouldn't.  Along with this, taper starts
to tape as long as there is still room on the tape.  When it began,
the 11575 mbyte figure was only an estimate.  Coulda been smaller,
coulda been bigger.  The tape coulda been 35400, maybe larger, maybe
smaller.  After the results are in, we know your tape was bigger
(35689 mbyte) and your dump was larger than 9514 mbyte (35689 - 26175).
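
The reasoning above, written out as a small Python sketch. The "start while room remains" test is a simplification of the behavior described in this message, not Amanda's actual code, and the variable names are made up for illustration.

```python
# Figures from the report, in mbytes (rounded as in the thread)
tape_length_mb = 35400.0      # configured tapetype length
taped_ok_mb = 26175.5         # successful dumps already on ISR006
estimate_mb = 11575.8         # sendsize estimate for /home/mn level 0
actual_written_mb = 35689.7   # total written when the tape filled up

# Room left on the tape, according to the configured length
room_mb = tape_length_mb - taped_ok_mb

# The estimate alone does not fit in that room ...
estimate_fits = estimate_mb <= room_mb

# ... but, as described above, taper starts as long as some room remains
# (simplified assumption, not the real taper logic).
taper_starts = room_mb > 0

# After the fact: the dump must have been at least this large,
# because this much was written before the tape ran out.
dump_lower_bound_mb = actual_written_mb - taped_ok_mb

print(f"room left: {room_mb:.1f} mbyte, estimate fits: {estimate_fits}")
print(f"taper starts anyway: {taper_starts}")
print(f"dump was larger than {dump_lower_bound_mb:.1f} mbyte")
```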


-- 
Jon H. LaBadie                  jon AT jgcomp DOT com
 JG Computing
 4455 Province Line Road        (609) 252-0159
 Princeton, NJ  08540-4322      (609) 683-7220 (fax)
