Amanda-Users

Re: dump to tape failed - how to diagnose?

2003-12-30 15:17:45
Subject: Re: dump to tape failed - how to diagnose?
From: Jon LaBadie <jon AT jgcomp DOT com>
To: amanda-users AT amanda DOT org
Date: Tue, 30 Dec 2003 15:15:08 -0500
On Tue, Dec 30, 2003 at 02:50:57PM -0500, John Dalbec wrote:
> 
> 
> Jon LaBadie wrote:
> 
> >On Tue, Dec 30, 2003 at 11:04:29AM -0500, John Dalbec wrote:
> >
> >>I can't find anything in /tmp/amanda or /var/log/messages.  Am I just not 
> >>looking in the right places?
> >>Thanks,
> >>John
> >
> >
> >You are not using any holding disk, is that intentional?
> 
> I have two holding disks specified in my config.  What are you seeing that 
> makes you conclude that I'm not using any holding disk?  I have about 4.5G 
> available on the larger holding disk.  Actually the two failed filesystems 
> are too large to fit in the holding disk.  Is that part of the problem?

The amreport you posted said something like "error dumping to tape".
Dumping, if I understand the message's terminology, means doing the
actual backup (tar or dump), and "to tape" means the data was not going
to the holding disk but directly to tape.

Also, there is no dumper stat data for those two DLE's.  So it doesn't
sound like it dumped successfully to disk and failed to tape.


Are they the only two which are too large to fit in the holding disk?

Is the holding disk being used for the other DLE's?  Just curious,
it may have nothing to do with your problem.  In fact, looking back
at the report, it appears the other DLE's do use it.  All have dumper
and taper stats with differing times, which suggests to me a dump to
disk followed by a taping.

Have you changed your "reserve" parameter from its default 100%
(i.e. reserve 100% for incrementals which you do not do)?
I guess as you do not do incrementals, a reserve of 0% is appropriate.
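
For illustration only, the relevant part of amanda.conf might look
something like the sketch below.  The holding disk name, directory and
size are made-up placeholders, not taken from John's configuration;
only the "reserve" and "holdingdisk" keywords themselves are Amanda's.

```
# Hypothetical holding disk definition; name, path and size are placeholders.
holdingdisk hd1 {
    comment "larger holding disk"
    directory "/dumps/amanda"
    use 4000 Mb
}

# Percentage of holding disk space held back for incrementals.
# The default is 100; with full dumps only, 0 lets fulls use the space.
reserve 0
```

A DLE that is larger than the available holding disk space would still
bypass it and go straight to tape, whatever "reserve" is set to.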

> >Do you mean your /tmp/amanda was empty?
> 
> No, I mean that I can't find anything that looks like an error.  I see a 
> message "can't send SCSI commands" but it appears too many times to be the 
> cause of the two failures.

I'd be suspicious of this.  Perhaps when dumping and taping the same
data at once, without a holding disk in between, the SCSI system is
put under different stresses than when the data goes through the
buffering holding disk.

> >
> >Were you looking at those locations on both the server and the client?
> 
> In this case the server is the client.

Gee, for future 'enhancement', it might be nice for the report to show
which system was the server.  Never thought about that.  mail3 is an
active email server and your tape backup, ok.

BTW your reports could use some spacing adjustment by using the
columnspec directive in amanda.conf.
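
As a rough sketch, a columnspec line takes comma-separated entries of
the form name=space:width.  The widths below are guesses chosen for
illustration, not values tuned to John's report, so adjust them to
whatever your longest host and disk names need.

```
# Illustrative values only: leading spaces and column width per report column.
columnspec "HostName=0:10,Disk=1:18,OrigKB=1:8,OutKB=1:8"
```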

jl
-- 
Jon H. LaBadie                  jon AT jgcomp DOT com
 JG Computing
 4455 Province Line Road        (609) 252-0159
 Princeton, NJ  08540-4322      (609) 683-7220 (fax)