Re: tape usage algorithm

On Tue, Apr 17, 2007 at 01:23:49PM -0400, Brian Cuttler wrote:
> 
> Amanda users,
> 
> I'm sure this is addressed somewhere but I've never seen it
> (perhaps because I missed it) explicitly discussed on the list.
> 
> My assumption on tape filling is that if dumps are still in 
> progress that amanda will try to write each DLE to tape as
> it completes.

Assuming a holding disk is in use, taper doesn't care about
dumps in progress.  If it is taping, it is taping a completed
dump.  When that is through, it looks for another completed dump
on the holding disk.  If there is none, it waits for notification
that one is now available.
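
As a rough illustration only (this is not Amanda's code), the loop
described above can be sketched in Python.  The queue stands in for
the "completed dump available" notification, and the DLE names and
the None sentinel are hypothetical.

```python
import queue
import threading

def taper_loop(completed, tape_log):
    """Tape completed dumps one at a time; block when none is ready."""
    while True:
        dump = completed.get()   # blocks until a completed dump is announced
        if dump is None:         # hypothetical sentinel: no more dumps coming
            break
        tape_log.append(dump)    # stand-in for writing the dump to tape

def run_demo():
    completed = queue.Queue()
    tape_log = []
    taper = threading.Thread(target=taper_loop, args=(completed, tape_log))
    taper.start()
    # Dumpers announce each DLE only once its dump is complete.
    for name in ["hostA:/usr", "hostB:/home"]:
        completed.put(name)
    completed.put(None)
    taper.join()
    return tape_log
```

The point of the sketch is that the taper side never looks at dumps
still in progress; it only ever sees finished ones.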

> 
> I have no idea what the algorithm is for DLE taping if there
> are multiple completed DLEs in the work area.
> 
> I have never been able to figure out the tape ordering when
> amflush was being run.

If dumps from more than one date (run?) are present, the oldest are
all done before the newer ones.  Within those groups there is a
'taperalgo' parameter controlling which is selected.
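
A minimal sketch of that ordering, in Python, under my own
assumptions: dumps are (datestamp, name, size_kb) tuples listed in
arrival order, and only the simple taperalgo values "first", "last",
"largest" and "smallest" are modeled.  The "firstfit" and
"largestfit" variants depend on the space left on the tape and are
left out.  The function name and data layout are hypothetical.

```python
from itertools import groupby

def flush_order(dumps, taperalgo="first"):
    """Return dump names in taping order: oldest date first, then taperalgo."""
    # sorted() is stable, so arrival order is preserved within each date.
    by_date = sorted(dumps, key=lambda d: d[0])
    ordered = []
    for _, group in groupby(by_date, key=lambda d: d[0]):
        group = list(group)
        if taperalgo == "largest":
            group.sort(key=lambda d: d[2], reverse=True)
        elif taperalgo == "smallest":
            group.sort(key=lambda d: d[2])
        elif taperalgo == "last":
            group.reverse()
        elif taperalgo != "first":
            raise ValueError("taperalgo not modeled in this sketch: " + taperalgo)
        ordered.extend(group)
    return [name for _, name, _ in ordered]
```

With two dates on the holding disk, every dump from the older date
comes out ahead of any dump from the newer one, whatever taperalgo
is set to.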

> 
> Is there any sort of taper delay algorithm to optimize tape usage ?
> 

There should be.  Look back a few days for my note about what was
proposed in the past but never implemented.

> In the specific case here - initial look at the output leads me to
> think that the only DLE on tape 15 would have fit on tape 14, but
> the amdump "notes" section says otherwise.
> 

You are slightly misreading the notes section.  It says there was an
error, not an end of tape.  Amanda cannot tell the difference from
the information that is provided to it by the tape driver.

According to the Notes, the errors (or EOT as the case may be) occurred
at 11.7GB and 9.7GB.  But your tapetype says the capacity is 200GB.
The errors occurred after the last successful DLE write to that tape.

I'm guessing you have LTO, I or II, probably II with its 200GB native
capacity.  Either way it looks like a real I/O error occurred.


> Given the number of DLEs on tape 14 I wonder if the time/size are
> correct, could that be the value of the last DLE dumped rather
> than the total of all DLEs on the tape volume ?
> 
> > USAGE BY TAPE:
> >  Label          Time      Size      %    Nb
> >  BIONSC14       0:09   11335.7    5.8    28
> >  BIONSC15       0:11    9251.8    4.6     1
> >  BIONSC16       0:47   81491.9   40.8     1

Size here is the amount of data successfully written.  You had 28 DLEs
successfully written for a total of only 11.3GB.

> > NOTES:
> >  planner: Full dump of bioquad:/usr4 promoted from 4 days ahead.
> >  planner: Full dump of friedel:/usr16 promoted from 4 days ahead.
> >  taper: tape BIONSC14 kb 11725408 fm 29 writing file: I/O error
> >  taper: retrying friedel:/usr16.0 on new tape: [writing file: I/O error]
> >  taper: tape BIONSC15 kb 9784224 fm 2 writing file: I/O error
> >  taper: retrying bionsc1:/dev/md/bg-schost-1/rdsk/d311.0 on new tape: 
> >  [writing file: I/O error]
> >  taper: tape BIONSC16 kb 83447712 fm 1 [OK]

The 'kb 11725408' (and other kb ####) are the total amount of data
written to the tape (successful, complete DLEs plus partial, failed
writes) when finished with the tape due to error or normal completion.
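
To put numbers on that, here is a small Python sketch that pulls the
kb figures out of the taper notes and subtracts the Size column from
USAGE BY TAPE.  It assumes the kb values are 1024-byte units and
Size is in MB of 1024 KB each; the BIONSC16 line (83447712 / 1024 =
81491.9) is consistent with that.  The function name and regular
expression are mine, not part of Amanda.

```python
import re

NOTES = """\
taper: tape BIONSC14 kb 11725408 fm 29 writing file: I/O error
taper: tape BIONSC15 kb 9784224 fm 2 writing file: I/O error
taper: tape BIONSC16 kb 83447712 fm 1 [OK]
"""

# Size column from USAGE BY TAPE (data successfully written).
USAGE_MB = {"BIONSC14": 11335.7, "BIONSC15": 9251.8, "BIONSC16": 81491.9}

TAPE_LINE = re.compile(r"taper: tape (\S+) kb (\d+) fm (\d+)")

def unaccounted_mb(notes, usage_mb):
    """MB written to each tape beyond the successfully written DLEs."""
    result = {}
    for label, kb, _fm in TAPE_LINE.findall(notes):
        total_mb = int(kb) / 1024          # total written, good plus failed
        result[label] = round(total_mb - usage_mb[label], 1)
    return result

print(unaccounted_mb(NOTES, USAGE_MB))
```

On those assumptions the difference is roughly 115MB on BIONSC14 and
303MB on BIONSC15, i.e. the partial writes that hit the error, and
nothing on BIONSC16.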

> > 
> Do we know what happened here and why ?
> 

What do your system messages say about any errors on the SCSI system
and/or tape drive?


-- 
Jon H. LaBadie                  jon AT jgcomp DOT com
 JG Computing
 4455 Province Line Road        (609) 252-0159
 Princeton, NJ  08540-4322      (609) 683-7220 (fax)