Amanda-Users

How amtapetype determines tape mark size

2003-02-08 14:37:20
Subject: How amtapetype determines tape mark size
From: "John R. Jackson" <jrj AT purdue DOT edu>
To: gene_heskett AT iolinc DOT net
Date: Sat, 08 Feb 2003 13:51:28 -0500
>John, could you refresh us on how tapetype goes about determining 
>the size of a drive's filemark?

The amtapetype program is actually pretty well commented, but it goes
like this.

During the first pass, it writes files that are estimated to be 1%
of the expected tape capacity.  It gets the expected capacity from
the -e command line flag, or defaults to 1 GByte.  In a perfect world
(which means there is zero chance of this happening with tapes :-),
there would be 100 files and 100 file marks.

During the second pass, the file size is cut in half.  In that same
fairyland world, this means 200 files and 200 file marks.

In both passes the total amount of data written is summed as well as the
number of file marks written.  At the end of the second pass, quoting
from the code:

   * Compute the size of a filemark as the difference in data written
   * between pass 1 and pass 2 divided by the difference in number of
   * file marks written between pass 1 and pass 2.  ...

So if we wrote 1.0 GBytes and 100 file marks on the first pass, and
0.9 GBytes and 200 file marks on the second pass, those additional 100
file marks in the second pass took 0.1 GBytes, and therefore a file mark
is 0.001 GBytes (1 MByte).
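
The arithmetic above can be sketched in a few lines of Python.  This is
only an illustration of the quoted comment, not the amtapetype source;
the function and parameter names are made up for the example, and sizes
are in MBytes (taking 1 GByte as 1000 MBytes, as the numbers above do).

```python
def filemark_size(data_pass1, marks_pass1, data_pass2, marks_pass2):
    """Estimate the size of one file mark from two write passes.

    Hypothetical helper: the names are illustrative, not amtapetype's.
    All data sizes must be in the same unit (MBytes here).
    """
    extra_marks = marks_pass2 - marks_pass1
    if extra_marks <= 0:
        raise ValueError("pass 2 must write more file marks than pass 1")
    # Data "lost" in pass 2 is attributed to the extra file marks.
    return (data_pass1 - data_pass2) / extra_marks


# The example from the message: 1.0 GBytes / 100 marks, then
# 0.9 GBytes / 200 marks.
size_mb = filemark_size(1000, 100, 900, 200)
print(size_mb)
```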

Note that if the estimated capacity is wrong, the only thing that happens
is that many more (or, less likely, fewer) files, and thus file marks,
get written.  But the math still works out the same.  The -e flag is
there to keep the number of file marks down because they can be slow
(since they force the drive to flush all its buffers to physical media).
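
To see why a wrong estimate does not change the result, here is a toy
model in Python.  It is an assumption-laden sketch, not what amtapetype
does internally: the "tape" is just a fixed number of units, every file
mark costs a fixed number of units, and all names are invented for the
example.

```python
def simulate_pass(capacity, file_size, mark_size):
    """Toy model: fill a tape with fixed-size files, each followed by
    a file mark.  Returns (data_written, marks_written)."""
    remaining = capacity
    data = 0
    marks = 0
    while remaining > 0:
        chunk = min(file_size, remaining)  # the last file may be short
        data += chunk
        remaining -= chunk
        if remaining < mark_size:
            break  # no room left for another file mark
        marks += 1
        remaining -= mark_size
    return data, marks


def estimate_mark_size(capacity, estimate, mark_size):
    """Run two passes (files of 1% of the estimate, then half that)
    and apply the difference formula to the totals."""
    file_size = estimate // 100
    d1, n1 = simulate_pass(capacity, file_size, mark_size)
    d2, n2 = simulate_pass(capacity, file_size // 2, mark_size)
    return (d1 - d2) / (n2 - n1)


# Real capacity 1000 units, real file mark size 1 unit, with a correct
# estimate and two estimates that are too large.
for estimate in (1000, 2000, 4000):
    print(estimate, estimate_mark_size(1000, estimate, 1))
```

In this idealized model each estimate produces a different number of
files and file marks, but the computed file mark size comes out the same.
A real drive adds the run-to-run variation discussed further down.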

>Maybe I'm being paranoid, but I ask because it seems to be so large 
>(5.5 megs) on a freshly tapetype derived tapetype posted here 
>recently. 100 gig drive, but still...

5 MBytes is 0.005% of 100 GBytes.  That's the size of a dust particle :-).

All sorts of things might have happened to cause the amount of data
written to vary enough to generate this file mark size guess.  A little
more "shoe shining" because of the additional file marks (and flushes),
dirt left on the heads from the first pass of a brand new tape, the
temperature/humidity changed during the multi-hour run, a different amount
of data was written after the last file mark before EOT was reported, etc.

Note that the file mark size might really be zero for whatever device this
is, and it was just the measured capacity variation that caused amtapetype
to think those extra file marks in pass 2 actually took up space.

That same capacity variation also explains why amtapetype used to
sometimes report a negative file mark size if the math happened to end
up that way.  When that happens now we just report it as zero.
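
As a rough illustration of both effects, here is a small Python sketch.
The function name and all of the numbers are hypothetical (sizes in
MBytes), and it only mirrors the behavior described in this message, not
the actual amtapetype code.

```python
def reported_filemark_size(data_pass1, marks_pass1, data_pass2, marks_pass2):
    """Difference formula with the result clamped at zero.

    Hypothetical helper; the clamping mirrors the "report it as zero"
    behavior described in the message.
    """
    raw = (data_pass1 - data_pass2) / (marks_pass2 - marks_pass1)
    return max(raw, 0.0)


# Made-up run on a roughly 100 GByte tape: pass 2 wrote 550 MBytes less
# than pass 1, for whatever reason, with 100 extra file marks.
print(reported_filemark_size(100_000, 100, 99_450, 200))

# Made-up run where pass 2 happened to write 200 MBytes more than
# pass 1.  The raw result would be negative, so zero is reported.
print(reported_filemark_size(99_800, 100, 100_000, 200))
```

A swing of about half a percent in measured capacity between the passes
is all it takes to produce either result in this example.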

>Gene

John R. Jackson, Technical Software Specialist, ITaP/RCS, jrj AT purdue DOT edu
