Amanda-Users

Re: AIT-2 length specifier, et. al.

2005-09-01 04:48:06
Subject: Re: AIT-2 length specifier, et. al.
From: Paul Bijnens <paul.bijnens AT xplanation DOT com>
To: Mason Loring Bliss <mason AT blisses DOT org>
Date: Thu, 01 Sep 2005 10:34:57 +0200
Mason Loring Bliss wrote:

I've got an inherited Amanda system I'm managing, and we're using an AIT-2
tape drive / changer. I see the following:

define tapetype AIT-2 {
          comment "Generic AIT 2 Drive -- real world numbers"

I'm 99% sure that this is with hardware compression enabled too.

          length 41000 mbytes
          filemark 1000 kbytes
          speed 2920 kps

}

Now, we've got data compression turned on on the tape drive, which should
theoretically make more space, and we also have clients doing compression.

"theoretically"...  It depends on the data!
As a simple test case, try this:

   $ cp /boot/vmlinuz /tmp/large        # any large file is ok
   $ cd /tmp                            # work where the copy lives
   $ gzip large                         # use software compression
   $ compress < large.gz > taped        # then use a program with an
                                        # algorithm similar to the
                                        # firmware in the tapedrive
   $ ls -l large.gz taped
   -rw-r--r--    1 x   y    1052126 Sep  1 09:55 large.gz
   -rw-rw-r--    1 x   y    1413493 Sep  1 09:56 taped
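
The same effect can be reproduced without a tape drive. The Python sketch below is only an illustration under stated assumptions: zlib stands in for gzip, and `lzw_compressed_size` is a hypothetical, deliberately simplified LZW size estimator, not the real compress(1) and not any drive's firmware. The sample data is synthetic.

```python
import random
import zlib


def lzw_compressed_size(data: bytes, max_bits: int = 16) -> int:
    """Estimate the output size in bytes of a basic LZW coder.

    Simplified on purpose: variable-width codes starting at 9 bits,
    and no dictionary reset once the table is full.
    """
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    total_bits = 0
    current = b""
    for value in data:
        candidate = current + bytes([value])
        if candidate in table:
            current = candidate
            continue
        # Emit the code for `current`, wide enough for any code assigned so far.
        total_bits += max(9, (next_code - 1).bit_length())
        if next_code < (1 << max_bits):
            table[candidate] = next_code
            next_code += 1
        current = bytes([value])
    if current:
        total_bits += max(9, (next_code - 1).bit_length())
    return (total_bits + 7) // 8


def make_sample(n_words: int = 40000, seed: int = 0) -> bytes:
    """Build deterministic, compressible text from a small made-up vocabulary."""
    words = ["tape", "drive", "amanda", "dump", "level", "backup", "index", "holding"]
    rng = random.Random(seed)
    return " ".join(rng.choices(words, k=n_words)).encode("ascii")


if __name__ == "__main__":
    raw = make_sample()
    gz = zlib.compress(raw, 6)  # stand-in for the gzip step
    print("raw bytes:          ", len(raw))
    print("after zlib:         ", len(gz))
    print("LZW on raw:         ", lzw_compressed_size(raw))
    print("LZW on zlib output: ", lzw_compressed_size(gz))
```

On the raw text the LZW estimate is smaller than the input, but on the zlib output it is larger than its input, which is the same kind of expansion the `ls -l` listing shows.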


1. Why would the "length" parameter end up being so much shorter with
hardware compression enabled on that drive? This confuses me.

Because the simple algorithms that compress data behave very badly
on truly random data, or on already compressed data.
Newer tapedrives (like LTO) have better algorithms, as does
gzip itself.


2. Am I to take it from the derived speed parameter that dumps will go
quicker without hardware compression enabled? It seems strange that
hardware compression would create that sort of impact, even when it's
dealing with already-compressed data.

First you have the speed at which the tape passes the writing head.  You
may assume that normally this speed is constant (*).  Then you have
the speed of the hardware compression unit.  Usually the hw compr unit
can deliver bits faster than the writing head can write them to tape.
And then you have the computer that has to deliver the bits to the
tapedrive.

Let's assume you don't have any bottlenecks; in that case the writing
speed is determined by the writing head.  My tapedrive does 3Mbyte/sec
according to the specs.  Without hardware compression, that's also the
speed that the computer should deliver the bytes to the drive.

Let's assume that the hardware compression of the tapedrive can get
a 50% reduction in bytes.  Now the computer has to deliver 6Mbyte/sec
to the tapedrive, while the bits on the tape are still written at
3 Mbyte/sec.

But the hardware compression algorithm in many tapedrives behaves
badly in some pathological cases.  One of them is already compressed data,
as shown above: the data is expanded by about 35%.  While the bits on
the tape are still written at 3 Mbyte/sec, the computer can now deliver
the bytes at only 1 / 1.35 = 0.74 (74%) of that speed, or about 2.2 Mbyte/sec.
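
The arithmetic in these three cases can be written down as a small Python sketch. The function name `host_rate` and its parameters are made up for illustration. The 3 Mbyte/sec tape speed and the 0.5 and 1.35 ratios are the example figures from the text, not measurements of any particular drive.

```python
def host_rate(tape_rate_mb_s: float, tape_bytes_per_host_byte: float) -> float:
    """Rate the host must sustain so the writing head keeps streaming.

    tape_bytes_per_host_byte is how many bytes end up on tape for each
    byte the host sends: 1.0 without hardware compression, 0.5 for a
    50% reduction, 1.35 when the data expands by 35%.
    """
    if tape_bytes_per_host_byte <= 0:
        raise ValueError("ratio must be positive")
    return tape_rate_mb_s / tape_bytes_per_host_byte


if __name__ == "__main__":
    cases = [
        ("no hw compression", 1.0),
        ("50% reduction", 0.5),
        ("35% expansion", 1.35),
    ]
    for label, ratio in cases:
        print(f"{label:20s} {host_rate(3.0, ratio):.2f} Mbyte/sec")
```

With the figures above this prints 3.00, 6.00 and 2.22 Mbyte/sec respectively.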

The amtapetype command uses exactly the above observation to detect
hardware compression in a hardware-independent way.
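
The idea behind that detection can be sketched in Python as follows. This is not amtapetype's actual code: the function name `hardware_compression_suspected` and the 20% tolerance are assumptions chosen for illustration, and the rates would have to come from timing real writes of compressible and incompressible data to the drive.

```python
def hardware_compression_suspected(
    compressible_rate: float,
    random_rate: float,
    tolerance: float = 0.2,  # assumed threshold, not taken from amtapetype
) -> bool:
    """Guess whether the drive compresses in hardware.

    Both rates are host-side write speeds in the same unit. Without
    hardware compression they should be about equal; with it, the
    compressible data is accepted noticeably faster than random data.
    """
    if compressible_rate <= 0 or random_rate <= 0:
        raise ValueError("rates must be positive")
    return compressible_rate / random_rate > 1.0 + tolerance


if __name__ == "__main__":
    # Example figures from the discussion above (Mbyte/sec).
    print(hardware_compression_suspected(6.0, 2.2))  # compression on
    print(hardware_compression_suspected(3.0, 3.0))  # compression off
```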


(*) the fast drives today may slow down the writing head speed when
the computer cannot deliver the bytes fast enough; this avoids the
shoeshining effect that older drives have in that case:  stop, rewind
a little, start again, etc. which results in a very slow tape, and
heavy wear and tear on the mechanics of the drive.


I'll try to schedule some time to run amtapetype at work, but we're in the
middle of a move, so I might not get the chance for a while, and if I can
safely get more speed and space out of my tapes with such a small change,
I'd like to do so.

Software compression is generally better (if you have the CPU power
available), because then amanda can do better calculations of tape
capacity.  In that case, disable hardware compression, because otherwise
you lose tape capacity (and speed).

If you're lucky enough to own a tapedrive with a decent hardware compression
algorithm that does not behave badly on already compressed data, then you
can have the best of both worlds:  use software compression to let amanda
handle the calculations better, and leave some data uncompressed for
those computers where you need the CPU power for other purposes.


--
Paul Bijnens, Xplanation                            Tel  +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM    Fax  +32 16 397.512
http://www.xplanation.com/          email:  Paul.Bijnens AT xplanation DOT com
***********************************************************************
* I think I've got the hang of it now:  exit, ^D, ^C, ^\, ^Z, ^Q, ^^, *
* F6, quit, ZZ, :q, :q!, M-Z, ^X^C, logoff, logout, close, bye, /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt,  abort,  hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e,  kill -1 $$,  shutdown, *
* init 0, kill -9 1, Alt-F4, Ctrl-Alt-Del, AltGr-NumLock, Stop-A, ... *
* ...  "Are you sure?"  ...   YES   ...   Phew ...   I'm out          *
***********************************************************************
