Amanda-Users

Re: large dumps - 2.4.2

2007-03-22 15:00:59
Subject: Re: large dumps - 2.4.2
From: Jon LaBadie <jon AT jgcomp DOT com>
To: amanda-users AT amanda DOT org
Date: Thu, 22 Mar 2007 11:08:23 -0400
On Thu, Mar 22, 2007 at 10:29:40AM +0100, Jurgen Pletinckx wrote:
> 
> <Jon LaBadie>
> | Guessing here.  You are using DLT tape with a 35GB "native" capacity
> | and believe the marketing hype that they are "70GB tapes".
> | 
> | Further guessing.  The previous amanda admin is using software
> | compression (gzip) rather than letting the hardware compress things
> | on the fly.  This is very typical and normal.  If so, amanda wants
> | to know the native capacity of the tape and that is what is
> | specified in the "tapetype" setting.  This is probably between 33
> | and 35GB, measured with the amtapetype program.  If amanda has a
> | history of these DLEs it knows their compressibility.  It may be
> | more or less than the frequently claimed 50%.
> | 
> | OTOH, if hardware compression is being used, most amanda admins find
> | the 50% compression claim of the drive manufacturer to be optimistic.
> | Thus your admin may have listed the tapetype capacity of the drive
> | as something lower than 70GB.
> 
> I'm entirely unaware of marketing hype. Or truth, for that matter.

A tape holds a fixed maximum amount of data (0 and 1 bits), called its
native capacity.  When your information is passed through a compressor
(hardware or software), those stored bits can represent a larger amount
of your information.  Manufacturers guess at how much your data will
compress using their compressors and list their drives/tapes at the
larger number.  For many formats the guess is 50% compression (2:1).
Thus your 35GB tape is claimed to be a 70GB tape.
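
As a rough sketch of that arithmetic (Python; the function name is my
own for illustration, nothing from amanda), the advertised figure is
just the native capacity divided by the fraction of data left after
compression:

```python
def marketed_capacity_gb(native_gb, saved_fraction):
    """Amount of original data that fits on the tape if compression
    removes `saved_fraction` of it (0.5 means "50% compression")."""
    if not 0 <= saved_fraction < 1:
        raise ValueError("saved_fraction must be in [0, 1)")
    return native_gb / (1 - saved_fraction)

# The manufacturer's guess of 50% turns 35GB native into the "70GB" label.
print(marketed_capacity_gb(35, 0.5))   # 70.0
# Data that does not compress at all only gets the native capacity.
print(marketed_capacity_gb(35, 0.0))   # 35.0
```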

The problem with that is that different data compress to differing
degrees.  You may have a DLE with lots of user text files that compress
80%.  Another may have mostly binary executables and only compress 35%.
Or you could even have already compressed data, including many
media formats, that either won't compress further or may actually
expand when processed by the hardware 'compressor'.
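
You can see the effect for yourself with a small Python sketch.  It
uses zlib (the same deflate algorithm as gzip) as a stand-in; it is not
the algorithm in the drive, and the sample data are made up, with
random bytes standing in for already compressed files.

```python
import os
import zlib

def saved_fraction(data):
    """Fraction of space saved by zlib at its default level.
    A negative result means the 'compressed' output is larger."""
    return 1.0 - len(zlib.compress(data)) / len(data)

# Highly repetitive text, like many user text files.
text = b"The quick brown fox jumps over the lazy dog.\n" * 2000
# Random bytes behave much like already compressed data.
random_bytes = os.urandom(len(text))

print("text:   %.1f%% saved" % (100 * saved_fraction(text)))
print("random: %.1f%% saved" % (100 * saved_fraction(random_bytes)))
```

The first figure should be large and the second close to zero or
slightly negative, which is the "expansion" case.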

Sometimes I also feel manufacturers use base 10 numbers rather
than base 2.  So 1K is 1000 bytes rather than 1024.  That makes a
significant difference when you get to GB: about 73MB for each GB
(2^30 - 10^9 = 73,741,824 bytes).
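
The difference is easy to check.  A minimal sketch in Python (the
constant names are mine, and the 35GB figure is just the native
capacity discussed above, assumed here to be quoted in base 10):

```python
DECIMAL_GB = 10**9   # "marketing" gigabyte: 1000 * 1000 * 1000 bytes
BINARY_GB = 2**30    # base 2 gigabyte: 1024 * 1024 * 1024 bytes

# Bytes lost per GB when the label uses base 10.
shortfall = BINARY_GB - DECIMAL_GB
print(shortfall)             # 73741824
print(shortfall / 10**6)     # 73.741824 (in base 10 MB)

# A tape labelled 35GB in base 10, expressed in base 2 GB.
print(round(35 * DECIMAL_GB / BINARY_GB, 2))
```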

Using software compression is expensive in terms of cpu cycles
and system load.  In return you get better compression than
most hardware compressors, no expansion of already compressed
data, and better control of what is actually sent to the tape
and how much tape is available.

> This is what I saw in amanda.conf:
> 
> tapetype DLT-7000
> [snip]
> # taken from http://www.cs.columbia.edu/~sdossick/amanda/
> define tapetype DLT-7000 {
>         comment "DLT-IV op DLT-7000 drive"
>         length 33000 mbytes
>         filemark 8 kbytes
>         speed 5 mbytes
> }
> 
> Aaaaand you're right. Dunno how I came up with that 70G figure.
> Hrm. Combined with variable compression rates, that would account 
> for the [dump larger than tape, but cannot incremental dump 
> skip-incr disk] business. 
> 
> Where would I look for hardware vs software compression?

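The 'compress' setting of the DLE, which it normally inherits from its
dumptype in amanda.conf.  If it says 'compress none', amanda is doing
no software compression and any compression is left to the drive.

The settings for each DLE can be dumped with amadmin: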

  amadmin <config> disklist [ <client> [ <DLE name> ] ]
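
If the output is long, a small wrapper can pick out just the compress
lines.  This is only a sketch: I am not assuming anything about the
exact layout of the amadmin output beyond the setting appearing on a
line containing the word "compress", and the function names and the
config name in the usage comment are placeholders.

```python
import subprocess

def compress_lines(disklist_text):
    """Return the stripped lines of amadmin output that mention 'compress'."""
    return [line.strip() for line in disklist_text.splitlines()
            if "compress" in line.lower()]

def disklist_compress(config, client=None, dle=None):
    """Run 'amadmin <config> disklist [client [dle]]' and filter its output."""
    cmd = ["amadmin", config, "disklist"]
    if client:
        cmd.append(client)
        if dle:
            cmd.append(dle)
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return compress_lines(result.stdout)

# Usage, with a placeholder config name:
#   for line in disklist_compress("DailySet1"):
#       print(line)
```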

-- 
Jon H. LaBadie                  jon AT jgcomp DOT com
 JG Computing
 4455 Province Line Road        (609) 252-0159
 Princeton, NJ  08540-4322      (609) 683-7220 (fax)
