On Wednesday 20 November 2002 08:13, marc.bigler AT day DOT com wrote:
>Hello,
>
>I have got here a DDS-3 tape drive which has per default hardware
>compression enabled and was wondering what is the best deal with
> AMANDA. Would you guys suggest hardware compression or should I
> disable hardware compression and have software compression done
> for example with gzip ?
>
Generally speaking, Marc, it's a bad idea to use the drive's
compressor.

The 1st reason is that it hides the true capacity of the tape from
amanda, which counts bytes *sent* to the drive after any compression
amanda does. Ordinary data can be compressed quite a bit, but
executables and archives such as tar.gz and rpm files are generally
already smunched, so the drive's compressor can't do much with them
and may in fact expand them somewhat. If you compile and run the
tape-src/tapetype program, you'll see almost the advertised
capacity of the drive if compression is off. With it on, the data
from /dev/urandom, which tapetype uses as its data source for this
test, isn't compressible and will probably expand, making tapetype
report a falsely low value for the tape's size.
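
The effect of random input on a compressor can be sketched in a few
lines of Python. This is only an illustration: zlib stands in for the
drive's hardware compressor (an assumption, the drive uses a different
algorithm), os.urandom stands in for /dev/urandom, and the repetitive
sample text is made up.

```python
import os
import zlib


def compressed_ratio(data: bytes, level: int = 6) -> float:
    """Return the compressed size as a fraction of the original size."""
    return len(zlib.compress(data, level)) / len(data)


# Random bytes, similar to the /dev/urandom stream tapetype writes.
random_block = os.urandom(1024 * 1024)

# Highly repetitive bytes, standing in for ordinary compressible data.
# (Invented sample text, not real amanda output.)
repetitive_block = b"amanda backup log line\n" * 50000

print(f"random data:     {compressed_ratio(random_block):.1%} of original")
print(f"repetitive data: {compressed_ratio(repetitive_block):.1%} of original")
```

The random block comes out slightly larger than it went in, which is
the same reason a tapetype run with hardware compression on reports
less than the tape's real capacity.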

The 2nd reason is that gzip can usually out-compress the drive's
hardware compressor, and usually by pretty obviously detectable
amounts, except in cases like the archive files that are already
smunched.

For example, if you do as I do, nearly all downloads go into one
directory, and this directory doesn't get compressed since it's a
waste of cpu cycles to do so. I have several others in my disklist
that also skip the compression. And I have in the past received mail
from amdump indicating it had used 3.5 GB of a 4 GB tape, and had
stored over 6.5 GB of source data on it.

Read the emails from amanda after each run. Any entry in the
disklist that gets a level 0 and shows a compression ratio over
100% should have its dumptype changed to one without compression,
since it's not further compressible. Level 1's and 2's that expand
to 320 or 640% are probably empty dirs, and can probably be deleted
from both the drive and the disklist. Here they just waste 64 blocks
of tape.
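
The rule of thumb above can be written down as a small check. This is
a sketch under assumptions: the DumpResult record, its field names and
the sample figures are all hypothetical, and the numbers would have to
be copied by hand from the amanda report mail, since nothing here
parses that report.

```python
from typing import List, NamedTuple


class DumpResult(NamedTuple):
    """One line of a dump summary (hypothetical structure)."""
    disk: str      # disklist entry
    level: int     # dump level (0 = full)
    orig_kb: int   # size before compression, in KB
    out_kb: int    # size after compression, in KB


def compression_percent(result: DumpResult) -> float:
    """Output size as a percentage of the original size."""
    return 100.0 * result.out_kb / result.orig_kb


def entries_to_uncompress(results: List[DumpResult]) -> List[str]:
    """Level 0 dumps that grew under compression (ratio over 100%)."""
    return [
        r.disk
        for r in results
        if r.level == 0 and r.orig_kb > 0 and compression_percent(r) > 100.0
    ]


# Invented figures, for illustration only.
sample = [
    DumpResult("/home/downloads", 0, 500000, 510000),  # already compressed
    DumpResult("/usr/src", 0, 800000, 190000),         # compresses well
    DumpResult("/mnt/empty", 1, 10, 32),               # tiny incremental
]

for name in entries_to_uncompress(sample):
    print(f"{name}: switch to a dumptype without compression")
```

Only level 0 dumps are flagged. A tiny incremental that shows 320%
says nothing about whether the data compresses, so it is left out of
the list.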

Some entries in the disklist will squeeze down to less than 25% of
their original size, so it's an overall plus to use gzip IF you have
the cpu horsepower to do it in a reasonable time frame. Here I find,
with an Athlon clocked at 1400 MHz, that even though I'm using
server-best, finished output from gzip piles up in the holding disk
waiting to be written to tape, since the tape can only do about
375 KB a second. Once started, the drive never stops till the run
is done. Under those conditions, the cpu effort to do the
compression is free (except for its impact on seti@home). :-)
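
A rough back-of-the-envelope version of that argument, using only the
figures quoted in this message (375 KB/s to tape, 3.5 GB written from
6.5 GB of source). The function names are made up for the example,
and 1 GB is assumed to mean 1024 * 1024 KB.

```python
TAPE_RATE_KB_PER_S = 375.0   # tape write speed quoted above
KB_PER_GB = 1024 * 1024      # assumption: binary gigabytes


def hours_to_write(tape_gb: float, rate_kb_s: float = TAPE_RATE_KB_PER_S) -> float:
    """Time for the drive to write tape_gb at a steady rate, in hours."""
    return tape_gb * KB_PER_GB / rate_kb_s / 3600.0


def input_rate_to_keep_streaming(ratio: float,
                                 rate_kb_s: float = TAPE_RATE_KB_PER_S) -> float:
    """Source KB/s gzip must consume so its output matches the tape rate.

    ratio is compressed size divided by original size (0 < ratio <= 1).
    """
    return rate_kb_s / ratio


overall_ratio = 3.5 / 6.5  # 3.5 GB on tape from 6.5 GB of source

print(f"time to write 3.5 GB: {hours_to_write(3.5):.2f} hours")
print(f"gzip input rate needed: "
      f"{input_rate_to_keep_streaming(overall_ratio):.0f} KB/s")
```

As long as gzip can chew through source data faster than the second
figure, the tape drive is the bottleneck and the compression costs no
wall-clock time.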

Some DDS drives keep a hidden header (the MRS system) on the tape
that records the compressor's status. A tape once written with the
compressor on will be compressed forever, regardless of the dip
switch setting, unless you turn off the compression with mt or
similar and then forcibly write an amount of data large enough to
cause a buffer flush in the drive. I've posted a short script to do
that several times here.
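
The general shape of such a procedure might look like the sketch
below. It is not the script referred to above. It assumes the Linux
mt-st version of mt (whose "compression" command takes 0 for off), a
non-rewinding device at /dev/nst0, and a block count picked out of
thin air; how much data a given drive needs before it flushes is not
stated here. Note that it writes zeros over the start of the tape,
amanda label included, so it defaults to a dry run that only prints
the commands.

```python
import shlex
import subprocess
from typing import List


def compression_off_commands(device: str = "/dev/nst0",
                             blocks: int = 200,
                             block_size: str = "32k") -> List[List[str]]:
    """Build the command sequence; device and sizes are assumptions."""
    return [
        ["mt", "-f", device, "rewind"],
        # mt-st syntax: argument 0 disables the drive's compression.
        ["mt", "-f", device, "compression", "0"],
        # Write enough zeros to push data through the drive's buffer.
        ["dd", "if=/dev/zero", f"of={device}",
         f"bs={block_size}", f"count={blocks}"],
        ["mt", "-f", device, "rewind"],
    ]


def run(commands: List[List[str]], dry_run: bool = True) -> None:
    """Print each command, and execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


# Dry run: prints the commands without touching any tape.
run(compression_off_commands())
```

After a real run the tape would need to be relabeled with amlabel
before amanda will use it again.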
--
Cheers Marc, Gene
AMD K6-III@500mhz 320M
Athlon1600XP@1400mhz 512M
99.19% setiathome rank, not too shabby for a WV hillbilly