Amanda-Users

Re: hardware vs software compression (was Re: amflush/amcheck not in sync?)

2003-04-24 10:15:29
Subject: Re: hardware vs software compression (was Re: amflush/amcheck not in sync?)
From: Gene Heskett <gene.heskett AT verizon DOT net>
To: Jeroen Heijungs <Jeroen.Heijungs AT Het-Muziektheater DOT nl>, amanda-users AT amanda DOT org
Date: Thu, 24 Apr 2003 10:10:35 -0400
On Thu April 24 2003 03:41, Jeroen Heijungs wrote:
>At 12:23 23-04-03 -0400, Gene wrote:
> >To allow amanda to have a good view of the tape, and because
> > gzip can compress better than the hardware, sometimes by large
> > amounts, this group generally recommends that the drive's
> > compressor be shut off permanently.  It's less amanda hassle in
> > the long view.
>
>It was recommended to me to NOT use software compression, with
>the following reason:
>
>"If a software compressed file is damaged, the complete file/tape
> is not readable anymore and therefore useless. If it is NOT
> compressed the rest of the tape/file may be readable, and
> therefore probably restorable."

This isn't something I've observed.  The file structure on the
tape consists of a 32k tape id header, then, for each entry in the
disklist, a 32k file header, which describes the file in plain
English plus how to recover it, followed by the file itself.  It's
quite well laid out, and the 2 bare metal recoveries I've done here
were without incident.  While it's possible that a gzipped file
could be damaged by a tape error, IMHO the file is going to be
damaged by that same tape error even if it wasn't compressed.
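
The layout described above (a 32k header in front of each dump file) can be sketched roughly as follows. This is only an illustration in Python, not Amanda's own code: the function name `split_dump_file`, the NUL padding of the header block, and the header wording in the sample data are all assumptions made for the example.

```python
BLOCK = 32 * 1024  # 32k header size, as described in the post


def split_dump_file(image: bytes):
    """Split one on-tape dump file into (header_text, payload).

    Assumes the first 32k block is a plain-text header and that any
    unused part of the block is NUL padding (an assumption).
    """
    if len(image) < BLOCK:
        raise ValueError("image is shorter than one 32k header block")
    header = image[:BLOCK].rstrip(b"\0").decode("ascii", errors="replace")
    return header, image[BLOCK:]


# Synthetic stand-in for a file read from tape; the header text is invented.
fake_header = b"HYPOTHETICAL HEADER: host /etc level 0\n".ljust(BLOCK, b"\0")
header_text, payload = split_dump_file(fake_header + b"dump bytes")
print(header_text.strip())
print(len(payload), "payload bytes follow the header")
```

Because the header sits in its own block ahead of the data, it stays readable as plain text whether or not the payload behind it was compressed.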

>I have never had the opportunity to test this, does anyone have
> some thoughts and/or comments on this?
>I now use the hardware compression, and not the software
> compression, the tapes are big enough, so there is no real
> problem for the time being.
>
As long as the tapes are big enough, it's not a problem, but you
usually have to tell amanda the tapes are 15 to 20% smaller than
the propaganda claims, because while your /etc dir may compress very
well, that dir full of archives or music is going to expand in the
hardware compressor, as that sort of stuff has already been
smunched and isn't further compressible.
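
The point about already-compressed data can be tried out with a small sketch. It uses Python's zlib as a stand-in, which is NOT the algorithm a tape drive's hardware compressor uses, and random bytes as a stand-in for archive or music files; the sample data and the name `compressed_fraction` are made up for the illustration.

```python
import os
import zlib


def compressed_fraction(data: bytes) -> float:
    """Return compressed size as a fraction of the original size."""
    return len(zlib.compress(data, 6)) / len(data)


# Repetitive text stands in for something like /etc (assumption).
text_like = b"# sample config line = value\n" * 40000
# Random bytes stand in for data that has already been compressed.
random_like = os.urandom(len(text_like))

text_fraction = compressed_fraction(text_like)
random_fraction = compressed_fraction(random_like)
print(f"repetitive text: {text_fraction:.1%} of original size")
print(f"random bytes:    {random_fraction:.1%} of original size")
```

The random input typically comes out at or slightly above 100% of its original size, which is the effect that makes a drive's advertised compressed capacity optimistic for such data.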

Using gzip and the uncompressed tape capacity often gets you
surprising results.  From last night's run (I have a DDS2 changer):
------------
These dumps were to tape DailySet1-26.
The next tape Amanda expects to use is: DailySet1-27.


STATISTICS:
                          Total       Full      Daily
                        --------   --------   --------
Estimate Time (hrs:min)    0:28
Run Time (hrs:min)         4:00
Dump Time (hrs:min)        1:41       1:28       0:13
Output Size (meg)        3186.8     3000.7      186.1
Original Size (meg)      9166.6     8785.4      381.2
Avg Compressed Size (%)    33.9       34.0       32.4   (level:#disks ...)
Filesystems Dumped           37         10         27   (1:26 3:1)
Avg Dump Rate (k/s)       540.6      582.6      249.9
--------------
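
As a rough cross-check of the report above, the output-to-original ratio can be worked out from the two size rows. This is a minimal sketch; the dictionary and function names are made up for the example, and only the numbers come from the report. The whole-run figures come out at roughly 35% for the Total column, which is not identical to the "Avg Compressed Size" row, presumably because that row covers only the software-compressed entries (an assumption, not something the report states).

```python
# (Output Size, Original Size) in megabytes, copied from the report.
REPORT_MEG = {
    "Total": (3186.8, 9166.6),
    "Full": (3000.7, 8785.4),
    "Daily": (186.1, 381.2),
}


def output_percent(output_meg: float, original_meg: float) -> float:
    """Output size as a percentage of the original size."""
    return 100.0 * output_meg / original_meg


percents = {name: output_percent(*sizes) for name, sizes in REPORT_MEG.items()}
for name, pct in percents.items():
    print(f"{name}: {pct:.1f}% of original size")
```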
The hardware compressor cannot match that.  That was a mixed bag of
both straight and gzip compressed entries in the disklist.  And I
had about 600 megs worth of tape left, as I run it a bit small to
assure room on the end for a separate, non-amanda done,
uncompressed tarball of tonight's indices and the configs that were
used.  It's part of the wrapper script that runs amdump.  That
typically appends about 60 megs to the tape, but it's insurance to
me in that if I have to do a bare metal recovery, I can do an seod,
then back up two filemarks and 'dd' recover the configs and indices
first, thereby simplifying the rest of the recovery.

-- 
Cheers, Gene
AMD K6-III@500mhz 320M
Athlon1600XP@1400mhz  512M
99.26% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attornies please note, additions to this message
by Gene Heskett are:
Copyright 2003 by Maurice Eugene Heskett, all rights reserved.