Amanda-Users

Re: to compress or not to compress ???

2003-07-03 14:56:14
Subject: Re: to compress or not to compress ???
From: Gene Heskett <gene.heskett AT verizon DOT net>
To: "Michael D. Schleif" <mds AT helices DOT org>, amanda mailing list <amanda-users AT amanda DOT org>
Date: Thu, 3 Jul 2003 14:51:39 -0400
On Thursday 03 July 2003 11:42, Michael D. Schleif wrote:
>Yes, I am learning -- at the expense of many questions ;>
>
>First, a brief overview:
>
>I have five (5) Linux servers, totaling ~50 Gb used diskspace,
> divided roughly even across all five.
>
>I have several DAT tape drives, the largest of which is an HP DDS-3.
>  I have twelve (12) DDS-3 tapes, and twenty (20) DDS-2 tapes, as
> well as several cleaning tapes.
>
>So far, I have configured:
>
>    dumpcycle     7
>    runspercycle  7
>    runtapes      1
>
You left out tapecycle, which is the number of tapes in the rotation 
pool. In this case it should be not less than 15.
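
A rough sketch of where a figure like 15 can come from. The rule used here (keep two full dump cycles on tape, plus one spare) is an assumption that happens to reproduce the number above; it is not taken from this message or from Amanda's own code, and the function name is purely illustrative.

```python
def suggested_tapecycle(runspercycle: int, runtapes: int,
                        cycles_kept: int = 2, spares: int = 1) -> int:
    """Rule-of-thumb tape count (an assumption, not Amanda's own logic)."""
    # Tapes consumed by one dump cycle, times the cycles we want to keep,
    # plus a spare so the newest run never overwrites the only full dump.
    return cycles_kept * runspercycle * runtapes + spares


# Values from the configuration quoted above.
needed = suggested_tapecycle(runspercycle=7, runtapes=1)
dds3_on_hand = 12  # DDS-3 tapes listed in the question

print(needed, needed - dds3_on_hand)
```

Under that assumed rule, the twelve DDS-3 tapes mentioned in the question would leave the pool three tapes short.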

>I am studying _Using Amanda_ here:
>
>       <http://www.backupcentral.com/amanda.html>
>
>I am confused about two (2) things:
>
>[1] Should I use hardware compression?

Not if you can help it, for the reasons I'll develop.
>
>There seem to be several schools of thought here.  I want to know
> how Amanda works with hardware compression?

Amanda can use hardware compression, but since the hardware compressor 
hides the true tape capacity from Amanda, you have to fudge the 
tapetype length entry with a guess, often by significant amounts.

>  What are the
> advantages of using software compression?

Amanda can know quite well how much a tape can hold because it counts 
bytes of compressed data fed to the drive.  The tradeoff of course is 
cpu horsepower to do the compression.  In a client-server world, the 
compression can be offloaded to the client, and several clients can 
be doing their compression in parallel, so it's not as big a concern 
as it first appears.
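
To illustrate the byte-counting point, here is a minimal sketch using Python's zlib as a stand-in for the gzip-style compression done in software. It is not Amanda's code; the sample data and the `compressed_size` helper are hypothetical. The point is only that the size of the compressed stream is known exactly before anything reaches the drive.

```python
import os
import zlib


def compressed_size(data: bytes, level: int = 6) -> int:
    """Return the exact number of bytes that would be sent to the tape."""
    return len(zlib.compress(data, level))


one_mb = 1_000_000
sparse = b"\0" * one_mb           # highly compressible, like sparse files
incompressible = os.urandom(one_mb)  # stands in for .tar.gz / .bz2 input

# Both counts are exact, so they can be summed against the tape length.
print(compressed_size(sparse))
print(compressed_size(incompressible))
```

The second figure comes out at roughly the input size, which is the same effect that hurts a hardware compressor on already-compressed files; the difference is that here the count is visible.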

>  What are the
> disadvantages of using *both* hardware and software compression?
>
With hardware smunching, Amanda has no idea how much data has actually 
been written to the tape.  Sparse stuff can compress to maybe 1/2.6 
of its original size, but Amanda doesn't have any way of knowing 
that.  OTOH, feed a bunch of tar.gz's and .bz2's to that hardware 
compressor and it will get a tummy ache and make the output data 
stream as much as 15% bigger than the input was.
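
A back-of-the-envelope sketch of what those two figures mean for one tape. The 12,000 MB value is the nominal native capacity of a DDS-3 cartridge and is an assumption added here; the 1/2.6 and 15% ratios are the ones quoted above, and the helper name is illustrative only.

```python
NATIVE_MB = 12_000  # nominal DDS-3 native capacity (assumption, not from the message)


def input_that_fits(native_mb: float, output_per_input: float) -> float:
    """MB of uncompressed input that fills the tape at a given ratio.

    output_per_input is bytes written to tape per byte fed to the drive.
    """
    return native_mb / output_per_input


best_case = input_that_fits(NATIVE_MB, 1 / 2.6)  # sparse data shrinks
worst_case = input_that_fits(NATIVE_MB, 1.15)    # compressed data grows 15%

print(round(best_case), round(worst_case))
```

The same cartridge holds anywhere from about 10.4 GB to about 31.2 GB of input depending on the data, which is why no single tapetype length is right when the drive does the compressing.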
>
>[2] What are the optimal dumptypes for my situation?
>
>Yes, I have already struggled with and overcome dump vs. GNUTAR.  My
>first mistake was using comp-root and comp-user on localhost.
>
>Yes, I understand that Amanda can facilitate planning and scheduling
> full vs. incremental backups.
>
>However, I am concerned about developing a simple recovery strategy.
>  I am currently having problems with amrecover; but, I think that
> is due to short vs. FQDN usage -- so, I'll save that for another
> time.

Just make sure that your tar is at least 1.13-19, and preferably 
1.13-25.  Indexes are fubared in earlier versions.

>I am running exclusively Debian woody on all systems.  I believe
> that I have a good working filesystem design.  I am on a fast
> network.
>
>Yes, I come from a traditional system administrator's backup
> mindset, and I do not want that to undermine Amanda's design.
>
>
>What do you think?

One thing to be aware of is that a tape, once written in the 
compressed mode, remembers that, and will override your choices 
unless you go to a rather detailed method of removing the compressed 
flags.

-- 
Cheers, Gene
AMD K6-III@500mhz 320M
Athlon1600XP@1400mhz  512M
99.26% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attorneys please note, additions to this message
by Gene Heskett are:
Copyright 2003 by Maurice Eugene Heskett, all rights reserved.