Amanda-Users

Re: Backup is too slow - configuration?

2003-01-22 14:35:21
Subject: Re: Backup is too slow - configuration?
From: Gene Heskett <gene_heskett AT iolinc DOT net>
To: hochenaw <dalton AT utanet DOT at>, amanda-users AT amanda DOT org
Date: Wed, 22 Jan 2003 13:55:32 -0500
On Wednesday 22 January 2003 09:05, hochenaw wrote:
>Hi,
>
>we have an HP Surestore DLT VS80e and use HP DLT IV tapes
>(capacity 40/80GB) under Linux.
>But the backup is too slow; we should have a backup rate of 10.8 GB
>(native)/20.6 (compressed), but we do not get that :-(
>So I have some questions to which I have not found answers until
>now!

How did you arrive at those figures?

>In the file tapelist we have the following entry (I don't know why):
>-------------------------------------------------
>define tapetype HP-DLT {
>      comment "HP SureStore DLT"
>      length 80 gbytes           # conservative estimate
>      filemark 1 byte            # should work given above
>      speed 30 mbytes            # even more, but this isn't used in amanda
>}
>-------------------------------------------------
80 gigs would not be what our experience would call a "conservative" 
estimate.  Probably optimistic by several gigs.

>Is length given for compressed (80) or uncompressed (40) mode?
>What does filemark mean?

Each drive has its own method of laying down a file separator 
marker, with some drives not even needing it.  I don't know about 
the DLT though.  Amtapetype will give you this data too.

>Is speed in mbytes per minute or ...? Does it have any consequence
>if I enter too high or too low a speed?

Speed is (I think) how fast the drive can stream data in xxxx per 
second.  I think that's only used by amanda to estimate how long it 
will take to write the file currently being processed in an 
amstatus /config/ report.  I don't actually know if this is used 
for other purposes too.

To get a much better idea of the tape's actual capacity, the latest 
snapshots have an 'amtapetype' command; older versions call it just 
tapetype.  Run this against a tape you can afford to lose the data 
on, and it will give you a much better description that you can 
incorporate into your tapetype list in amanda.conf.

>How can I activate software or hardware compression (I can't find an
>entry in amanda.conf, just client or server best in the
>dumptypes file)?

Hardware depends on the OS in use.  Some OSes have a choice of 
compressed or uncompressed drives in the device list and will turn 
the drive's compression on and off according to the device name you 
used to address it.  Linux does not, however, so one must find the 
switch setting on the drive itself that turns this on/off.

Software is controlled by the entries in the 'dumptype' you choose.

Off is the generally recommended hardware setting for use with 
amanda because if the machines have the horsepower to do their own 
compressing, they can often beat the hardware compression by a 
useful margin, thereby putting more on the tape than the hardware 
compressor can.  Here, the difference is about 10%.  Even though I 
do not compress every entry in the disklist, the overall average 
output size delivered to the media is about 40% of what is on the 
drives.  That's a very useful advantage over the 'hardware' model 
IMO.

If you are setting up a client/server lashup, you can also have the 
clients do the compressing, which has the added advantage of 
lowering the network bandwidth to move the files to the server, and 
thereby speeding things up a bit since several clients can be 
compressing at once whereas the server can only do one file at a 
time.
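
A minimal sketch of what client-side compression could look like in a 
dumptype, assuming GNU tar is the dump program.  The dumptype name is 
made up for illustration, and whether 'fast' or 'best' suits you is 
something to test on your own clients:

```
define dumptype comp-client-tar {      # hypothetical name
    comment "tar dump, compressed on the client"
    program "GNUTAR"
    compress client fast               # alternatives: compress client best, compress server fast
}
```

A disklist entry would then name this dumptype in its third field.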

Also, with the hardware turned off, amanda can do a much better job 
of estimating what will fit on the tape because amanda counts bytes 
sent to the drive regardless of the content of that byte.  With 
hardware on, you can almost double the size in the tape type, but 
then amanda has no way of knowing how much the data is actually 
compressed so you have to reduce this double by a fudge factor.
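
To make the two cases concrete, here is a sketch of how the tapetype 
length might differ.  All numbers are placeholders rather than 
measurements (amtapetype should supply the real ones), the tapetype 
names are invented, and the 25% fudge factor is only an assumption:

```
# Hardware compression off: length tracks the native capacity.
define tapetype HP-DLT-native {        # hypothetical name
    comment "DLT IV, hardware compression off"
    length 38 gbytes                   # placeholder, a little under the 40 GB native figure
    filemark 0 kbytes                  # placeholder, use the value amtapetype reports
    speed 3000 kps                     # placeholder, assumes the 10.8 GB native figure is per hour
}

# Hardware compression on: about double native, cut back by a fudge factor.
define tapetype HP-DLT-hwcomp {        # hypothetical name
    comment "DLT IV, hardware compression on"
    length 60 gbytes                   # placeholder: 80 GB less an assumed 25%
    filemark 0 kbytes                  # placeholder
    speed 3000 kps                     # placeholder
}
```

Only one of these would be named in the 'tapetype' line of amanda.conf, 
matching however the drive is actually set.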

When running the amtapetype (or tapetype) program, if the hardware 
compressor is on, it will often give a very conservative value for 
the tape size and it won't be anywhere near the size in the maker's 
propaganda.  That's because this utility uses /dev/urandom as the 
data source, and the data from /dev/urandom will often expand by 
10-20% in going thru a hardware compressor.

>I have different directories to back up, and I want to pack them in
>_one_ big package, so I specified a chunk size of 20GB, but Amanda
>creates one package/file for each entry in the disklist.

That's doable, but a 1 gig chunk size will keep you from running 
into filesystem limits, often at the 2 gig mark.  Trying to pack it 
all into one big file has no real value, as amdump simply starts at 
the head of the tape and writes one file per disklist entry until it 
is done.  This way, if you needed to recover, say, 
/etc/X11/XF86Config, amanda will know it can skip all the other 
files on the tape and go directly to the file containing the one 
you want.

One big file also means that, since amanda cannot span a single 
disklist entry across a second tape, if it's too big for the tape it 
will restart that disklist entry on the next tape until it has either 
used all the tapes or run out of "runtapes".  Not a desirable 
condition.
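
For reference, the chunk size is set per holding disk in amanda.conf.  
A sketch, with a made-up directory and a placeholder for the space 
allowance:

```
holdingdisk hd1 {
    comment "hypothetical holding disk"
    directory "/dumps/amanda"          # assumed path, use your own
    use 20 Gb                          # placeholder for how much space amanda may use
    chunksize 1 Gb                     # keeps each holding file under the 2 gig limit
}
```

The chunks only exist on the holding disk; each disklist entry still 
goes to tape as one file.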

>A hardware question: If I have many packages to flush out to tape,
>are they written at once so that the tape goes into streaming
>mode, or does the tape have to be stopped and restarted for each
>new package?

With attention to the 'dumporder' string in amanda.conf, once the 
drive starts, it will stream until the backup run is done.  Here, I 
use a string of capital "S"es so it starts with the biggest, and 
works toward smaller.  By the time the biggest is written, usually 
the other 37 entries have been processed and compressed and are 
ready to go, so the drive itself never stops till done.  Feel free 
to experiment, or even follow the advice about it in the comments 
of amanda.conf.
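
As a sketch of the setting described above, assuming four dumpers 
(the 'inparallel' value is a placeholder, not what I run here):

```
inparallel 4          # placeholder: number of dumpers running at once
dumporder "SSSS"      # one letter per dumper; S = biggest size first, s = smallest first
```

The comments in the sample amanda.conf list the other letters (t/T for 
time, b/B for bandwidth) if you want to experiment.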

>What happens if I delete some files in the holding disk manually?
>Does it have consequences, and is the database corrupted?

Not a good idea, although I have done it.  One really should amflush 
them, because amanda won't redo them until they are next due.  The 
database corruption, if any, will be self healing in runspercycle 
runs I believe, so it's not fatal unless you need it the next day.

>What does "ignoring cruft file" mean as a log message?

You've probably got a non-amanda generated file in the holding disk 
area.  Amanda doesn't know what to do with those and calls them 
cruft files.

>I would be happy if someone can help me!
>Maybe someone has the same configuration!
>Thx a lot, Dalton

HTH

-- 
Cheers, Gene
AMD K6-III@500mhz 320M
Athlon1600XP@1400mhz  512M
99.22% setiathome rank, not too shabby for a WV hillbilly
