Amanda-Users

Re: Getting Amanda to production

2005-02-04 19:05:53
Subject: Re: Getting Amanda to production
From: Gene Heskett <gene.heskett AT verizon DOT net>
To: amanda-users AT amanda DOT org
Date: Fri, 4 Feb 2005 18:59:17 -0500
On Friday 04 February 2005 14:43, Gil Naveh wrote:
>hello,
>
>I have a few questions in mind so I'll be able to better configure
> it. 1) So far I tested Amanda and backed up data on a hard drive
> and am quite happy with the results. Recently we got a new tape
> drive and I need to start backing up our data onto tapes. I am going
> to use the same Amanda server and clients. Should I, and if so how do
> I, initialize Amanda so it won't care about previous backups?
> Additionally I'll probably have to do some more testing in order to
> see how much time/bandwidth Amanda uses when it backs up to our tape
> drive. But after doing those tests - can I delete those files from
> the tape?
>
>2) I understand that Amanda has its algorithm that decides when to
> do a full backup, or an incremental one.
>   But I am a little confused with labeling tapes. (amlabel)
>   When I used the hard-disk I define each hard-drive 'partition' as
> 4GB and then amlabel each disk e.g.
>   #/usr/sbin/amlabel DailySet1 HISS01 slot 1
>   #/usr/sbin/amlabel DailySet1 HISS02 slot 2 ...
>   However, I have tapes that each can store 400 compress data,

You meant 400MB, or 400GB?

>   and 
> we have to store about 30GB - am I doomed to use each tape for one
> backup - or can I 'partition' those tapes?

Amanda does not, to my knowledge, support partitioned tapes.  I think 
it would go against amanda's basic premise of keeping the data as 
safe as possible at all times.  Therefore amanda uses a different tape 
each run, and only re-uses a tape when it's time to re-use it.  It 
will not let you 'accidentally' overwrite last night's backup with 
tonight's unless _you_ force it to.

The usual practice is to set up the schedule, which amanda uses only 
as a guide, for a dumpcycle of say 7 days.  You'll want at least 14 
tapes in order to have a complete backup image available at all 
times, and possibly even to go back a few days if something wrong 
isn't promptly discovered.  So you have 14 tapes, and the tapecycle 
then is 14 tapes.  Then you tell amanda how many runs there are in 
that dumpcycle, like maybe you only run tuesday morning to saturday 
morning, 5 times a week, so the runspercycle then would be set to 5.  
Amanda then looks at what she has to do in that time frame, and will 
adjust the schedule of who gets a level 0 and who gets incrementals 
in order to satisfy the schedule you have given amanda as a target 
AND to try and use about the same amount of the media each night.  
This 'balance' adjustment is an ongoing process, and you'll get 
emails after every run describing what was done during the run just 
completed.  With anything sane for the target numbers, I've never 
seen amanda put off a level 0 that was needed by more than 1 day.
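
As a rough sketch of the numbers above (the variable names are mine, 
and one tape per run is assumed), this prints the three amanda.conf 
settings and the retention they imply:

```shell
#!/bin/sh
# Values taken from the example above; adjust to your site.
dumpcycle_days=7    # target: every DLE gets a level 0 within 7 days
runspercycle=5      # amdump runs per dumpcycle (tue-sat mornings)
tapecycle=14        # labelled tapes in the rotation

# Assuming one tape per run: complete dumpcycles held on tape at once.
full_cycles=$(( tapecycle / runspercycle ))
# Approximate calendar days before the oldest tape comes up for reuse.
days_retained=$(( tapecycle * dumpcycle_days / runspercycle ))

# The matching amanda.conf lines.
echo "dumpcycle $dumpcycle_days days"
echo "runspercycle $runspercycle"
echo "tapecycle $tapecycle tapes"
echo "# complete dumpcycles on tape: $full_cycles"
echo "# approx. days before a tape is reused: $days_retained"
```

With 14 tapes and 5 runs a week that is two complete dumpcycles on 
tape, with a few runs to spare.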

>  Do I have to put each tape in the tape drive and run amlabel?

Yes, that's the required method.
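
To save some typing, here is a small sketch that only prints the 
amlabel commands, one per tape, so you can load each tape and run its 
line by hand.  The function name is made up, and the DailySet1 config 
and HISS prefix are simply borrowed from your example; the labels 
must still match the labelstr in your amanda.conf.

```shell
#!/bin/sh
# label_cmds CONFIG PREFIX COUNT
# Prints one amlabel command per tape; it does not touch the drive.
label_cmds() {
    config=$1
    prefix=$2
    count=$3
    i=1
    while [ "$i" -le "$count" ]; do
        # Two-digit, zero-padded tape number, e.g. HISS01 ... HISS14
        printf '/usr/sbin/amlabel %s %s%02d\n' "$config" "$prefix" "$i"
        i=$(( i + 1 ))
    done
}

label_cmds DailySet1 HISS 14
```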

>3) We have local and remote sites that we have to back up. Any
> thoughts or comments on what I should keep in mind before doing so?

Yes.  It would be advisable to stay away from anything on a samba 
share.  It doesn't support ctime, so the data always looks new and 
gets dumped in full every night even if the nominal level has 
advanced to 4.  Install an amanda client setup on each of the 
machines and use that instead.  Or use rsync, which I mention below.

> Additionally any recommendation for what to use to secure our data
> when backing up remotely? I read that I can use TCP wrappers but
> how do I implement it with Amanda?

Tcpwrappers is a security control tool, and wouldn't hurt anything 
once set up to pass the amanda traffic.  I use it on my firewall box, 
watching the internet side of things, along with portsentry and 
iptables.  Call me paranoid...  But as far as security of the data is 
concerned, one would want to wrap the amanda traffic in an ssh tunnel 
in order to scramble it enough to be secure while in transit, if 
that's a concern.  I have not done that, so I'll defer to others here 
who may have and let them give the 'howto' advice.

I'd point out that rsync, once the initial images in the holding area 
have been made, does block checksums on the src and target files, and 
only exchanges the real data where the checksums are different.  That 
means much less of the real data crosses the wire once it's in place 
and running, although that is not a substitute for encrypting the 
link.

Bear in mind of course that this, along with gzip compression, does 
require some horsepower to be expended in the client machines, but a 
gzipped file also takes up less network bandwidth, so a multimachine 
environment won't be bringing the network to its knees quite so fast.
And the potential to speed up the backups by offloading the jobs to 
the clients, so that many of them can be doing their thing all at the 
same time, can save you several hours in a larger, 20 machine system.
The secret is to give each machine/drive a different, unique spindle 
number in the disklist entry, as amanda will not run more than one 
operation on the same spindle at the same time.
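
A minimal sketch of what such disklist entries might look like.  The 
hostnames and filesystems are invented, and comp-user-tar is just a 
dumptype I am assuming exists in your amanda.conf; the last field is 
the spindle number.

```shell
#!/bin/sh
# disklist_entry HOST DISK DUMPTYPE SPINDLE
# Prints one disklist line: hostname diskname dumptype spindle
disklist_entry() {
    printf '%s %s %s %s\n' "$1" "$2" "$3" "$4"
}

# Two filesystems on different physical drives: different spindles,
# so amanda is free to dump them at the same time.
disklist_entry sol1.example.com /export/home comp-user-tar 1
disklist_entry sol1.example.com /var         comp-user-tar 2
# A filesystem sharing a drive with /var gets the same spindle, so
# the two dumps are not run together and the heads don't thrash.
disklist_entry sol1.example.com /opt         comp-user-tar 2
```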

Generally, we don't recommend using the drive's hardware compression.  
It hides the true capacity of the tape from amanda, and the actual 
capacity is then a function of how compressible the data is.  A 
directory full of gzipped tarballs or rpm/debs/etc will probably be 
expanded by the hardware compressor unless it's smart enough to pass 
them untouched.  If amanda is in command of the compression, then the 
bytes amanda counts going down the cable can be counted quite 
accurately and 99% of the tape used without any errors.  Start out by 
compressing everything, and then remove the compression from those 
DLEs that your email says didn't compress by very much.  No use 
wasting time and horsepower with gzip if the file is only being 
compressed to 88% of its original size.  That's simply not worth the 
horsepower to do, when the /etc dirs often compress to less than 20% 
of the original size.
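
The weeding-out can be sketched like this.  The sizes are made-up 
figures of the kind you would copy out of the report email, and the 
80% cut-off is only my assumption, so pick your own.

```shell
#!/bin/sh
# compressed_pct ORIG_KB OUT_KB
# Compressed size as a whole-number percentage of the original.
compressed_pct() {
    echo $(( $2 * 100 / $1 ))
}

# worth_compressing ORIG_KB OUT_KB [LIMIT_PCT]
# Succeeds if the output is at or below LIMIT_PCT (default 80).
worth_compressing() {
    limit=${3:-80}
    [ "$(compressed_pct "$1" "$2")" -le "$limit" ]
}

# Illustrative figures only: "DLE original-KB compressed-KB"
for dle in "/etc 51200 9800" "/srv/tarballs 2048000 1802240"; do
    set -- $dle
    if worth_compressing "$2" "$3"; then
        echo "$1: $(compressed_pct "$2" "$3")% - keep compression"
    else
        echo "$1: $(compressed_pct "$2" "$3")% - consider 'compress none'"
    fi
done
```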

>4) Finally, we have one tape drive and about 20 servers that we need
> to back up. Some of those servers run on Solaris, some on
> Windows. Can I back up all servers on the same tape, or do I need to
> set separate tapes for Win machines?

Ouch, there's that word, windows.  No, they don't need separate tapes, 
but since amanda, other than in a cygwin environment, has not been 
made to run natively on windows, and that group hasn't kept us abreast 
of the progress of that effort, that does present a problem.

One solution is to run rsync earlier in the evening so it's done by 
the time amanda gets fired off, putting mirrors of the windows stuff 
in a local holding disk area, and then back up that holding area 
rather than trying to back up the windows boxes themselves.  There is 
a very good windows 'rsync' client.  In that case, you do the 
recovery back to the holding area, then interchange the src and 
target arguments to rsync, and restore the windows machines in that 
manner, all pretty much hands off.  With 200GB drives at the 
commodity pricing levels these days, that is much less of an expense 
than it would have been 2-3 years ago.  The solaris boxes shouldn't 
be a problem, as I think Jon LaBadie will advise you on those if 
there are any gotchas.
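
A sketch of the mirror-then-restore idea.  It only prints the rsync 
commands rather than running them, and the rsync daemon module 
(winbox::docs) and the local mirror path are hypothetical names, so 
substitute your own.

```shell
#!/bin/sh
# Hypothetical names: an rsync daemon module on the windows box and
# the local directory that amanda will back up.
WIN_SRC='winbox::docs/'
MIRROR='/holding/mirror/winbox/docs/'

# rsync_cmd SRC DST
# Prints (does not run) the rsync command copying SRC into DST.
rsync_cmd() {
    printf 'rsync -a %s %s\n' "$1" "$2"
}

# Evening mirror, scheduled to finish before amanda starts.
rsync_cmd "$WIN_SRC" "$MIRROR"
# Restore: recover into the mirror with amanda first, then run the
# same command with the two arguments swapped.
rsync_cmd "$MIRROR" "$WIN_SRC"
```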

>Many thanks,
>gil

-- 
Cheers, Gene
"There are four boxes to be used in defense of liberty:
 soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
99.32% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attorneys please note, additions to this message
by Gene Heskett are:
Copyright 2005 by Maurice Eugene Heskett, all rights reserved.
