Subject: Re: Unusual dump times?
From: Eric Siegerman <erics AT telepres DOT com>
To: amanda-users AT amanda DOT org
Date: Tue, 16 Sep 2003 14:09:29 -0400
On Tue, Sep 16, 2003 at 07:35:55AM -0400, Jack Baty wrote:
> With everything finally working, I'm wondering if my dump times are 
> excessive or to be expected.
> [...]
> marvin.fusio /usr        0 7009180 4311267  61.5 177:48 404.1 31:35 2275.2
> scooby.fusio /usr        1   55450    5443   9.8   8:49  10.3  0:11  515.1
> stewie       -e/projects 1     200     200    --   0:02  96.7  0:00 9069.9
> stewie       -e/software 1    2290    2290    --   0:06 379.9  0:01 3149.8
> stewie       hda2        2   14380    2535  17.6   1:26  29.6  0:03  996.3

That depends on so many things that it's hard to give a simple
answer: client hardware, O/S, dump vs. gtar, many small files vs.
fewer big ones, network technology, network saturation, etc. etc.
etc.  (And if you set all those out for me, I'd still have
difficulty saying "yes, it's reasonable" or "no, it isn't".)

> I plan to gradually include more machines 
> totalling about 20GB. If all the hosts take as long as marvin (below), 
> things could end up taking more than 12 hours to run.

Well, to do a full backup on them all, maybe.  But you won't be
doing that -- as with the run you quoted, most of the DLEs are
doing incrementals on any given night, so the one or two full
backups dominate the stats.

We run a two-configuration setup here (three actually, but two of
them are similar enough that for this discussion I'm treating
them as one):
  - A daily backup to disk (i.e. file:), which is a standard
    Amanda configuration of mixed fulls and incrementals

  - A weekly full backup of everything to tape

The weekly backup is about 50 GB, and takes about 23 hours.
That's why it runs Friday night :-)
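
For concreteness, the split looks something like this in
amanda.conf terms.  This is a minimal sketch, not our actual
files -- the paths, device names, and cycle lengths here are
invented, and a real config needs plenty more (tapetype,
holding disk, log directories, ...):

  # daily/amanda.conf -- mixed fulls and incrementals, to disk
  dumpcycle 7 days      # each DLE gets a level 0 at least weekly
  runspercycle 7        # one run per night
  tapedev "file:/space/vtapes/daily"    # invented vtape directory

  # weekly/amanda.conf -- level 0 of everything, to tape
  dumpcycle 0           # dumpcycle 0 means "full dump every run"
  runspercycle 1
  tapedev "/dev/nst0"                   # invented tape device

  define dumptype weekly-full {
      program "GNUTAR"
      record no         # don't update the dump databases, so the
                        # weekly fulls leave the daily config's
                        # incremental levels undisturbed
  }

The "record no" is the important trick when two configurations
share the same DLEs: without it, each config's runs would reset
the other's notion of what has already been dumped.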

But the last 30 dailies took between 0:37 and 4:33 each, with 80%
of them under 3 hours.  Sending them to tape would presumably
increase the total duration, but with enough holding disk, the
clients shouldn't be affected much.
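
That's because the holding disk decouples the clients from tape
speed: dumps land on it at whatever rate the client and network
can manage, and the taper drains it separately.  A minimal sketch
of the relevant amanda.conf stanza (directory and sizes invented):

  holdingdisk hd1 {
      directory "/dumps/amanda"   # invented spool directory
      use -2 Gb                   # use all of the filesystem but 2 GB
      chunksize 1 Gb              # split big images into 1 GB chunks
  }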

(There's lots of optimization I could do, for both configurations
-- I'm not at all happy with the level of parallelism I'm
getting.  So far I haven't needed to worry about it.)

So if you're using a standard configuration, where you let Amanda
schedule full and incremental backups, I'd add in the rest of
your DLEs and let Amanda run for a dumpcycle or two before
worrying too much about it.  Just add them in a few at a time, or
you *will* face some very long dump times at first, since the
first dump Amanda does of any given DLE has to be a level-0.
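
In disklist terms, that staggering is just a matter of how many
new entries you let in per run; the host names (and the dumptype
name, which comes from the stock example config) below are
placeholders:

  # disklist -- uncomment a few entries per run, so the forced
  # level 0s don't all land on the same night
  client1.example.com  /usr   comp-user-tar
  client2.example.com  /usr   comp-user-tar
  #client3.example.com /home  comp-user-tar   # next week
  #client4.example.com /var   comp-user-tar   # the week after

(Going the other way, "amadmin CONFIG force HOST DISK" will make
the next run do a level 0 of that DLE sooner than Amanda would
otherwise have scheduled one.)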


> Wondering if I 
> should just stop using compression.

Again, that would depend on just what the bottleneck is.  If it's
CPU usage on the client, try reducing from --best to --fast as
someone else suggested, or try changing to server-side
compression.  If it's network bandwidth, go the other way:  move
compression from server to client, and/or increase the
compression level until you start maxing out the client CPU.
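
In dumptype terms those knobs look like this; the dumptype names
are invented, but "compress {client|server} {fast|best}" is the
real directive, and fast/best correspond to gzip's --fast/--best:

  define dumptype thin-pipe {
      comment "network is the bottleneck: squeeze hard on the client"
      compress client best
  }

  define dumptype busy-cpu {
      comment "client CPU is the bottleneck: compress lightly..."
      compress client fast
  }

  define dumptype busy-client {
      comment "...or ship raw data and let the server do the work"
      compress server fast
  }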

You can't make meaningful optimizations until you know what to
optimize for, and you can't know that until you find out, or at
least hypothesize, which resource is saturated.

--

|  | /\
|-_|/  >   Eric Siegerman, Toronto, Ont.        erics AT telepres DOT com
|  |  /
When I came back around from the dark side, there in front of me would
be the landing area where the crew was, and the Earth, all in the view
of my window. I couldn't help but think that there in front of me was
all of humanity, except me.
        - Michael Collins, Apollo 11 Command Module Pilot

