Amanda-Users

Re: speed of amdump

2003-01-26 17:56:07
Subject: Re: speed of amdump
From: "John R. Jackson" <jrj AT purdue DOT edu>
To: dalton <dalton AT utanet DOT at>
Date: Sun, 26 Jan 2003 17:01:39 -0500
>the speed of our hp surestore seems to be ok, but the amdump takes too much time
>  (6 hours for 23 GB, we want to use now client fast compression).

There are several areas that can affect speed, and it's important to
understand the flow of data through Amanda to see where the potential
bottlenecks are:

  * The client dump program reads data from the client disk.

      - How fast is the disk?

      - How fast is the disk -> system connection (e.g. SCSI bus/controller)?

      - Are you backing up lots and lots of little files, which will
        involve more seeking?

  * If you have maxdumps set greater than one, more than one backup may
    be going on at the same time.  If these are backing up separate areas
    on the same physical disk, seeking will slow things down.  You can
    control this with the "spindle" field in the disklist file.

  * If you have client compression enabled, the dump image is run
    through either "compress" or "gzip", depending on how Amanda was
    built.  Using "best" translates to "-9" on gzip, which is known to
    take up to several times as long to compress the data, but with
    minimal improvement in compression ratio (depending on the data,
    of course).

    Also, if maxdumps > 1 and multiple backups are running at the same
    time, that will multiply the system load.

  * The image is then sent over the network to the server.  Are you
    seeing collisions or errors in the network statistics on either side?
    That's often an indication of a duplex mismatch.

    What kind of transfer rate do you get if you do a test ftp of a few
    MBytes from the client to /dev/null on the server?

  * On the server side, if you have compression turned on (which cannot
    also be on if client side compression is enabled), the image is sent
    through a compression program and the same points as on the client
    apply.

  * If there was enough space in the holding disk, the image is written
    there.  The same questions that were asked about the speed of the
    client disk apply here.

    In addition, if you have multiple clients or maxdumps > 1, multiple
    images may be stored in the holding disk at the same time which will
    increase seeking.

  * If there is not enough holding disk space, the image will be written
    directly to tape.  How fast is the tape drive?  How fast is the
    interface to it?

    If you are using software compression on either the client or the
    server, make absolutely certain you do not also have hardware
    compression turned on.  In addition to expanding the amount of tape
    space used, this can slow down the transfer.

    As others have said, how hardware compression is enabled or disabled
    is highly OS specific, and you didn't mention what your server
    is running.
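
To illustrate the point about stacking compression on compression, here
is a minimal Python sketch.  It is only a stand-in: zlib is used for both
passes, whereas a real tape drive uses its own hardware algorithm, and the
sample text is made up.  The idea is simply that a second pass over
already compressed data typically gains nothing and may add a little
overhead.

```python
import random
import zlib


def sample_text(n_words=200_000, seed=0):
    """Build deterministic, moderately compressible text (made-up data)."""
    rng = random.Random(seed)
    vocab = [f"word{i}" for i in range(500)]
    return " ".join(rng.choice(vocab) for _ in range(n_words)).encode()


data = sample_text()
once = zlib.compress(data, 1)    # stands in for software compression
twice = zlib.compress(once, 1)   # stands in for a second (hardware) pass

print(f"original:         {len(data)} bytes")
print(f"compressed once:  {len(once)} bytes")
print(f"compressed twice: {len(twice)} bytes")
```

Compare the last two lines of output on your own data; if the second
figure is no smaller than the first, the second pass is pure cost.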

Note that I'm not asking you to post answers to these questions to
the list.  They are being presented for you to ask yourself, although,
of course, if you continue to have trouble you might want to share
the details.

My guess from what you've said is that the client is doing "best"
compression and that's killing the throughput.  I'd certainly suggest
dropping that back to "fast".  You might also consider which system is
faster, the client or the server, and shift to doing compression on the
server if it is the faster machine and the network can absorb the extra
traffic (the whole image will be transmitted, not the compressed version).
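
If you want a rough feel for the "best" versus "fast" trade-off before
changing anything, a sketch like the following may help.  It uses Python's
zlib rather than the gzip binary Amanda actually runs, assumes "fast"
corresponds to level 1 (the "-9" for "best" is as described above), and
the payload is invented, so treat the numbers as indicative only.

```python
import time
import zlib


def time_level(data, level):
    """Compress data at the given zlib level; return (size, seconds)."""
    start = time.perf_counter()
    out = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(out), elapsed


# Made-up, log-like sample; substitute a slice of a real dump image.
payload = b"".join(
    b"line %d: some log-like text for a backup image\n" % i
    for i in range(200_000)
)

for name, level in (("fast", 1), ("best", 9)):
    size, secs = time_level(payload, level)
    print(f"{name:>4} (level {level}): {size} bytes in {secs:.3f} s")
```

Running it against a sample of your real data on both the client and
the server also gives a hint as to which machine is better placed to do
the compression.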

There are other factors (TCP stream buffer size, routers, bent cable pins,
etc), but the above should be checked out and eliminated first.
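
As a yardstick while checking each stage, it helps to turn the reported
"6 hours for 23 GB" into an average rate.  The sketch below assumes
1 GB = 1024 * 1024 KB and that the full six hours were spent moving data
(estimate time and waiting are ignored), so the real streaming rate may be
somewhat higher.

```python
def throughput_kb_per_s(gigabytes, hours):
    """Average rate in KB/s, taking 1 GB as 1024 * 1024 KB."""
    kilobytes = gigabytes * 1024 * 1024
    return kilobytes / (hours * 3600)


rate = throughput_kb_per_s(23, 6)
print(f"average rate: {rate:.0f} KB/s")  # about 1117 KB/s, roughly 1.1 MB/s
```

Any stage in the list above that cannot comfortably beat that figure on
its own is a candidate for the bottleneck.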

>In amanda.conf:
>netusage 600 Kbps # maximum net bandwidth for Amanda, in KB per sec
>But we have a 100mbit LAN, so there should be 10240 kbps, thats right? Or what
>is the meaning of this entry? Which values do you have?

This entry controls whether another backup may be started on a client.
For instance, if Amanda estimates a currently running backup is going
to use 500 Kbps (based on the estimated size and historical speed
information) and a candidate second backup is going to use 200 Kbps, the
second backup will be held up (500 + 200 > 600).  If the first backup is
estimated to use 250, the second will not be held up due to the bandwidth
(interface) limit (250 + 200 <= 600).
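
The arithmetic in that example can be sketched as below.  This is a
simplified model of the bandwidth test only, not Amanda's actual code,
and the function name is made up for illustration.

```python
def bandwidth_allows(running_kbps, candidate_kbps, limit_kbps):
    """True if the candidate backup fits under the bandwidth limit.

    running_kbps is a list of the estimated rates of backups already
    in progress.  Hypothetical helper, not part of Amanda.
    """
    return sum(running_kbps) + candidate_kbps <= limit_kbps


print(bandwidth_allows([500], 200, 600))  # False: 500 + 200 > 600
print(bandwidth_allows([250], 200, 600))  # True:  250 + 200 <= 600
```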

Note that several things go into deciding whether a second (or third,
etc) backup may be started -- is there enough holding disk space, is there
an available dumper (inparallel), will the client support another backup
(maxdumps), etc.  Network use is just one constraint.

The interface speed value does **not** control how fast Amanda moves data.
Once data starts flowing, the transfer rate is left entirely up to the
OS and hardware.

>In /etc/amanda/interfaces:
>define interface local {
>     comment "a local disk"
>     use 1000 kbps
>}

I use 10000 kbps.  Basically, this is to make sure the interface is not
a limit when dumping local disks (it could even be made much higher).

>define interface eth0 {
>     comment "100 Mbps ethernet"
>     use 10240 kbps
>}

I use 6000 kbps.  No particular reason, except it is somewhat less than
the theoretical bandwidth of 12800 kbps.
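
For anyone checking the numbers, here is where a figure like that comes
from.  The sketch assumes "kbps" in these files means kilobytes per
second (as the comment in the quoted amanda.conf says) and shows both the
1024-based and the 1000-based conversion; protocol overhead is ignored in
either case.

```python
LINK_MBIT = 100  # nominal speed of the Ethernet link in Mbit/s

# 8 bits per byte; the two results differ only in how "kilo" is counted.
binary_kb_per_s = LINK_MBIT * 1024 / 8    # 12800.0
decimal_kb_per_s = LINK_MBIT * 1000 / 8   # 12500.0

print(f"1024-based: {binary_kb_per_s:.0f} KB/s")
print(f"1000-based: {decimal_kb_per_s:.0f} KB/s")
print(f"6000 is {6000 / binary_kb_per_s:.0%} of the 1024-based figure")  # 47%
```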

>Have to be an /etc/amanda/... - directory also on the client site (in our case
>the file server?)

Not sure what you have in /etc/amanda, but if you're asking if you
need configuration information on the client, such as amanda.conf, the
answer is "no".  Clients do not (currently) have configuration files.
Just the program binaries.

>Dalton

John R. Jackson, Technical Software Specialist, jrj AT purdue DOT edu
