Amanda-Users

Re: More ACK Woes

2006-09-12 15:35:25
Subject: Re: More ACK Woes
From: Jon LaBadie <jon AT jgcomp DOT com>
To: amanda-users AT amanda DOT org
Date: Tue, 12 Sep 2006 14:01:28 -0400
On Tue, Sep 12, 2006 at 09:47:39AM -0600, Steven Backus wrote:
> Last night my 2.5.1 backups failed spectacularly:
> 
>   ambiance.med.utah.edu  sdc1                lev 0  FAILED [cannot read header: got 0 instead of 32768]
>   episun7.med.utah.edu   c0t11d0s0           lev 1  FAILED [cannot read header: got 0 instead of 32768]
>   ambiance.med.utah.edu  sdc1                lev 0  FAILED [too many dumper retry: "[request failed: timeout waiting for ACK]"]
>   ambiance.med.utah.edu  sdc1                lev 0  FAILED [cannot read header: got 0 instead of 32768]
   <big snip>
> 
> I've changed common-src/protocol.c to:
> 
> #define ACK_WAIT 100 /* time (secs) to wait for ACK - keep short */
> #define ACK_TRIES 10 /* num retries after ACK_WAIT timeout */
> 
> and still no joy.  I logged in last night and found 12 gtar
> processes running on ambiance with the load average around 15!
> Such strange behavior, I have inparallel set to 6 in my
> amanda.conf, could this have something to do with it?
> 

Refresh our collective minds: is ambiance your server (and a client)
or is it a client only?

When you changed inparallel to 6 (a reduction from the default of 10)
did you also change maxdumps?

My understanding of the two parameters is:

maxdumps refers to the clients -- how many simultaneous dumps
can be running on any one client.  I.e. how many backups can
each client stand to run at the same time.  Typically 1 but
I often set it to 2 unless there are complaints.

inparallel refers to the server -- how many client dumps total
can be running at the same time.  I.e. how many client dumps
can the server cope with streaming into it at the same time.
It has to deal with putting them to the holding disk and possibly
taping one at the same time.  Of course if it also backs up
itself, that is additional load.
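
A minimal sketch of how the two settings might sit in amanda.conf,
assuming inparallel is a global parameter and maxdumps lives inside a
dumptype definition (check the man page for your version).  The
dumptype name "example-user-tar" is hypothetical, and the values are
only illustrative:

```
# server-wide: total dumps the server will accept at once
inparallel 6

define dumptype example-user-tar {
    program "GNUTAR"
    index yes
    # per-client: simultaneous dumps allowed on any one client
    maxdumps 1
}
```

With values like these, the server could still have up to 6 dumps in
flight, but no single client would be asked to run more than 1 at a time.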

I just reread the amanda.conf man page, and if my understanding
is correct, the client/server distinction is not very clear.


For you to have 12 gtar processes running, it sounds like maxdumps
is set to 6 also (raised from 1).  It gives 12 processes because
you have 6 DLEs being dumped and each takes two gtar processes, one
for the dump and one to generate the index.
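
The arithmetic can be sketched as follows.  This is only a model of
the reasoning above, not Amanda code: the function names are made up
for illustration, the figure of 8 DLEs on the client is a hypothetical
value, and the "two gtar processes per dump" rule is taken from the
explanation above.

```python
def max_client_dumps(inparallel, maxdumps, dles_on_client):
    """Upper bound on simultaneous dumps of one client.

    A client cannot run more dumps than the server allows in total
    (inparallel), than its per-client limit (maxdumps), or than it
    has DLEs to dump.
    """
    return min(inparallel, maxdumps, dles_on_client)


def gtar_processes(concurrent_dumps, indexing=True):
    """One gtar per dump, plus one more per dump when indexing."""
    per_dump = 2 if indexing else 1
    return concurrent_dumps * per_dump


# Hypothetical client with 8 DLEs, inparallel 6, maxdumps raised to 6.
dumps = max_client_dumps(inparallel=6, maxdumps=6, dles_on_client=8)
print(gtar_processes(dumps))  # 12

# Same client with maxdumps left at 1.
dumps = max_client_dumps(inparallel=6, maxdumps=1, dles_on_client=8)
print(gtar_processes(dumps))  # 2
```

Under this model, 12 gtar processes on one client fits maxdumps being
6 rather than 1.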

-- 
Jon H. LaBadie                  jon AT jgcomp DOT com
 JG Computing
 4455 Province Line Road        (609) 252-0159
 Princeton, NJ  08540-4322      (609) 683-7220 (fax)
