On Tue, Sep 12, 2006 at 09:47:39AM -0600, Steven Backus wrote:
> Last night my 2.5.1 backups failed spectacularly:
>
> ambiance.med.utah.edu sdc1 lev 0 FAILED [cannot read
> header: got 0 instead of 32768]
> episun7.med.utah.edu c0t11d0s0 lev 1 FAILED [cannot read
> header: got 0 instead of 32768]
> ambiance.med.utah.edu sdc1 lev 0 FAILED [too many dumper
> retry: "[request failed: timeout waiting for ACK]"]
> ambiance.med.utah.edu sdc1 lev 0 FAILED [cannot read
> header: got 0 instead of 32768]
<big snip>
>
> I've changed common-src/protocol.c to:
>
> #define ACK_WAIT 100 /* time (secs) to wait for ACK - keep short */
> #define ACK_TRIES 10 /* num retries after ACK_WAIT timeout */
>
> and still no joy. I logged in last night and found 12 gtar
> processes running on ambiance with the load average around 15!
> Such strange behavior, I have inparallel set to 6 in my
> amanda.conf, could this have something to do with it?
>
Refresh our collective memory: is ambiance your server (and also a client),
or is it a client only?

When you changed inparallel to 6 (a reduction from the default of 10),
did you also change maxdumps?

My understanding of the two parameters is:

  maxdumps refers to the clients -- how many simultaneous dumps
  can be running on any one client, i.e. how many backups
  each client can stand to run at the same time. Typically 1, but
  I often set it to 2 unless there are complaints.

  inparallel refers to the server -- how many client dumps in total
  can be running at the same time, i.e. how many client dumps
  the server can cope with streaming into it at once.
  It has to put them on the holding disk and possibly
  tape one at the same time. Of course, if it also backs
  itself up, that is additional load.

I just reread the amanda.conf man page and, if my understanding
is correct, it does not make the client/server distinction very clear.

For you to have 12 gtar processes running, it sounds like maxdumps
is also set to 6 (raised from 1). That gives 12 processes because
you have 6 DLEs being dumped and each takes two gtars: one for
the dump and one to generate the index.
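
To make that arithmetic concrete, here is a rough sketch in Python. It is only a model of my reading of the two parameters, not anything taken from the Amanda source: the function name, the min() rule, and the assumption that ambiance has 6 DLEs ready to dump are all mine.

```python
def gtar_processes_on_client(inparallel, maxdumps, dles_on_client, gtars_per_dump=2):
    """Estimate gtar processes on one client (hypothetical model, not Amanda code)."""
    # Assumption: concurrent dumps on a client are capped by the server-wide
    # limit (inparallel), the per-client limit (maxdumps), and the number of
    # DLEs that client actually has waiting to be dumped.
    concurrent_dumps = min(inparallel, maxdumps, dles_on_client)
    # Assumption from the discussion above: each running dump uses two gtar
    # processes, one producing the dump and one generating the index.
    return concurrent_dumps * gtars_per_dump


# inparallel 6 and maxdumps 6, with 6 DLEs on the client (assumed figure)
print(gtar_processes_on_client(inparallel=6, maxdumps=6, dles_on_client=6))

# the same client with maxdumps left at 1
print(gtar_processes_on_client(inparallel=6, maxdumps=1, dles_on_client=6))
```

Under this model the first call gives 12, which matches what you saw, and dropping maxdumps back to 1 would leave only 2 gtars on the client regardless of inparallel.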
--
Jon H. LaBadie jon AT jgcomp DOT com
JG Computing
4455 Province Line Road (609) 252-0159
Princeton, NJ 08540-4322 (609) 683-7220 (fax)