Amanda-Users

Re: sendsize finishes, planner doesn't notice...

2007-10-12 10:51:22
Subject: Re: sendsize finishes, planner doesn't notice...
From: Jean-Louis Martineau <martineau AT zmanda DOT com>
To: Paul Lussier <pll AT permabit DOT com>
Date: Fri, 12 Oct 2007 10:43:19 -0400
Paul Lussier wrote:
Jean-Louis Martineau <martineau AT zmanda DOT com> writes:

Why did you never post the error from the amandad debug file?

I thought I had.  I've got etimeout set to 72000, so seeing it time out
near 21000 set off alarms for me.

-------------------------------------------------------------------------------
amandad: time 21603.544: /usr/local/libexec/sendsize timed out waiting
for REP data
amandad: time 21603.781: sending NAK pkt:
<<<<<
ERROR timeout on reply pipe
-------------------------------------------------------------------------------

Amanda has a timeout of 6 hours (21600 seconds).

Can you point me to where in the docs this is mentioned?  I've never
seen this mentioned before (though I wasn't really looking for it) and
I can't seem to find it anywhere right now (running on no sleep and no
caffeine!)

It's not documented. It's not a server limit; it's a client limit we added to be sure amandad will eventually terminate.
You can change it in amanda-src/amandad.c
Change the value of REP_TIMEOUT.
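
The numbers in this thread can be checked with a small sketch. It is an illustration only, not Amanda source code: the function name is hypothetical, and it assumes the simplification that whichever limit is smaller ends the estimate first. The 6-hour value for REP_TIMEOUT is the one stated above.

```python
# Illustrative sketch only; not taken from the Amanda source.
REP_TIMEOUT = 6 * 60 * 60  # 21600 seconds, the client-side limit described above


def effective_estimate_limit(etimeout, rep_timeout=REP_TIMEOUT):
    """Return the number of seconds after which an estimate gives up.

    Assumption (simplified model): the smaller of the configured
    etimeout and the client-side reply timeout wins.
    """
    return min(etimeout, rep_timeout)


print(effective_estimate_limit(72000))  # 21600
```

Under that model, an etimeout of 72000 has no effect past 21600 seconds, which matches the time stamp in the debug output.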

Since the estimate is really slow, you could try calcsize or server.

I had intentionally avoided using either of those because:

 a) I'm trying to set up a new configuration which has no history, and the
    'server' option indicates it needs historical data to estimate with.

 b) I wanted to use 'client' to be as accurate as possible in order to
    create the historical data 'server' requires so I could eventually
    switch to that.

Historical data is built from successful backups. The first estimate will be way off, but it will learn.

You should set a spindle for DLEs on the same physical disk; that can be a lot faster.

I notice that in 'man amanda.conf' for the "estimate" or
"(c,d,e)timeout" parameter there is no mention of what the maximum
timeout is (it must be in here somewhere, I'm just not finding it...)

I set my (e,d)timeout to 72000, or 20 hours. Could there be mention in
the documentation of what the max timeout is (21600) closer to the
various timeout parameters, *or* some kind of warning if amanda.conf
has timeout parameters which are set in excess of compiled in limits?

Also, is there some means of checking the amanda.conf file for these
types of parameter violations?  If not, I could probably come up with
a config-file parser/checker like this (with a little guidance) if
people were interested. My complete ignorance of the code base informs
me: "It's just a simple perl script. No, really!" :)
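
A checker of the kind described above might start out like the sketch below. It is written in Python rather than Perl and is not an existing Amanda tool. The limits table, the function name, and the simple "name value" line format are assumptions for illustration. Only etimeout is checked by default, because the thread does not establish whether the other timeouts are capped in the same way.

```python
import re

# Assumed compiled-in limit, taken from the 21600-second figure in this thread.
ASSUMED_LIMITS = {"etimeout": 21600}

# Matches lines such as "etimeout 72000", with an optional trailing comment.
# Lines with non-numeric or negative values are skipped by this sketch.
_PARAM_RE = re.compile(r"^\s*([A-Za-z_-]+)\s+(\d+)\s*(?:#.*)?$")


def check_timeouts(conf_text, limits=ASSUMED_LIMITS):
    """Return one warning string per parameter that exceeds its limit."""
    warnings = []
    for lineno, line in enumerate(conf_text.splitlines(), start=1):
        match = _PARAM_RE.match(line)
        if not match:
            continue
        name, value = match.group(1).lower(), int(match.group(2))
        limit = limits.get(name)
        if limit is not None and value > limit:
            warnings.append(
                f"line {lineno}: {name} {value} exceeds limit of {limit} seconds"
            )
    return warnings


sample = """\
etimeout 72000   # 20 hours
dtimeout 72000
ctimeout 30
"""
for warning in check_timeouts(sample):
    print(warning)
```

With the sample above, only the etimeout line is reported, since it is the only parameter present in the assumed limits table.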

A solution could be to add an 'etimeout' to amanda-client.conf; amandad could use it instead of REP_TIMEOUT.
Maybe the server could send its own timeout to amandad.
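
The selection logic this proposal implies could look roughly like the sketch below. Nothing here exists in Amanda: the function, the dict-shaped client configuration, and the order of precedence (client setting first, then a server-supplied value, then the compiled-in default) are all assumptions made for illustration.

```python
REP_TIMEOUT = 6 * 60 * 60  # compiled-in default discussed above (21600 seconds)


def reply_timeout(client_conf, server_hint=None):
    """Pick the reply timeout amandad would use under the proposal.

    client_conf: settings read from amanda-client.conf, as a dict
                 (hypothetical representation).
    server_hint: timeout sent by the server, if that idea were implemented.
    """
    if "etimeout" in client_conf:
        return int(client_conf["etimeout"])
    if server_hint is not None:
        return int(server_hint)
    return REP_TIMEOUT  # fall back to the compiled-in limit


print(reply_timeout({"etimeout": "72000"}))  # 72000
print(reply_timeout({}, server_hint=36000))  # 36000
print(reply_timeout({}))                     # 21600
```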