Amanda-Users

Re: amandad: dgram_recv: timeout

2003-01-02 14:44:38
Subject: Re: amandad: dgram_recv: timeout
From: Jean-Louis Martineau <martinea AT iro.umontreal DOT ca>
To: David Raistrick <drais AT wow.atlasta DOT net>
Date: Thu, 2 Jan 2003 14:10:28 -0500
Hi David,

Increase ctimeout and/or etimeout in amanda.conf.

The first and last lines of the amanda.*.debug files give the start and
finish times, which tell you how much time is needed.
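
As a rough sketch of that check, the Python below pulls the "start time"
and "finish time" stamps out of a debug file and prints the difference in
seconds. The function name elapsed_seconds is made up for illustration,
the line layout is assumed to match the selfcheck lines quoted below, and
the %a/%b parsing assumes an English (C) locale.

```python
import re
from datetime import datetime

# Timestamp that follows "start time" or "finish time" in a debug line,
# after whitespace has been collapsed to single spaces.
STAMP = re.compile(
    r"(start|finish) time (\w{3} \w{3} \d{1,2} \d{2}:\d{2}:\d{2} \d{4})"
)


def elapsed_seconds(debug_text):
    """Seconds between the first 'start time' and the last 'finish time'."""
    # Collapse all whitespace so doubled spaces ("Jan  2") and line wraps
    # do not matter.
    flat = " ".join(debug_text.split())
    start = finish = None
    for kind, raw in STAMP.findall(flat):
        # Assumption: English day/month names (C locale).
        when = datetime.strptime(raw, "%a %b %d %H:%M:%S %Y")
        if kind == "start" and start is None:
            start = when
        elif kind == "finish":
            finish = when
    if start is None or finish is None:
        raise ValueError("no start/finish time line found")
    return int((finish - start).total_seconds())


# First and last lines of the selfcheck debug file quoted in this thread.
sample = (
    "selfcheck: debug 1 pid 61064 ruid 2 euid 2 "
    "start time Thu Jan  2 10:58:21 2003\n"
    "selfcheck: pid 61064 finish time Thu Jan  2 10:58:21 2003\n"
)
print(elapsed_seconds(sample))  # prints 0
```

For a real file, pass its contents (for example open(path).read()) and
pick a timeout comfortably above the largest value you see.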

Jean-Louis

On Thu, Jan 02, 2003 at 08:07:00AM -0800, David Raistrick wrote:
> 
> Hey folks.
> 
> I've been trying to solve a problem with amanda for the past few
> months.  Until yesterday it was only a problem on one (out of ~10
> servers) client.  Now it's two!
> 
> Examples from the dump report:
>   newww.gta. / lev 0 FAILED [Request to newww.gta.com timed out.]
>   bento.gta. / lev 0 FAILED [Request to bento.gta.com timed out.]
> 
> 
> bento has had the problem longer.  amcheck reports no errors with
> bento.  Today, amcheck DOES report errors with newww:
> WARNING: newww.gta.com: selfcheck request timed out.  Host down?
> 
> Even though selfcheck...debug seems fine:
> /tmp/amanda%# more selfcheck.20030102105821.debug 
> selfcheck: debug 1 pid 61064 ruid 2 euid 2 start time Thu Jan  2 10:58:21 2003
> /usr/local/libexec/amanda/selfcheck: version 2.4.3b2
> selfcheck: checking disk /var
> selfcheck: device /var
> selfcheck: OK
> selfcheck: checking disk /usr
> selfcheck: device /usr
> selfcheck: OK
> selfcheck: checking disk /home
> selfcheck: device /home
> selfcheck: OK
> selfcheck: checking disk /
> selfcheck: device /
> selfcheck: OK
> selfcheck: pid 61064 finish time Thu Jan  2 10:58:21 2003
> 
> (ran it twice to be sure..same result, same report.)
> 
> The amandad..debug for this ends with:
> 
> amandad: It's not an ack
> amandad: dgram_recv: timeout after 10 seconds
> amandad: waiting for ack: timeout, retrying
> amandad: dgram_recv: timeout after 10 seconds
> amandad: waiting for ack: timeout, retrying
> amandad: dgram_recv: timeout after 10 seconds
> amandad: waiting for ack: timeout, giving up!
> amandad: pid 61063 finish time Thu Jan  2 10:59:11 2003
> 
> ---
> 
> The amandad..debug on the client when amdump runs is similar:
> 
> <clip>
> 
> amandad: sending REP packet:
> ----
> Amanda 2.4 REP HANDLE 009-80350808 SEQ 1041408009
> OPTIONS maxdumps=1;
> / 0 SIZE 46800
> / 1 SIZE 46800
> /home 0 SIZE 547240
> /home 1 SIZE 547240
> /usr 0 SIZE 5118390
> /usr 1 SIZE 5119120
> /usr 2 SIZE 5119120
> /var 0 SIZE 179050
> /var 1 SIZE 179050
> ----
> 
> amandad: dgram_recv: timeout after 10 seconds
> amandad: waiting for ack: timeout, retrying
> amandad: dgram_recv: timeout after 10 seconds
> amandad: waiting for ack: timeout, retrying
> amandad: dgram_recv: timeout after 10 seconds
> amandad: waiting for ack: timeout, retrying
> amandad: dgram_recv: timeout after 10 seconds
> amandad: waiting for ack: timeout, retrying
> amandad: dgram_recv: timeout after 10 seconds
> amandad: waiting for ack: timeout, giving up!
> amandad: pid 13671 finish time Wed Jan  1 03:01:23 2003
> 
> -------
> 
> 
> Help! I'm open to any and all suggestions.
> 
> FWIW, bento and the amanda server are on the same ethernet switch.  newww
> and the amanda server are separated by a firewall (which is, and has been,
> correctly configured; two other servers on the same network as newww
> still back up correctly).
> 
> If you need any specific information from me, let me know and I can
> provide it.  I'm not yet sure what will help you folks help me.:)
> 
> thanks.
> 
> ...david
> 
> 
> 
> 
> 
> ---
> david raistrick
> drais AT atlasta DOT net              http://www.expita.com/nomime.html
> 

-- 
Jean-Louis Martineau             email: martineau AT IRO.UMontreal DOT CA 
Departement IRO, Universite de Montreal
C.P. 6128, Succ. CENTRE-VILLE    Tel: (514) 343-6111 ext. 3529
Montreal, Canada, H3C 3J7        Fax: (514) 343-5834
