Amanda-Users

Re: request failed: timeout waiting for ACK

2006-03-10 07:10:34
Subject: Re: request failed: timeout waiting for ACK
From: Stefan Herrmann <magic99de AT web DOT de>
To: Paul Bijnens <paul.bijnens AT xplanation DOT com>
Date: Fri, 10 Mar 2006 13:07:04 +0100

On 10.03.2006 at 12:16, Paul Bijnens wrote:
On 2006-03-10 11:19, Stefan Herrmann wrote:
hello list,
didn't find a solution for this problem yet, and need one urgently.
this is a summary of what happened:
system is:
FreeBSD pille.hq.imos.net 5.4-RELEASE-p3 FreeBSD 5.4-RELEASE-p3 #0: Sat Jul 2 16:02:43 CEST 2005 root AT pille.hq.imos DOT net:/usr/obj/usr/src/sys/IMOS i386
amanda versions:
server: 2.5.0b2
client: 2.4.5p1

Only 1 system, but two amanda versions...
I presume the client and server both run the same OS version, but are
different machines.

ok, my fault :)

the line above is from the client, the server has:

Linux amanda 2.6.8-2-386 #1 Tue Aug 16 12:46:35 UTC 2005 i686 GNU/Linux

"amstatus daily" tells the following:
pille.hq.imos.net:/     0  252m finished (2:08:23)
pille.hq.imos.net:/opt  1  driver: (aborted:[request failed: timeout waiting for ACK])(too many dumper retry)
pille.hq.imos.net:/usr  0  3505m finished (1:53:08)
pille.hq.imos.net:/var  0  driver: (aborted:[request failed: timeout waiting for ACK])(too many dumper retry)

as you can see, parts of the backup are done, others get aborted. reason is that often the client does not answer the request from the amanda server. that is what i can see from a tcpdump output and
from amandad.*.debug on the client:
[...]
amandad: time 30.004: dgram_recv: timeout after 30 seconds
amandad: error receiving message: timeout
amandad: time 30.004: error receiving message: timeout
amandad: time 30.004: pid 64288 finish time Fri Mar 10 02:05:55 2006

You omitted the useful information just above, but what I can see
is that the client amandad apparently is getting started by (x)inetd,
but when it tries to read the packet, there is nothing.

there was no useful information above, so i omitted it.

Wild guess...
Is your inetd service for amanda configured to "wait" or "nowait"?
It should be "wait".  (xinetd uses syntax "wait = yes".)

yes, your guess is true, it was "nowait", so i changed it to "wait", thanks for that :)
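
For readers who want to check the same setting: with a datagram (UDP) service, "wait" makes inetd hand the socket to the started amandad and hold off on starting another until it exits. The following is a minimal Python sketch, not part of Amanda, that reads inetd.conf-style text and reports the wait/nowait field for the amanda service. The function name, the sample entry, the user name and the amandad path are assumptions for illustration; they vary by installation.

```python
def amanda_wait_mode(inetd_conf_text, service="amanda"):
    """Return the wait/nowait keyword of the first active line for `service`, or None."""
    for line in inetd_conf_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        # inetd.conf fields: service, socket type, protocol, wait/nowait, user, program, args
        if len(fields) >= 4 and fields[0] == service:
            # Some inetd variants append a limit (e.g. "nowait/3"); keep only the keyword.
            return fields[3].split("/")[0].split(".")[0]
    return None


# Hypothetical example entry; the user name and amandad path are assumptions.
SAMPLE = """
# amanda backup client
amanda dgram udp wait amanda /usr/local/libexec/amanda/amandad amandad
"""

if __name__ == "__main__":
    print(amanda_wait_mode(SAMPLE))
```

To check a real client, pass it the contents of the client's inetd.conf, for example `amanda_wait_mode(open("/etc/inetd.conf").read())`. Remember that inetd has to re-read its configuration before a change takes effect.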

You said you also had tcpdump trace.
Run tcpdump both on server and client, and verify if the client indeed
receives what the server sends.  Any router/firewall between them?

server and client are in same network, no firewall between.

i already installed the fix for freebsd for large udp packets, so that should not be the problem.

Large UDP packets occur during estimate only, but you are already
in the dumping phase.  So that should be completely unrelated.

ok

i don't know what else to do, can anyone help?

The "too many dumper retry" error is suspicious too.  Are these dumpers
special (e.g. bypassing the holdingdisk; search for "PORT-WRITE" in the
amdump.X file)?  Is there any other useful info in the amdump.X file on the
server about this problem (out of swap space on the server, etc.)?

Is it always the same DLEs that are failing?

yes, always.
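
One way to confirm that the same DLEs fail on every run is to extract the aborted entries from saved "amstatus" output and compare them between runs. The sketch below is a rough Python illustration under stated assumptions: it only understands lines shaped like the excerpt quoted earlier in this thread (host:disk, level, then a "(aborted:[...])" message), and the names `ABORTED` and `aborted_dles` are made up for this example. Other Amanda versions may print a different layout.

```python
import re

# Matches lines shaped like the amstatus excerpt quoted in this thread:
#   host:/disk  level  driver: (aborted:[reason])...
ABORTED = re.compile(r"^(\S+):(\S+)\s+(\d+)\s+.*\(aborted:\[(.*?)\]\)")


def aborted_dles(amstatus_text):
    """Return (host, disk, level, reason) for every aborted DLE line."""
    found = []
    for line in amstatus_text.splitlines():
        m = ABORTED.match(line.strip())
        if m:
            host, disk, level, reason = m.groups()
            found.append((host, disk, int(level), reason))
    return found


# The four status lines quoted in the original message.
SAMPLE = """\
pille.hq.imos.net:/ 0 252m finished (2:08:23)
pille.hq.imos.net:/opt 1 driver: (aborted:[request failed: timeout waiting for ACK])(too many dumper retry)
pille.hq.imos.net:/usr 0 3505m finished (1:53:08)
pille.hq.imos.net:/var 0 driver: (aborted:[request failed: timeout waiting for ACK])(too many dumper retry)
"""

if __name__ == "__main__":
    for host, disk, level, reason in aborted_dles(SAMPLE):
        print(host, disk, level, reason)
```

Running this over the output of several nights and comparing the disk lists shows whether the failures are tied to particular DLEs (here /opt and /var) or move around.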

ok, i'll wait till tomorrow to see if the inetd.conf change works, thanks for your help!

bye
Stefan Herrmann

