On 2006-05-03 13:26, Francis Galiegue wrote:
The list of filesystems represents 24 GB in total (compressed with gzip). The
problem is this: it works fine when I back up every directory except the two
largest (which are respectively 8.4 GB and 10 GB uncompressed on disk), and
fails when I try to include either of these because _amandad_, not amdump,
times out. I get this in the amandad logfile:
--------------------
amandad: debug 1 pid 6636 ruid 33 euid 33 start time Wed May 3 12:15:04 2006
amandad: version 2.4.2p2
[...]
amandad: waiting for ack: timeout, retrying
amandad: dgram_recv: timeout after 10 seconds
amandad: waiting for ack: timeout, giving up!
amandad: pid 6636 finish time Wed May 3 12:20:06 2006
--------------------
Reproducible at will: amandad always times out after 5 minutes. Meanwhile,
amdump stays there waiting for... Well, I don't know, frankly, but I have to
C-c it and amcleanup afterwards.
What I've already done is increase the etimeout parameter on the server side:
I set it to 1200 instead of the default value, 300. But that didn't help. In
desperation I even changed this value in the old server config files, in case
amandad would try and read them :p But no.
You could run tcpdump or ethereal on both the server and the client to
verify that the packet really arrives there, and with the correct IP
address (the aliases for eth0 could have messed that up).
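
If a packet sniffer is not at hand, the source-address question can also
be checked with a small UDP probe. The following is only a minimal sketch
in Python: the function names, the payload and the choice of port are all
hypothetical, and it does not talk to amandad or inspect Amanda traffic.
It only shows which source address a datagram carries when it arrives.

```python
import socket

def open_listener(bind_addr="127.0.0.1", port=0):
    """Open a UDP socket on the receiving host (port 0 = any free port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    return sock

def send_probe(dest, source_addr=None, payload=b"probe"):
    """Send one datagram to dest, optionally from a specific local address."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        if source_addr is not None:
            # Binding picks which local address (e.g. an eth0 alias) is used.
            sock.bind((source_addr, 0))
        sock.sendto(payload, dest)

def receive_one(sock, timeout=10.0):
    """Return (source_ip, payload) of the next datagram, or None on timeout."""
    sock.settimeout(timeout)
    try:
        data, (source_ip, _source_port) = sock.recvfrom(4096)
    except socket.timeout:
        return None
    return source_ip, data
```

Run open_listener() and receive_one() on one machine and send_probe() on
the other, using a spare port rather than the one amandad listens on. If
the address reported by receive_one() is not the one the other side
expects, the aliases are a likely suspect.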
Is there a firewall in between, or on one or both of the machines?
You may need to increase the UDP-reply timeout on the firewall (or
disable the firewall). I believe many firewalls time out UDP replies
after 180 seconds.
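
To see why a firewall is a plausible suspect, compare the timestamps in
the amandad log above with that 180-second figure. The sketch below is
plain arithmetic in Python; the 180 seconds is only the value guessed at
above, not something measured on this network, and the function names are
made up for illustration. Note that the lifetime includes the final retry
period, so it is an upper bound on when the reply was first sent.

```python
from datetime import datetime

# Assumption: the firewall forgets a UDP exchange after this many seconds.
ASSUMED_FIREWALL_UDP_TIMEOUT = 180

def amandad_lifetime_seconds(start, finish, fmt="%b %d %H:%M:%S %Y"):
    """Seconds between the 'start time' and 'finish time' log stamps.

    Month names are parsed with the current locale (English assumed).
    """
    t0 = datetime.strptime(start, fmt)
    t1 = datetime.strptime(finish, fmt)
    return int((t1 - t0).total_seconds())

def reply_may_be_dropped(lifetime, udp_timeout=ASSUMED_FIREWALL_UDP_TIMEOUT):
    """True if the process lived longer than the assumed UDP timeout."""
    return lifetime > udp_timeout

# Timestamps copied from the log excerpt (weekday omitted).
lifetime = amandad_lifetime_seconds("May 3 12:15:04 2006", "May 3 12:20:06 2006")
print(lifetime, reply_may_be_dropped(lifetime))  # prints: 302 True
```

If the smaller backups finish their estimates well inside that window and
the large ones do not, that would match the symptoms described.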
There are other possible causes and solutions. See:
http://wiki.zmanda.com/index.php/Amdump:_results_missing
In Amanda 2.4.2p2, "calcsize" was not yet implemented (it did
exist, but was experimental, I believe).
If the server is 2.4.4xx, then you can use "estimate server", even
if the client is old.
It should also be noted that the client machine is such a mess that my
predecessor of a sysadmin created 6 aliases for interface eth0... I had to
bind amandad specifically to the address I wanted so that dumps could work in
the first place. But I don't see this having an influence here, since smaller
backups work perfectly...
I'd appreciate any hint on this one!
--
Paul Bijnens, xplanation Technology Services Tel +32 16 397.511
Technologielaan 21 bus 2, B-3001 Leuven, BELGIUM Fax +32 16 397.512
http://www.xplanation.com/ email: Paul.Bijnens AT xplanation DOT com
***********************************************************************
* I think I've got the hang of it now: exit, ^D, ^C, ^\, ^Z, ^Q, ^^, *
* F6, quit, ZZ, :q, :q!, M-Z, ^X^C, logoff, logout, close, bye, /bye, *
* stop, end, F3, ~., ^]c, +++ ATH, disconnect, halt, abort, hangup, *
* PF4, F20, ^X^X, :D::D, KJOB, F14-f-e, F8-e, kill -1 $$, shutdown, *
* init 0, kill -9 1, Alt-F4, Ctrl-Alt-Del, AltGr-NumLock, Stop-A, ... *
* ... "Are you sure?" ... YES ... Phew ... I'm out *
***********************************************************************