2.4.2p2 client, 2.4.4p3 server: timeout from amandad...
2006-05-03 07:29:06
Hello everyone,
I'm in the process of migrating a full backup configuration from one machine
to another. The problem with the old one is threefold:
- it's a mess, and it happens to be our main internal LAN server,
- it's obsolete software-wise,
- the DAT changer attached to it now has insufficient capacity.
It is such a mess that I cannot afford to install a new Amanda version on it
to see whether that would cure the problem. Some dumps are left on the holding
disk, but I can copy those around.
The new machine has a brand new HP DAT72x6 changer (DAT24x6 on the old
one). I've copied the old Amanda configuration to the new machine and adapted
some settings (the tapetype, in essence). Now the configuration looks like this:
--------------------
org "One2team"
mailto "[email protected]"
dumpuser "amanda"
inparallel 4
netusage 5000 Kbps
dumpcycle 0 days
runspercycle 1 days
tapecycle 2 tapes
bumpsize 20 Mb
bumpdays 1
bumpmult 4
etimeout 1200
runtapes 1
tpchanger "/usr/lib/amanda/chg-zd-mtx"
tapedev "/dev/nst0"
changerfile "changer.conf"
changerdev "/dev/sg1"
tapetype HP-DAT72
labelstr "^full-[0-9][0-9]*$"
holdingdisk hd1 {
comment "main holding disk"
directory "/var/lib/amanda/full/dumps"
use -1024 Mb
}
reserve 30
infofile "/var/lib/amanda/full/info"
logdir "/var/lib/amanda/full/logs"
indexdir "/var/lib/amanda/full/index"
define tapetype HP-DAT72 {
comment "Produced by tapetype prog (hardware compression off)"
length 37511 mbytes
filemark 625 kbytes
speed 1758 kps
}
----
The dump types are defined like this (relevant settings only AFAICS):
----
define dumptype global {
comment "Global definitions"
exclude "./tmp"
}
define dumptype root-tar {
global
program "GNUTAR"
comment "root partitions dumped with tar"
compress none
index
exclude list "/etc/amanda/exclude.gtar"
priority low
}
define dumptype comp-root-tar {
root-tar
comment "Root partitions with compression"
compress server fast
}
--------------------
The filesystems in the list represent 24 GB in total (compressed with gzip). The
problem is this: it works fine when I back up every directory except the two
largest (which are respectively 8.4 GB and 10 GB uncompressed on disk), and it
fails when I include either of these, because _amandad_, not amdump, times
out. I get this in the amandad logfile:
--------------------
amandad: debug 1 pid 6636 ruid 33 euid 33 start time Wed May 3 12:15:04 2006
amandad: version 2.4.2p2
amandad: build: VERSION="Amanda-2.4.2p2"
[blah blah]
Amanda 2.4 REQ HANDLE 000-9006B109 SEQ 1146651283
SECURITY USER amanda
SERVICE sendsize
OPTIONS features=fffffeff9ffe0f;maxdumps=1;hostname=crios.olympe.o2t;
GNUTAR /usr/local 0 1970:1:1:0:0:0 -1 exclude-file=./tmp
[more blah]
sending ack:
----
Amanda 2.4 ACK HANDLE 000-9006B109 SEQ 1146651283
----
bsd security: remote host circe.olympe.o2t user amanda local user amanda
amandahosts security check passed
amandad: running service "/usr/lib/amanda/sendsize"
amandad: sending REP packet:
----
Amanda 2.4 REP HANDLE 000-9006B109 SEQ 1146651283
OPTIONS maxdumps=1;
/etc 0 SIZE 5990
/var/named 0 SIZE 40
[goes on and reports the rest]
----
amandad: dgram_recv: timeout after 10 seconds
amandad: waiting for ack: timeout, retrying
amandad: dgram_recv: timeout after 10 seconds
amandad: waiting for ack: timeout, retrying
amandad: dgram_recv: timeout after 10 seconds
amandad: waiting for ack: timeout, retrying
amandad: dgram_recv: timeout after 10 seconds
amandad: waiting for ack: timeout, retrying
amandad: dgram_recv: timeout after 10 seconds
amandad: waiting for ack: timeout, giving up!
amandad: pid 6636 finish time Wed May 3 12:20:06 2006
--------------------
Reproducible at will: amandad always times out after 5 minutes. Meanwhile,
amdump stays there waiting for... Well, I don't know, frankly, but I have to
Ctrl-C it and run amcleanup afterwards.
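The five minutes can be broken down using nothing but the timestamps and timeout lines in the log excerpt above. The sketch below is only arithmetic on that excerpt; it assumes each "dgram_recv: timeout after 10 seconds" line is one full 10-second wait for the server's ACK, and it says nothing about what amandad does internally.

```python
from datetime import datetime

# Timestamps copied from the amandad debug log above.
START = "Wed May 3 12:15:04 2006"
FINISH = "Wed May 3 12:20:06 2006"

# Assumption: each "dgram_recv: timeout after 10 seconds" line is one wait.
ACK_WAIT_SECONDS = 10
ACK_WAITS = 5  # four "retrying" lines plus the final "giving up!"

LOG_TIME_FORMAT = "%a %b %d %H:%M:%S %Y"


def elapsed_seconds(start: str, finish: str) -> int:
    """Whole seconds between two log timestamps."""
    delta = datetime.strptime(finish, LOG_TIME_FORMAT) - datetime.strptime(
        start, LOG_TIME_FORMAT
    )
    return int(delta.total_seconds())


total = elapsed_seconds(START, FINISH)
waiting_for_ack = ACK_WAIT_SECONDS * ACK_WAITS
before_rep = total - waiting_for_ack

print(f"amandad lifetime:         {total} s")
print(f"spent waiting for ACK:    {waiting_for_ack} s")
print(f"spent before sending REP: {before_rep} s")
```

Read this way, most of the run goes by before the REP packet is sent, and the last 50 seconds are spent waiting for an ACK that never arrives.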
What I've already done is increase the etimeout parameter on the server side:
I set it to 1200 instead of the default value, 300. But that didn't help. Out of
desperation I even changed this value in the old server's config files,
in case amandad would try to read them :p But no.
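Going by the amanda.conf man page, etimeout is counted per disk on a given client, and a negative value is taken as a total per client. That is a reading of the documentation, not something visible in the logs above. A minimal sketch of the arithmetic follows; the function name is made up for illustration, and the count of four disks is hypothetical, since the real number of disks on the client is not listed here.

```python
def planner_estimate_wait(etimeout: int, disks_on_client: int) -> int:
    """Seconds planner would wait for one client's estimates.

    Assumed rule (from the amanda.conf man page): a positive etimeout is
    multiplied by the number of disks on the client; a negative etimeout
    is the total for the client regardless of the number of disks.
    """
    if etimeout < 0:
        return -etimeout
    return etimeout * disks_on_client


HYPOTHETICAL_DISKS = 4  # placeholder, not taken from the real disklist

default_wait = planner_estimate_wait(300, HYPOTHETICAL_DISKS)
raised_wait = planner_estimate_wait(1200, HYPOTHETICAL_DISKS)

print(f"etimeout 300:  planner waits up to {default_wait} s")
print(f"etimeout 1200: planner waits up to {raised_wait} s")
```

Under those assumptions both values are well above the five minutes after which amandad gives up, which would be consistent with the change making no difference.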
It should also be noted that the client machine is such a mess that my
predecessor of a sysadmin created 6 aliases for interface eth0... I had to
bind amandad specifically to the address I wanted so that dumps could work in
the first place. But I don't see this having an influence here, since smaller
backups work perfectly...
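To illustrate what binding to one address means for a UDP service, here is a generic socket sketch. It is not Amanda code: the helper name and the REQ/ACK payloads are made up, and loopback stands in for the one eth0 alias that was chosen. A UDP socket bound to a specific address sends its replies from that address, so the peer sees the source it expects.

```python
import socket


def bound_udp_socket(address: str, port: int = 0) -> socket.socket:
    """Create a UDP socket bound to one specific local address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(5.0)  # do not hang forever if a datagram is lost
    sock.bind((address, port))  # port 0 lets the OS pick a free port
    return sock


# Loopback stands in for the single alias the service should answer on.
service = bound_udp_socket("127.0.0.1")
requester = bound_udp_socket("127.0.0.1")

# The requester sends a datagram to the service's address.
requester.sendto(b"REQ", service.getsockname())
request, peer = service.recvfrom(1024)

# The service answers; the reply leaves from the address it is bound to.
service.sendto(b"ACK", peer)
reply, source = requester.recvfrom(1024)

print(f"reply {reply!r} came from {source[0]}")

service.close()
requester.close()
```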
I'd appreciate any hint on this one!
Thanks,
--
Francis Galiegue, fg AT one2team DOT com
One2team - 12bis rue de la Pierre Levée, 75011 PARIS - 0143381980