Amanda-Users

Estimate timeout

2004-06-09 13:02:21
Subject: Estimate timeout
From: "Steven Schoch" <stevenschoch AT hotmail DOT com>
To: amanda-users AT amanda DOT org
Date: Wed, 09 Jun 2004 09:56:23 -0700
Backups of marge had been working for several days, then all of a sudden they stopped and haven't worked since.

Amcheck works fine, but amdump doesn't.

Amdump is run on homer, the system with the tape drive. Homer is a Red Hat Enterprise Linux system running Amanda 2.4.4p1. The system that fails to dump is marge, a FreeBSD system running Amanda 2.4.4p2.

The important lines from amanda.conf:
----
etimeout 1800    # number of seconds per filesystem for estimates.
#etimeout -600   # total number of seconds for estimates.
# a positive number will be multiplied by the number of filesystems on
# each host; a negative number will be taken as an absolute total time-out.
# The default is 5 minutes per filesystem.
----
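
The comment in amanda.conf describes how the etimeout value is interpreted. A minimal sketch of that rule, assuming nothing beyond the wording of the comment (the function name `total_estimate_timeout` is hypothetical and is not part of Amanda; how planner applies the value internally may differ):

```python
def total_estimate_timeout(etimeout: int, num_filesystems: int) -> int:
    """Total seconds allowed for estimates on one host.

    Follows the amanda.conf comment: a positive etimeout is multiplied
    by the number of filesystems on the host; a negative etimeout is
    taken as an absolute total.
    """
    if etimeout < 0:
        return -etimeout
    return etimeout * num_filesystems


# marge has three entries in the disklist (/var, /usr and /).
print(total_estimate_timeout(1800, 3))   # active setting: 1800 * 3
print(total_estimate_timeout(-600, 3))   # commented-out setting: absolute 600
```

By this reading, the active setting allows 5400 seconds in total for marge, far more than the roughly 448 seconds sendsize needed.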

From disklist:
----
marge /var comp-user
marge /usr comp-root
marge / comp-root
----

From crontab:
----
45 0 * * 2-6    /usr/sbin/amdump OurDump
----

In /tmp/amanda on marge, these lines appear in amandad.20040609004501000.debug:
----
amandad: debug 1 pid 22611 ruid 1001 euid 1001: start at Wed Jun 9 00:45:01 2004
amandad: version 2.4.4p2
amandad: build: VERSION="Amanda-2.4.4p2"
...
amandad: time 0.003: got packet:
--------
Amanda 2.4 REQ HANDLE 001-389B0608 SEQ 1086767104
SECURITY USER amanda
SERVICE sendsize
...
amandad: time 0.004: sending ack:
----
Amanda 2.4 ACK HANDLE 001-389B0608 SEQ 1086767104
...
amandad: time 0.009: amandahosts security check passed
amandad: time 0.009: running service "/usr/local/libexec/sendsize"
amandad: time 447.906: sending REP packet:
----
Amanda 2.4 REP HANDLE 001-389B0608 SEQ 1086767104
OPTIONS features=fffffeff9ffe0f;
/var 0 SIZE 11520
/var 1 SIZE 1580
/usr 0 SIZE 1166599
/usr 1 SIZE 18710
/ 0 SIZE 39571
/ 1 SIZE 381
----

amandad: time 457.910: dgram_recv: timeout after 10 seconds
amandad: time 457.910: waiting for ack: timeout, retrying
amandad: time 467.920: dgram_recv: timeout after 10 seconds
amandad: time 467.920: waiting for ack: timeout, retrying
amandad: time 477.930: dgram_recv: timeout after 10 seconds
amandad: time 477.930: waiting for ack: timeout, retrying
amandad: time 487.940: dgram_recv: timeout after 10 seconds
amandad: time 487.941: waiting for ack: timeout, retrying
amandad: time 497.950: dgram_recv: timeout after 10 seconds
amandad: time 497.951: waiting for ack: timeout, giving up!
amandad: time 497.951: pid 22611 finish time Wed Jun  9 00:53:19 2004
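
To make the timing in this log easier to read, here is a small sketch that pulls the timestamps out of the lines above and summarizes the wait for the ACK. The helper name `summarize` and the regular expression are my own illustration, not an Amanda tool, and the parsing assumes the `amandad: time N.NNN: message` layout seen in this one log:

```python
import re

# Timestamped lines copied from the amandad debug log above.
LOG = """\
amandad: time 447.906: sending REP packet:
amandad: time 457.910: dgram_recv: timeout after 10 seconds
amandad: time 457.910: waiting for ack: timeout, retrying
amandad: time 467.920: dgram_recv: timeout after 10 seconds
amandad: time 467.920: waiting for ack: timeout, retrying
amandad: time 477.930: dgram_recv: timeout after 10 seconds
amandad: time 477.930: waiting for ack: timeout, retrying
amandad: time 487.940: dgram_recv: timeout after 10 seconds
amandad: time 487.941: waiting for ack: timeout, retrying
amandad: time 497.950: dgram_recv: timeout after 10 seconds
amandad: time 497.951: waiting for ack: timeout, giving up!
"""

# Assumed line layout: "amandad: time <seconds>: <message>"
LINE_RE = re.compile(r"^amandad: time (\d+\.\d+): (.*)$")


def summarize(log: str) -> dict:
    """Report when the REP was sent, how many ACK waits timed out,
    and when amandad gave up."""
    rep_sent = None
    gave_up = None
    ack_timeouts = 0
    for line in log.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        seconds = float(match.group(1))
        message = match.group(2)
        if message.startswith("sending REP packet"):
            rep_sent = seconds
        elif message.startswith("dgram_recv: timeout"):
            ack_timeouts += 1
        elif "giving up" in message:
            gave_up = seconds
    return {"rep_sent": rep_sent, "ack_timeouts": ack_timeouts, "gave_up": gave_up}


summary = summarize(LOG)
print(summary)
print("seconds spent waiting for an ACK:", summary["gave_up"] - summary["rep_sent"])
```

For this log it counts five 10-second ACK timeouts, so marge waited about 50 seconds after sending the estimates before giving up.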



On homer, these lines appear in amdump.1:
----
amdump: start at Wed Jun  9 00:45:01 PDT 2004
amdump: datestamp 20040609
planner: pid 9813 executable /usr/lib/amanda/planner version 2.4.4p1
planner: build: VERSION="Amanda-2.4.4p1"
...
setup_estimate: marge:/var: command 0, options:
   last_level 0 next_level0 21 level_days 0
   getting estimates 0 (11503) 1 (0) -1 (-1)
planner: time 0.125: setting up estimates for marge:/usr
setup_estimate: marge:/usr: command 0, options:
   last_level 0 next_level0 21 level_days 0
   getting estimates 0 (1163201) 1 (0) -1 (-1)
planner: time 0.135: setting up estimates for marge:/
setup_estimate: marge:/: command 0, options:
   last_level 0 next_level0 21 level_days 0
   getting estimates 0 (39486) 1 (0) -1 (-1)
...
planner: time 223.483: got result for host homer disk /home: 0 -> 4642543K, 4 -> 899568K, -1 -> -1K
planner: time 10801.886: error result for host marge disk /: Estimate timeout from marge
planner: time 10801.886: error result for host marge disk /usr: Estimate timeout from marge
planner: time 10801.886: error result for host marge disk /var: Estimate timeout from marge
planner: time 10801.886: getting estimates took 10801.690 secs



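It looks like homer waited long enough for marge to reply, but the reply was dropped: marge sent its REP packet after about 448 seconds and never received an ACK, while homer did not report the estimate timeout until about 10801 seconds.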
Marge and homer are on the same switch.
--
Steve
