Amanda-Users

Re: Client just started timing out.

2006-02-01 12:11:54
Subject: Re: Client just started timing out.
From: "Ram \"TK\" Krishnamurthy" <tk AT zmanda DOT com>
To: Stephen Carville <carville AT cpl DOT net>
Date: Wed, 01 Feb 2006 09:03:45 -0800
Does changing the timeout on the server have any effect?
Anything in the /tmp/amanda/ logs on the server? Any indications in
syslog of a network burp?

Sorry, more questions than answers.

Thanks
tk


Stephen Carville wrote:
For about a week one of my amanda clients has been timing out during planning. If I go into /tmp/amanda on thames (the client) I see a file like:

$ cat amandad.20060131210039.debug
amandad: debug 1 pid 18530 ruid 250 euid 250: start at Tue Jan 31 21:00:39 2006
amandad: version 2.4.5p1
amandad: build: VERSION="Amanda-2.4.5p1"
amandad:        BUILT_DATE="Mon Jan 30 08:08:00 PST 2006"
amandad:        BUILT_MACH="Linux thames 2.4.7-10 #1 Thu Sep 6 17:27:27 EDT 2001 i686 unknown"
amandad:        CC="gcc"
amandad:        CONFIGURE_COMMAND="'./configure' '--with-user=amanda' '--with-group=adm'"
amandad: paths: bindir="/usr/local/bin" sbindir="/usr/local/sbin"
amandad:        libexecdir="/usr/local/libexec" mandir="/usr/local/man"
amandad:        AMANDA_TMPDIR="/tmp/amanda" AMANDA_DBGDIR="/tmp/amanda"
amandad:        CONFIG_DIR="/usr/local/etc/amanda" DEV_PREFIX="/dev/"
amandad:        RDEV_PREFIX="/dev/" DUMP="/sbin/dump"
amandad:        RESTORE="/sbin/restore" VDUMP=UNDEF VRESTORE=UNDEF
amandad:        XFSDUMP=UNDEF XFSRESTORE=UNDEF VXDUMP=UNDEF VXRESTORE=UNDEF
amandad:        SAMBA_CLIENT="/usr/bin/smbclient" GNUTAR="/bin/gtar"
amandad:        COMPRESS_PATH="/bin/gzip" UNCOMPRESS_PATH="/bin/gzip"
amandad:        LPRCMD="/usr/bin/lpr" MAILER="/usr/bin/Mail"
amandad:        listed_incr_dir="/usr/local/var/amanda/gnutar-lists"
amandad: defs:  DEFAULT_SERVER="thames" DEFAULT_CONFIG="DailySet1"
amandad:        DEFAULT_TAPE_SERVER="thames"
amandad:        DEFAULT_TAPE_DEVICE="/dev/null" HAVE_MMAP HAVE_SYSVSHM
amandad:        LOCKING=POSIX_FCNTL SETPGRP_VOID DEBUG_CODE
amandad:        AMANDA_DEBUG_DAYS=4 BSD_SECURITY USE_AMANDAHOSTS
amandad:        CLIENT_LOGIN="amanda" FORCE_USERID HAVE_GZIP
amandad:        COMPRESS_SUFFIX=".gz" COMPRESS_FAST_OPT="--fast"
amandad:        COMPRESS_BEST_OPT="--best" UNCOMPRESS_OPT="-dc"
amandad: time 0.000: got packet:
--------
Amanda 2.4 REQ HANDLE 005-505AE508 SEQ 1138770050
SECURITY USER amanda
SERVICE noop
OPTIONS features=fffffeff9ffe7f;
--------

amandad: time 0.000: sending ack:
----
Amanda 2.4 ACK HANDLE 005-505AE508 SEQ 1138770050
----

amandad: time 0.002: bsd security: remote host amazon.totalflood.com user amanda local user amanda
amandad: time 0.013: amandahosts security check passed
amandad: time 0.013: running service "noop"
amandad: time 0.013: sending REP packet:
----
Amanda 2.4 REP HANDLE 005-505AE508 SEQ 1138770050
OPTIONS features=fffffeff9ffe7f;
----

amandad: time 0.013: got packet:
----
Amanda 2.4 ACK HANDLE 005-505AE508 SEQ 1138770050
----

amandad: time 0.013: pid 18530 finish time Tue Jan 31 21:00:39 2006

--

By comparison to other machines that do work, nothing in the above looks amiss.

Moving over to the server (amazon), I see in the log:

DISK planner thames /
DISK planner thames /boot
DISK planner thames /export/common
DISK planner thames /export/edi
DISK planner thames /export/netapps
DISK planner thames /export/private
DISK planner thames /export/public
DISK planner thames /var

And then, a few lines later:

FAIL planner thames /var 20060131 0 [Request to thames timed out.]
FAIL planner thames /export/public 20060131 0 [Request to thames timed out.]
FAIL planner thames /export/private 20060131 0 [Request to thames timed out.]
FAIL planner thames /export/netapps 20060131 0 [Request to thames timed out.]
FAIL planner thames /export/edi 20060131 0 [Request to thames timed out.]
FAIL planner thames /export/common 20060131 0 [Request to thames timed out.]
FAIL planner thames /boot 20060131 0 [Request to thames timed out.]
FAIL planner thames / 20060131 0 [Request to thames timed out.]
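
When a log has many such entries, it can help to group them by host and reason. The following is only a rough sketch, assuming the FAIL line layout shown above (host, disk, date, level, bracketed reason); the function name summarize_failures and the regular expression are illustrative and not part of Amanda.

```python
import re
from collections import defaultdict

# Assumed layout, taken from the excerpt above:
#   FAIL planner <host> <disk> <yyyymmdd> <level> [<reason>]
# No line anchors are used, so several entries run together on one
# line are still picked up individually.
FAIL_RE = re.compile(r"FAIL planner (\S+) (\S+) (\d{8}) (\d+) \[([^\]]*)\]")


def summarize_failures(log_text):
    """Map (host, reason) to the list of disks that failed with that reason."""
    summary = defaultdict(list)
    for host, disk, _date, _level, reason in FAIL_RE.findall(log_text):
        summary[(host, reason)].append(disk)
    return dict(summary)


if __name__ == "__main__":
    sample = (
        "FAIL planner thames /var 20060131 0 [Request to thames timed out.]\n"
        "FAIL planner thames /boot 20060131 0 [Request to thames timed out.]\n"
        "FAIL planner thames / 20060131 0 [Request to thames timed out.]\n"
    )
    for (host, reason), disks in summarize_failures(sample).items():
        print(f"{host}: {len(disks)} disk(s) failed: {reason}")
```

If every disk on a host shows the same reason, as here, the problem is more likely with the host or the network path than with any single filesystem.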

However, there is no indication on thames that the planner packet(s) were ever received. Shouldn't I at least see a sendsize.nnnnnn.debug file?
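
One quick way to see which services have left debug files on the client is to group the file names by program. This is a minimal sketch, assuming the files follow the program.timestamp.debug naming seen in amandad.20060131210039.debug and that a sendsize file would be named the same way; group_debug_files is a made-up helper, not an Amanda tool.

```python
import os
import re
from collections import defaultdict

# Assumed naming: <program>.<digits>.debug, as in the amandad file above.
DEBUG_NAME = re.compile(r"^(?P<prog>[A-Za-z0-9_-]+)\.(?P<stamp>\d+)\.debug$")


def group_debug_files(names):
    """Map program name to a sorted list of the timestamps in its debug files."""
    groups = defaultdict(list)
    for name in names:
        match = DEBUG_NAME.match(name)
        if match:
            groups[match.group("prog")].append(match.group("stamp"))
    return {prog: sorted(stamps) for prog, stamps in groups.items()}


if __name__ == "__main__":
    debug_dir = "/tmp/amanda"  # AMANDA_DBGDIR from the build info above
    if os.path.isdir(debug_dir):
        groups = group_debug_files(os.listdir(debug_dir))
        for prog, stamps in sorted(groups.items()):
            print(f"{prog}: {len(stamps)} file(s), latest {stamps[-1]}")
        if "sendsize" not in groups:
            print("no sendsize debug files found")
```

If amandad files exist for the night in question but no sendsize file does, that would be consistent with the estimate request never reaching the client.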

Amazon is running amanda 2.4.5. Thames was running 2.4.3b2 when this all started but is now running 2.4.5p1.

Fortunately, in this case, none of the above is a big problem. Thames is scheduled to be replaced soon and all of the "important" data is being mirrored to its replacement. Nevertheless, it is a bit frustrating to have my backups mysteriously quit working on me.

Suggestions are welcome.


--


Ram "TK" Krishnamurthy

http://www.zmanda.com

