Amanda-Users

Re: Running amcheck, all hosts on one subnet timeout

2006-08-23 06:38:16
Subject: Re: Running amcheck, all hosts on one subnet timeout
From: "Peter Farrell" <peter.d.farrell AT gmail DOT com>
To: amanda-users AT amanda DOT org
Date: Wed, 23 Aug 2006 11:30:19 +0100
I have found the fix: (always more satisfying that way!)
--------------------------------------------------------------
- The tape server master has 3 interfaces (192.168.0.0 / 192.168.2.0 /
192.168.3.0)
- The hosts on the subnet in question all have 2 interfaces (192.168.1.0 /
<in.addr>)
- DNS has my tape server (capella - 192.168.2.11, and an alias
'amanda' for the same address)
- Using tcpdump / ethereal - packets were reaching the hosts on the
subnet, activating amandad but timing out.
- The routing table on the tape server says: for any request to
192.168.1.0 use the gateway of 192.168.0.100 (an internal DNS server),
so requests to that subnet were appearing as requests from the
192.168.0.(n) interface on the tape server, not from 192.168.2.11.
- The hosts on this subnet were trying to resolve that address, to no avail.
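
The routing point above can be checked from the tape server without
sending any Amanda traffic. A minimal Python sketch, assuming a Linux
host with Python 3; source_address_for is a made-up helper name, and
10080 is simply the amanda UDP port:

```python
import socket

def source_address_for(dest_ip, port=10080):
    """Return the local IP the kernel would pick as the source address
    for UDP traffic to dest_ip.

    connect() on a UDP socket sends no packets; it only makes the kernel
    choose a route, and with it an outgoing interface address.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.connect((dest_ip, port))
        return s.getsockname()[0]
```

Called on the tape server with a client address from the 192.168.1.0
subnet, this should return the 192.168.0.(n) address rather than
192.168.2.11 if the routing is as described.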

The fix: I added the IP address of the tape server interface that gets
routed through the gateway to the /etc/hosts file on all servers in
the subnet (they read the file before DNS), and everything works.
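
To confirm the new /etc/hosts entry does its job, something like the
following can be run on a client. It is only a sketch, assuming the
client-side check amounts to a reverse lookup followed by a forward
confirmation; check_peer_resolves is a made-up name, and the address
passed in would be the tape server's 192.168.0.(n) address.

```python
import socket

def check_peer_resolves(ip,
                        reverse=socket.gethostbyaddr,
                        forward=socket.gethostbyname_ex):
    """Reverse-resolve ip, then confirm the name maps back to the same ip.

    Returns (ok, detail). Uses whatever the system resolver is configured
    to consult, typically /etc/hosts before DNS.
    """
    try:
        name, _aliases, _addrs = reverse(ip)
    except OSError as exc:  # socket.herror and socket.gaierror are OSErrors
        return False, f"no reverse entry for {ip}: {exc}"
    try:
        _name, _aliases, addrs = forward(name)
    except OSError as exc:
        return False, f"{name} does not resolve forward: {exc}"
    if ip not in addrs:
        return False, f"{name} resolves to {addrs}, not {ip}"
    return True, f"{ip} <-> {name}"
```

The reverse and forward parameters exist only so the function can be
exercised without touching the real resolver.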

-Peter Farrell





On 23/08/06, Peter Farrell <peter.d.farrell AT gmail DOT com> wrote:
Hello all:

Problem:
Running amcheck, all hosts on one subnet timeout:
"<host> selfcheck request failed: timeout waiting for ACK"

Things I thought were odd: (common to all hosts in this subnet)
1. none of them ever created the /tmp/amanda folder or any debug info.
1a. I created them by hand and managed to get some meager output.
2. none of the amandad interaction ever announces
"amandad: time (n): got packet" rather, amandad is timing out w/ the
last statement: "amandad: time 30.066"

Compile Options:
--------------------
'./configure'
'--prefix=/usr/local/amanda'
'--with-user=amanda'
'--with-group=backup'
'--with-debugging'
'--with-config=daily'
'--without-server'
'--with-index-server=amanda'
'--with-tape-server=amanda'

- All hosts are Fedora Core -- all kernels are 2.4 & 2.6
- All versions client /server are Amanda-2.5.0p2-2006052
- All network settings / daemons seem to be set up.
- Firewalls are turned off when testing.
- Timeouts set in /proc are 7200.
- Backing up 250MB or less on each machine.
- Can ping both ways.
- Can log into Amanda Server from every host via 'amrecover'
- Routing tables are correct for everyone involved.
- /var/named is correct on all DNS servers
- nslookup / dig resolve correctly.
- Compiled as 'amanda', made as 'root' - all files and directories
exist and have correct permissions.
- "amanda" user can write to /tmp
- tape server is 2 subnets and '20 metres' of cable away.
- *Have not yet used tcpdump to track packets - but am assuming
(always a bad idea) that since I can ping and use 'amrecover' that
things are peachy.

---       SERVER CONFIGS / DEBUGS        ---

Amanda Backup Client Hosts Check
--------------------------------
WARNING: zeus.example.com: selfcheck request failed: timeout waiting for ACK
WARNING: venus.example.com: selfcheck request failed: timeout waiting for ACK
WARNING: andromeda.example.com: selfcheck request failed: timeout waiting for ACK
WARNING: proxima.example.com: selfcheck request failed: timeout waiting for ACK
WARNING: pluto.example.com: selfcheck request failed: timeout waiting for ACK
WARNING: pollux.example.com: selfcheck request failed: timeout waiting for ACK
Client check: 21 hosts checked in 30.729 seconds, 6 problems found

(brought to you by Amanda 2.5.0p2)


---       CLIENT CONFIGS / DEBUGS        ---

=================================================================

amandad: debug 1 pid 11723 ruid 705 euid 705: start at Tue Aug 22 16:31:43 2006
amandad: version 2.5.0p2-20060525
amandad: build: VERSION="Amanda-2.5.0p2-20060525"
amandad:        BUILT_DATE="Tue Aug 22 15:50:20 BST 2006"
amandad:        BUILT_MACH="Linux proxima.example.com 2.4.20-30.9.um.1
#1 Thu Feb 19 11:44:47 JST 2004 i686 i686 i386 GNU/Linux"
amandad:        CC="gcc"
amandad:        CONFIGURE_COMMAND="'./configure'
'--prefix=/usr/local/amanda' '--with-user=amanda'
'--with-group=backup' '--with-debugging' '--with-config=daily'
'--without-server' '--with-index-server=amanda'
'--with-tape-server=amanda'"
amandad: paths: bindir="/usr/local/amanda/bin"
amandad:        sbindir="/usr/local/amanda/sbin"
amandad:        libexecdir="/usr/local/amanda/libexec"
amandad:        mandir="/usr/local/amanda/man" AMANDA_TMPDIR="/tmp/amanda"
amandad:        AMANDA_DBGDIR="/tmp/amanda"
amandad:        CONFIG_DIR="/usr/local/amanda/etc/amanda"
amandad:        DEV_PREFIX="/dev/" RDEV_PREFIX="/dev/" DUMP="/sbin/dump"
amandad:        RESTORE="/sbin/restore" VDUMP=UNDEF VRESTORE=UNDEF
amandad:        XFSDUMP=UNDEF XFSRESTORE=UNDEF VXDUMP=UNDEF VXRESTORE=UNDEF
amandad:        SAMBA_CLIENT="/usr/bin/smbclient" GNUTAR="/bin/gtar"
amandad:        COMPRESS_PATH="/bin/gzip" UNCOMPRESS_PATH="/bin/gzip"
amandad:        LPRCMD="/usr/bin/lpr" MAILER="/usr/bin/Mail"
amandad:        listed_incr_dir="/usr/local/amanda/var/amanda/gnutar-lists"
amandad: defs:  DEFAULT_SERVER="amanda" DEFAULT_CONFIG="daily"
amandad:        DEFAULT_TAPE_SERVER="amanda" DEFAULT_TAPE_DEVICE="null:"
amandad:        HAVE_MMAP HAVE_SYSVSHM LOCKING=POSIX_FCNTL SETPGRP_VOID
amandad:        DEBUG_CODE AMANDA_DEBUG_DAYS=4 BSD_SECURITY RSH_SECURITY
amandad:        USE_AMANDAHOSTS CLIENT_LOGIN="amanda" FORCE_USERID HAVE_GZIP
amandad:        COMPRESS_SUFFIX=".gz" COMPRESS_FAST_OPT="--fast"
amandad:        COMPRESS_BEST_OPT="--best" UNCOMPRESS_OPT="-dc"
amandad: time 30.066: pid 11723 finish time Tue Aug 22 16:32:13 2006

===================================================================

entries from /etc/hosts:
----------------------------
192.168.2.11    amanda.example.com         amanda
192.168.2.11    capella.example.com        capella

===================================================================

routing table is fine:
------------------------
[amanda@proxima amanda]$ netstat -rn
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
203.148.137.0   0.0.0.0         255.255.255.224 U         0 0          0 eth0
192.168.2.0     192.168.1.3     255.255.255.0   UG        0 0          0 eth1
192.168.1.0     0.0.0.0         255.255.255.0   U         0 0          0 eth1
192.168.0.0     192.168.1.3     255.255.255.0   UG        0 0          0 eth1
169.254.0.0     0.0.0.0         255.255.0.0     U         0 0          0 eth1
127.0.0.0       0.0.0.0         255.0.0.0       U         0 0          0 lo
0.0.0.0         212.140.130.1   0.0.0.0         UG        0 0          0 eth0
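
For anyone reading the table above by eye, a small Python sketch of the
longest-prefix match the kernel applies to it. The routes are copied
from the netstat output; ROUTES and lookup are made-up names, and
192.168.0.7 in the usage note stands in for the tape server's
192.168.0.(n) address, which is not given here.

```python
import ipaddress

# (destination, genmask, gateway, iface) copied from the netstat -rn output
ROUTES = [
    ("203.148.137.0", "255.255.255.224", "0.0.0.0", "eth0"),
    ("192.168.2.0", "255.255.255.0", "192.168.1.3", "eth1"),
    ("192.168.1.0", "255.255.255.0", "0.0.0.0", "eth1"),
    ("192.168.0.0", "255.255.255.0", "192.168.1.3", "eth1"),
    ("169.254.0.0", "255.255.0.0", "0.0.0.0", "eth1"),
    ("127.0.0.0", "255.0.0.0", "0.0.0.0", "lo"),
    ("0.0.0.0", "0.0.0.0", "212.140.130.1", "eth0"),
]

def lookup(dest):
    """Return (gateway, iface) of the most specific route matching dest."""
    addr = ipaddress.ip_address(dest)
    best = None
    for net, mask, gateway, iface in ROUTES:
        network = ipaddress.ip_network(f"{net}/{mask}")
        if addr in network:
            # Keep the match with the longest prefix (most specific route).
            if best is None or network.prefixlen > best[0].prefixlen:
                best = (network, gateway, iface)
    # The 0.0.0.0/0 default route matches everything, so best is never None.
    return best[1], best[2]
```

lookup("192.168.2.11") and lookup("192.168.0.7") both give
("192.168.1.3", "eth1"), so the client can reach either tape server
interface. On that reading, the table only settles reachability and
says nothing about whether the client can put a name to the address
the packets arrive from.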

===================================================================

can ping tape server:
--------------------------
[amanda@proxima amanda]$ ping amanda
PING amanda.example.com (192.168.2.11) 56(84) bytes of data.
64 bytes from amanda.example.com (192.168.2.11): icmp_seq=1 ttl=63 time=2.85 ms
64 bytes from amanda.example.com (192.168.2.11): icmp_seq=2 ttl=63 time=0.277 ms

===================================================================

can 'amrecover' to the tape server:
-------------------------------------------
[root@proxima sbin]# ./amrecover
AMRECOVER Version 2.5.0p2-20060525. Contacting server on amanda ...
220 capella AMANDA index server (2.5.0p2) ready.
200 Access OK
Setting restore date to today (2006-08-22)
200 Working date set to 2006-08-22.
Scanning /data/amanda_holding_disk...
200 Config set to daily.
...
...

===================================================================

netstat shows listening of correct ports:
------------------------------------------------
[root@proxima sbin]# netstat -nlp --udp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
udp        0      0 0.0.0.0:10080           0.0.0.0:*                           20343/xinetd

===================================================================

chkconfig shows amanda 'on':
-----------------------------------
[root@proxima sbin]# chkconfig --list | grep -w amanda
        amanda: on

===================================================================

xinetd looks good:
----------------------
[root@proxima sbin]# more /etc/xinetd.d/amanda
# default: on
# description:  The client for the Amanda backup system.\
#               This must be on for systems being backed up\
#               by Amanda.
service amanda
{
        disable                 = no
        socket_type             = dgram
        protocol                = udp
        wait                    = yes
        user                    = amanda
        group                   = backup
        groups                  = yes
        server                  = /usr/local/amanda/libexec/amandad
}

===================================================================

