Hi,
This is confusing me a bit; I hope someone here has an idea of what might be
happening.
I have been running an amanda server, backing itself up + one other amanda
client (jayhawker), for about a week now. It works great every night when I
have the amdump run. Yesterday I added a third amanda client, "bcc1". bcc1 and
jayhawker are both fresh RedHat 9 installs.
I configured bcc1 exactly the same way as jayhawker, with the same entries in
hosts.allow / hosts.deny / xinetd.conf / /home/amanda/.amandahosts etc. Both
mybox (by name and by IP just in case) and localhost (by localhost /
localhost.localdomain / 127.0.0.1) are allowed in hosts.allow for the user
amanda... I know a lot of that is redundant but I wanted to be 100% sure I
allowed the right things, since at least the .amandahosts file has been a bit
picky. Also of course my amanda server "mybox" is set up ok in /etc/hosts.
At first "mybox" could back up bcc1 just fine. I ran amcheck and there were 0
problems in 3 clients found. The first amdump worked yesterday afternoon. Then
overnight amdump ran from cron and was unable to connect to bcc1. Actually 95%
of the DLEs were backed up ok but /usr on bcc1 failed:
192.168.2. /usr lev 0 FAILED 20030530[could not connect to 192.168.2.200]
This morning after reading that in the amanda report, I ran amcheck and it said
selfcheck host down when trying bcc1. Just to see if I could get to it, I
tried "ping bcc1" which started pinging the right IP immediately, no problems
at all. I ran amcheck again without changing anything else and it found 0
problems. Then I ran amdump and somehow by the time it had finished, the
problem came back, because *all* of the DLEs had FAILED messages saying could
not connect. I had to leave the building for a bit, and when I came back,
amcheck repeatedly says host down, even after I ping bcc1 (which still works
great). I checked the /var/log/secure and /var/log/messages but I don't see
anything strange at all, as far as I can tell. The amanda service is still
running on the client and nothing has changed in the firewalls etc. I double
checked all the things mentioned in the FAQ but everything seems to still be
set up just fine.
My disklist file on the server uses "bcc1" for the client name, but just for
kicks I tried changing it to the client's IP, and now it's saying that IP won't
do a selfcheck either. Also, my timeouts are set to at least 30 seconds in
amanda.conf, and amcheck waits a good long while before giving up. The boxes
are all on a LAN, so when amcheck works it usually takes less than a second
to finish.
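One caveat about the ping test: ping only exercises ICMP, while amcheck reaches
the client over the amanda UDP service (port 10080 by default), so a working
ping does not prove that path is open. Below is a rough Python sketch of a
check that could be run from the server whenever amcheck reports the host down.
It is not part of Amanda; the function names resolve and probe_udp are made up
for illustration, and the port number assumes the stock "amanda" service entry.
Note that the empty datagram may cause xinetd to start amandad briefly on the
client.

```python
import socket

# Assumption: the client uses the standard "amanda" UDP service port.
AMANDA_UDP_PORT = 10080


def resolve(host):
    """Return (ip, reverse_name); either may be None if a lookup fails."""
    try:
        ip = socket.gethostbyname(host)
    except OSError:
        return None, None
    try:
        reverse_name = socket.gethostbyaddr(ip)[0]
    except OSError:
        reverse_name = None
    return ip, reverse_name


def probe_udp(host, port=AMANDA_UDP_PORT, timeout=2.0):
    """Send one empty UDP datagram and classify the outcome.

    "refused"  : an ICMP port-unreachable came back (nothing is listening)
    "reply"    : some datagram came back
    "no-reply" : silence within the timeout (listener stayed quiet, or a
                 firewall dropped the packet)
    "error"    : any other socket failure (e.g. the name did not resolve)
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        try:
            s.connect((host, port))
            s.send(b"")
            s.recv(1024)
            return "reply"
        except ConnectionRefusedError:
            return "refused"
        except socket.timeout:
            return "no-reply"
        except OSError:
            return "error"
```

Calling resolve("bcc1") and probe_udp("bcc1") during a good period and again
during a bad one, and comparing the results, might show whether name
resolution or the UDP path is what changes. A "refused" result would suggest
xinetd has stopped listening for the service on the client.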
Any ideas of why a client would work for a while then randomly not be able to
do a selfcheck? The other amanda client is still working great...
Thanks!
_______________________
Jeremy Martin
Network Technician
http://www.gsi-kc.com
mailto:jmartin AT gsi-kc DOT com