Amanda-Users

Weird, inconsistent amcheck timeouts

2009-03-23 13:16:26
Subject: Weird, inconsistent amcheck timeouts
From: Clint Allen <callen AT kpi-consulting DOT net>
To: amanda-users AT amanda DOT org
Date: Mon, 23 Mar 2009 11:29:54 -0500 (CDT)
Hello all, I've been reading the list for a few weeks now, while getting my 
first Amanda installation set up.  I love the software, but (you knew that was 
coming) I'm having a lot of trouble with amcheck/amdump timing out waiting on 
client ACKs.  I've done the troubleshooting steps on the wiki 
(http://wiki.zmanda.com/index.php/Selfcheck_request_failed) several times and 
managed to break it in other ways, but always come back to this timeout issue.

Initially I had it set up on a small test network: one server, three clients,
everything on the same switch.  I made a vtapelib on disk and let the dumps go 
for a couple weeks, did some restores, just to get familiar with things before 
moving it to our colo.  Everything worked flawlessly.

I moved the server to the colo, got a disklist set up with 15 machines, and
enabled the daily cron job.  Everything seemed OK until a few hours later, when
the 'amcheck -m daily' job ran and reported several timeouts (including
localhost, which I thought was odd).  As I mentioned earlier, I went through all
the troubleshooting I could find on the wiki and on Google; I've spent about two
weeks on this so far.  I can't tell if it's something up with the network at
the colo, but I can get these two things to happen consistently:

1. When I run 'amcheck daily', there are _almost_ always timeouts, but 
sometimes not.  It's very sporadic.
2. When I run 'amcheck daily [hostname]', it works EVERY time.  I even set up a 
cron job to run amcheck individually for each client, every 15 minutes, and it
never failed (left it running for about 48 hours).  This is the thing that 
stumps me the most.
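
For illustration, the per-client check from item 2 could be sketched roughly as
below.  This is not the actual cron job, just a minimal Python sketch under
some assumptions: the disklist uses simple one-line entries (hostname first, no
multi-line brace blocks), and the helper names hosts_from_disklist,
amcheck_commands and run_checks are made up for this example.  The only Amanda
invocation it relies on is the 'amcheck daily [hostname]' form mentioned above.

```python
import subprocess


def hosts_from_disklist(text):
    """Return unique hostnames, in order of first appearance.

    Assumes simple one-line disklist entries where the hostname is the
    first whitespace-separated field; '#' starts a comment.
    """
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and padding
        if not line:
            continue
        host = line.split()[0]
        if host not in hosts:
            hosts.append(host)
    return hosts


def amcheck_commands(config, hosts):
    """Build one 'amcheck <config> <host>' argument list per host."""
    return [["amcheck", config, host] for host in hosts]


def run_checks(config, hosts):
    """Run amcheck for one host at a time and collect each exit status."""
    results = {}
    for cmd in amcheck_commands(config, hosts):
        # Sequential on purpose: only one client is being checked at a time.
        results[cmd[-1]] = subprocess.run(cmd).returncode
    return results
```

A wrapper run from cron as the Amanda user could read the disklist (its
location depends on the installation), pass the text to hosts_from_disklist,
and then call run_checks("daily", hosts), reporting any host whose exit status
is not 0.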

The only thing I can think to do is use a similar cron job to run amdump...but 
I'd really rather use the software as intended.  If you guys have any ideas, 
I'm open.  I've tried everything I can think of.  If you need debug output, 
etc. I will be happy to post it.

--------------------------------------
Clint Allen
Systems Administrator - KPI Consulting
512-218-1001 ext. 617