Re: hosts timing out on amdump but not amcheck

Regarding the timeouts, look in the logs on the client, for sendsize*debug (I think) -- see how long it takes to get the estimate for the various drives. Then, check to see that your etimeout is large enough...
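
For what it's worth, here is a rough sketch of that comparison in Python. It assumes the etimeout semantics described in amanda.conf(5) as I understand them: a positive value is a per-disk allowance that the planner multiplies by the number of disks on the client, and a negative value is a total allowance per client. Please check that against the man page for your version. The function names and the sample durations are made up for illustration; the real durations would have to be read off the sendsize debug files by hand.

```python
from typing import Sequence


def estimate_budget(etimeout: int, num_disks: int) -> int:
    """Seconds the planner would wait for one client.

    Assumed semantics: positive etimeout is per disk, negative
    etimeout is a total for the whole client.
    """
    if etimeout < 0:
        return -etimeout
    return etimeout * num_disks


def estimates_fit(durations: Sequence[float], etimeout: int) -> bool:
    """True if the measured estimate times fit within the budget.

    Pessimistic assumption: the estimates run one after another,
    so their durations are summed.
    """
    return sum(durations) <= estimate_budget(etimeout, len(durations))


if __name__ == "__main__":
    # Hypothetical per-disk estimate times in seconds, not real data.
    sample = [120.0, 900.0, 450.0]
    for etimeout in (300, 600):
        print(etimeout, estimate_budget(etimeout, len(sample)),
              estimates_fit(sample, etimeout))
```

If the check comes back false for a client, that would suggest raising etimeout in amanda.conf before looking for other causes.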


HTH,
Ricky


On Friday, March 7, 2003, at 12:20  PM, justin m. clayton wrote:

Thanks for the help. Unfortunately, this has not proved to be the solution to my problem. Suddenly, however, one of the machines (still on autoneg, btw) began working, without any warning or me touching it. All others are still timing out on amdump (though amcheck still works).

--Justin

On Thu, 13 Feb 2003, Amanda Admin wrote:

Justin,

This sounds like the symptoms others on the amanda list have attributed to half/full duplex network interface and/or switch problems.

I seem to recall a posting just recently saying that (the built-in eth interface on ??) Solaris had a particular affinity towards incorrect duplex detection. Maybe a search of the archives on this topic will turn up some details.

Doug

-----Original Message-----
From: owner-amanda-users AT amanda DOT org
[mailto:owner-amanda-users AT amanda DOT org]On Behalf Of justin m. clayton
Sent: Thursday, February 13, 2003 2:37 PM
To: Joshua Baker-LePain
Cc: amanda-users AT amanda DOT org
Subject: Re: hosts timing out on amdump but not amcheck


On Thu, 13 Feb 2003, Joshua Baker-LePain wrote:

On Thu, 13 Feb 2003 at 9:15am, justin m. clayton wrote

First of all, thanks to all who helped me track down my NAK problems from last week. Having fixed that, all backup hosts pass amcheck with flying colors. However, when it comes time for the amdump, my log report claims "Request to <host> timed out" when I return the following morning. However, if I run amcheck again, no hosts report problems.

This has been going on for a number of days now. I am getting "Read error at byte 0...:Bad file number" on some hosts (via /tmp/amanda/sendsize.*), and some are reporting "amandad: dgram_recv: timeout after 10 seconds
amandad: waiting for ack: timeout, retrying" (via /tmp/amanda/amandad.*).

Strangely, though, the symptom is the same for all machines.


What OS/distro?  Are there firewalls in the way?

The clients are Solaris 8, the server is stable Debian Linux, both using 2.4.2p2. No firewalls in the way. This configuration has worked in the past.

Justin Clayton
VLSI Research System Administrator
University of Washington
Electrical Engineering Dept
justincl AT u.washington DOT edu
206/543.2523  EE/CSE 307E

