Amanda-Users

Re: Crashing machine

2003-09-18 17:38:38
Subject: Re: Crashing machine
From: Jim Summers <jsummers AT bachman.cs.ou DOT edu>
To: "Amanda Users (E-mail)" <amanda-users AT amanda DOT org>
Date: 18 Sep 2003 16:14:34 -0500
On Thu, 2003-09-18 at 15:42, Brashers, Bart -- MFG, Inc. wrote:
> I've been using amanda-2.4.2p2 for a long time now, without problems.  In
> the last week or so, my Linux (2.4.20) machine has been crashing, apparently
> when amanda runs.  I see in the various logs in /var/log when amanda (e.g.
> xinetd in /var/log/secure with user amanda, from 127.0.0.1) and then nothing
> until the restart the next morning when I restart the computer.  

Bummer. I had a situation once where my backups all of a sudden began
failing on large filesystems.  Fortunately I caught a message in the log
files that pointed me to the NIC.

> 
> The real kicker was just now when I ran amflush (after amcleanup) to flush
> the last failed dump to the disk.  The system panicked after just a few
> minutes, with the "Machine check exception (kernel panic: cpu context
> corrupt)" error.  That usually happens when the system is too hot, or you
> have a bad motherboard, or something.  This machine has been in operation
> for about 6 months, so it's probably not the MB.  It's not that hot in the
> room, and I checked that the fins on the CPU fan weren't clogged with dust.
> 

Can you use lm_sensors to monitor the internal temps?  It helped me find
a problem on a node in our cluster.  The node would hum along fine until
it got a fairly CPU-intensive job, then bam, it would hang, with no log
messages either.
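
A rough sketch of that kind of monitoring, in case it is useful.  It
assumes lm_sensors is installed and configured so that the `sensors`
command prints lines such as "CPU Temp: +45.0°C"; the exact labels and
layout vary by chip and lm_sensors version, so the regex is a guess you
may need to adjust.  The function names and the log path are made up
for illustration.  Each reading is synced to disk right away, so the
last temperature before a hang should survive the crash.

```python
import os
import re
import subprocess
import time

# Assumed line shapes: "CPU Temp:  +45.0°C  (high = +80.0°C)" or "temp1: +45.0 C".
# Fan (RPM) and voltage (V) lines do not end in "C" and are skipped.
TEMP_LINE = re.compile(
    r"^[ \t]*([^:\n]+):[ \t]*([+-]?\d+(?:\.\d+)?)[ \t]*°?[ \t]*C\b",
    re.MULTILINE,
)

def parse_temps(text):
    """Return {label: degrees Celsius} for each temperature line found."""
    return {label.strip(): float(value) for label, value in TEMP_LINE.findall(text)}

def log_temps(path, interval=10):
    """Append a timestamped reading to `path` every `interval` seconds.

    Runs until interrupted.  flush() plus fsync() pushes each line to
    disk so it is not lost in a buffer when the machine goes down.
    """
    with open(path, "a") as log:
        while True:
            out = subprocess.run(
                ["sensors"], capture_output=True, text=True
            ).stdout
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log.write(f"{stamp} {parse_temps(out)}\n")
            log.flush()
            os.fsync(log.fileno())
            time.sleep(interval)
```

Start it before the backup run with something like
log_temps("/var/tmp/temps.log") and check the tail of the file after
the next crash.  If the last few readings climb sharply, heat is a
likely suspect; if they stay flat, look elsewhere.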

Hope this helps.


> Any ideas here?  Anyone heard of such a thing?  Am I barking up the wrong
> tree thinking that amanda might be responsible for my crashes?  It's a real
> pain, not being able to run stuff at night (and not having backups makes me
> nervous).
> 
> Bart
> --
> Bart Brashers, Ph.D.
> Air Quality Meteorologist
> MFG Inc.
> 19203 36th Ave W Suite 101
> Lynnwood WA 98036-5707
> 
> bart.brashers AT mfgenv DOT com
> Phone: 425.921.4000
> Fax:   425.921.4040
-- 
Jim Summers <jsummers AT cs.ou DOT edu>
University of Oklahoma - Computer Science

