Amanda-Users

Re: Diagnosing an elusive fault on a critical system [long]

2002-08-19 17:10:24
From: Frank Smith <fsmith AT hoovers DOT com>
To: Jonathan Johnson <Jonathan.Johnson AT MinnetonkaSoftware DOT com>
Date: Mon, 19 Aug 2002 15:54:14 -0500
If I read your crontab comments correctly, you run amanda every night
but only have a tape in the drive on Friday nights?  If so, and if
the Friday night/Sat. AM run is the only time you have crashes, then
I would lean towards either a power supply problem or a SCSI driver
problem, although bad RAM, as Edwin Hakkennes suggested, is also a
possibility.
Since bad power can manifest itself as all kinds of strange problems,
you might want to check your voltages under load before you get too
deep into hardware swapping.

Frank

--On Monday, August 19, 2002 15:07:54 -0500 Jonathan Johnson <Jonathan.Johnson AT minnetonkasoftware DOT com> wrote:

Dear Frank (et al.),

On Mon, 19 Aug 2002 14:16:22 -0500, Frank Smith <fsmith AT hoovers DOT com>
wrote:

 > I doubt it is an Amanda problem (you might want to also try the
 > linux-managers mailing list <http://www.linuxmanagers.org/> ), but
 > I'll toss out some suggestions of things to look at anyway:

Thanks for the mail list tip -- I've subscribed and will try that venue
as well.

 > If this really is going to be an 'omni-server', 128M seems a little
 > small.  Probably not your crash problem unless you're filling up
 > swap, which you seem to have enough of.

Some day I hope to upgrade the RAM, which is why I partitioned an
amount of swap space (> 512 MB) that is pretty ridiculous for a system
with only 128 MB.

 > The 300W power supply may also be too small, especially if your tape
 > drives are internal.

I'm trying to figure out what the potential aggregate power consumption
of the system's components might be -- there are surprisingly few
technical specifications that state this, though!  :(

 > It could be the kernel.  We have had serious issues with the virtual
 > memory manager in a few of the mid-2.4 series, although the earlier
 > and later versions worked fine.

Perhaps I'll go with the latest RH 7.2 updated version then and try
patching it myself.  Is 2.4.9-34 an improvement on 2.4.9-31, though?

I've tried to remain RPM-based as much as possible to make life simpler
for everyone; I even used the amanda RPMs that came with RH 7.2 (and
I'd do it again, bub!).  Sometimes, though, one has to live on the
cutting edge...

 > To make it relevant to Amanda-users, what's special about Saturday?
 > Are you only running backups once a week, or do you run a different
 > config then?

Glad you asked.  Here's our Crontab, with e-mail addresses removed:

  # $Id: Crontab,v 1.5 2002/08/16 21:04:55 amanda Exp $
  #
  # Crontab entries for automated backup with amanda.
  #
  # The backup schedule expects no tape Sa-Th, but degraded dumps are
  # performed.

  20 12,16 * * 0-4,6      /usr/sbin/amcheck -clm DailySet1
  30 22 * * 0-4,6         /usr/sbin/amdump DailySet1

  # The degraded dumps are flushed Fr a.m.  There is no mt offline
  # because the amcheck.[N] file does not get generated with amflush -f
  # (and then the output goes to stdout).  So the first subsequent
  # amcheck will be responsible for seeing if we finished the flush,
  # ejected the tape, and inserted the requisite tape.

  20 9 * * 5      /usr/sbin/amcheck -m DailySet1
  50 9 * * 5      /usr/sbin/amcheck -M... DailySet1
  30 10 * * 5     echo -e "\ny\n" | /usr/sbin/amflush DailySet1

  # Expect to label a new tape on the afternoon of the last Friday of
  # each month

I was rather proud of this little snippet of shell programming...  :)

  10 12 * * 5     [ `date -d "1 week" +\%m` != `date +\%m` ] && \
                  /usr/sbin/amlabel DailySet1 `date +DailySet1o\%Y\%m`
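
The test in the entry above compares the month one week from now with the current month; they differ only on the last occurrence of that weekday in the month. A minimal sketch of the same check as a standalone function follows. The name is_last_weekday_of_month and the YYYY-MM-DD argument are assumptions made for illustration (the crontab itself always tests "today"), and like the original it relies on GNU date's -d option. Note that the \% escaping is only needed inside a crontab, not in a script.

```shell
#!/bin/sh
# Sketch only: is_last_weekday_of_month is a hypothetical helper, not part
# of Amanda. It needs GNU date for -d, as the crontab entry does.

# Succeeds (exit 0) when the day one week after DAY falls in a different
# month, i.e. DAY is the last occurrence of its weekday in that month.
is_last_weekday_of_month() {
    day=$1    # assumed format: YYYY-MM-DD
    this_month=$(date -d "$day" +%m)
    next_week_month=$(date -d "$day +7 days" +%m)
    [ "$this_month" != "$next_week_month" ]
}

# Two consecutive Fridays in August 2002: only the second is the last one.
for d in 2002-08-23 2002-08-30; do
    if is_last_weekday_of_month "$d"; then
        echo "$d: label a new tape"
    else
        echo "$d: nothing to do"
    fi
done
```

Because the cron entry only fires on day-of-week 5, the month comparison alone is enough to pick out the last Friday.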

  # The full dump is run Fr late p.m.  Part of our current setup is to
  # make sure that the Perforce depot is properly set up for backup.

  20 12 * * 5     /usr/sbin/amcheck -m DailySet1
  30 12 * * 5     ls DailySet1/index | \
                  xargs -l /usr/sbin/amadmin DailySet1 force > \
                  /dev/null
  20 13-16 * * 5  /usr/sbin/amcheck -M... DailySet1
  20 22 * * 5     ./p4backup
  30 22 * * 5     /usr/sbin/amdump DailySet1; \
                  /bin/mt -f /dev/st0 offline

  # The new tape labeled on the last Friday of the previous month
  # should be marked as not reusable.

  10 6 1 * *      /usr/sbin/amadmin DailySet1 no-reuse \
                  `date -d yesterday +DailySet1o\%Y\%m`
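
The no-reuse entry depends on "yesterday", as seen from the 1st of a month, still being in the month whose last Friday produced the label. A small sketch of that pairing follows, again assuming GNU date; the helper names label_for_day and label_to_retire are hypothetical, and the dates are passed as arguments only so the behaviour can be checked without waiting for the calendar.

```shell
#!/bin/sh
# Sketch only: both helpers are hypothetical and need GNU date for -d.

# Label that the last-Friday amlabel entry would create on DAY (YYYY-MM-DD).
label_for_day() {
    date -d "$1" +DailySet1o%Y%m
}

# Label that the no-reuse entry would name when run on DAY: it formats the
# previous day, which on the 1st of a month is still in the prior month.
label_to_retire() {
    date -d "$1 -1 day" +DailySet1o%Y%m
}

# Labeled on the last Friday of August 2002, retired on 1 September 2002.
echo "labeled: $(label_for_day 2002-08-30)"
echo "retired: $(label_to_retire 2002-09-01)"
```

The same reasoning holds across a year boundary, since the day before 1 January is in December of the previous year.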

This should answer your question as well as display some ideas about
Amanda automation.  It works very well so far.

I thought about multiple configs, but then the coordination of indexes,
dump dates, etc. just gets needlessly complex.  So I just bully around
a single config.

 > Good luck,
 > Frank
 >
 > --On Monday, August 19, 2002 13:15:39 -0500 Jonathan Johnson <Jonathan.Johnson AT MinnetonkaSoftware DOT com> wrote:
 >
 > > <snip>

--
 /       Jonathan R. Johnson       | "Every word of God is flawless." \
 |    Minnetonka Software, Inc.    |                 -- Proverbs 30:5 |
 \ johnsonj AT MinnetonkaSoftware DOT com |  My own words only speak for me. /



--
Frank Smith                                      fsmith AT hoovers DOT com
Systems Administrator                                     Voice: 512-374-4673
Hoover's Online                                             Fax: 512-374-4501