ADSM-L

Re: TAPE_SIM_MIM_RECORD errors on AIX

2001-02-20 19:50:23
From: David Bronder <david-bronder AT UIOWA DOT EDU>
Date: Tue, 20 Feb 2001 18:50:47 -0600
Brian Murphy wrote:
>
> I corrected the AAAx output by using the errupdate and errinstall parts of
> the Atape installation procedure.
>
> I am wondering, though: if these errors are normal, does that mean that
> they are nothing to worry about, and that I should alter the error
> template to turn logging of these errors off?
>
> Does anyone know if there is ever cause to pay attention to these errors?
>
> Should I perhaps also do this for the TAPE_DRIVE_CLEANING errors logged as
> the tape library is dealing with this itself?

The cleaning errors are normal.  Generally you'll see two SIM_MIM_RECORD_3590
messages and the TAPE_DRIVE_CLEANING message logged to errpt.  The cleaning
messages (as long as they're not too frequent) are just informative.

The SIM/MIM messages can indicate many different conditions.  To tell for
sure, you need to decode the sense data in the log entry.  If you have a
hardware reference for your tape drive, you can do that by hand.  The two
that accompany a drive cleaning correspond to the "needs cleaning" notice
and the "just been cleaned" notice.
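
As a rough illustration of decoding by hand, here is a minimal Perl sketch
that pulls two of the code fields out of a sense-data hex dump.  The byte
offsets (20-21 and 24) and the fact that the codes are stored as ASCII
characters are taken from the script at the end of this message; the helper
name sim_fields and the sample hex string are made up for illustration and
are not real errpt output.

```perl
use strict;
use warnings;

# Hypothetical helper: split a sense-data hex dump into bytes and return
# the ASCII characters at offsets 20-21 (SIM message code) and 24
# (exception message code).  Returns an empty list if the dump is too short.
sub sim_fields {
    my ($hex) = @_;
    $hex =~ s/\s+//g;                          # drop spaces and newlines
    my @bytes = $hex =~ /([0-9A-Fa-f]{2})/g;   # one element per byte
    return unless @bytes > 24;                 # need bytes 0 through 24
    my $msg = join '', map { chr(hex($_)) } @bytes[20, 21];
    my $exc = chr(hex($bytes[24]));
    return ($msg, $exc);
}

# Made-up sample: 20 zero bytes, then "55" (0x35 0x35), two filler
# bytes, then "8" (0x38) at offset 24.
my $sample = ('0000 ' x 10) . '3535 0000 3800';
my ($msg, $exc) = sim_fields($sample);
print "SIM message code: $msg, exception code: $exc\n";
```

In the script's lookup table, message code "55" is "Drive Needs Cleaning"
and exception code "8" is "Cleaning Required", so a record like this sample
would be the first half of a normal cleaning pair.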

Other SIM/MIM messages can indicate more serious problems, so you should
investigate any that don't fit the SIM_MIM + CLEANING + SIM_MIM pattern.
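
If you want to check that pattern mechanically, something like the sketch
below might do.  It assumes you have already pulled the error labels for a
single drive into a list, and that the three entries of a cleaning event
sit next to each other; the helper name unexplained_sim_mim and the sample
list are hypothetical.  Because the pattern is symmetric, it does not
matter whether the list is oldest-first or newest-first.

```perl
use strict;
use warnings;

# Hypothetical helper: given the error labels for one drive, in log order,
# return the indexes of SIM/MIM records that are NOT part of a
# SIM_MIM + CLEANING + SIM_MIM triple.
sub unexplained_sim_mim {
    my @labels = @_;
    my %ok;   # indexes of SIM/MIM records explained by a cleaning
    for my $i (1 .. $#labels - 1) {
        next unless $labels[$i] eq 'TAPE_DRIVE_CLEANING';
        if ($labels[$i - 1] eq 'SIM_MIM_RECORD_3590'
            && $labels[$i + 1] eq 'SIM_MIM_RECORD_3590') {
            $ok{$i - 1} = $ok{$i + 1} = 1;
        }
    }
    return grep { $labels[$_] eq 'SIM_MIM_RECORD_3590' && !$ok{$_} }
           0 .. $#labels;
}

# Made-up sample: one normal cleaning triple followed by a stray SIM/MIM.
my @labels = qw(SIM_MIM_RECORD_3590 TAPE_DRIVE_CLEANING
                SIM_MIM_RECORD_3590 SIM_MIM_RECORD_3590);
my @suspect = unexplained_sim_mim(@labels);
print "Investigate entries: @suspect\n";
```

Anything it flags still needs its sense data decoded to find out what the
drive is actually complaining about.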

You (or others) may find the script included below useful for investigating
SIM/MIM errors.  I wrote it after getting tired of decoding them by hand
for our 3590 drives.  It ignores the cleaning messages by default.  If
you're using other IBM drive models, you'll have to modify it accordingly.
It's not perfect or complete, but it works as long as you don't try to do
too many funky things with errpt options. :) (Requires Perl, probably v5.)

=Dave

--
Hello World.                                    David Bronder - Systems Admin
Segmentation Fault                                     ITS-SPA, Univ. of Iowa
Core dumped, disk trashed, quota filled, soda warm.   david-bronder AT uiowa DOT edu

=====
#!/usr/local/bin/perl

#
# simmim - Parse errpt SIM/MIM records for cleaning and other messages.
#
# Usage:  simmim [-all] [errpt arguments]
#   where [-all] shows all output (including cleaning messages)
#     and [errpt arguments] can be any (sane) errpt argument except -j and -a
#
# History:
#   20000114    Initial version
#   20000201    Show only non-cleaning messages by default ("--all" or
#               any errpt options will override)
#   20000203    Show all messages only with the "-l" errpt option or with
#               the "-all" option
#   20000404    Include drive firmware level in output
#

my $opts = join(' ', @ARGV);
# Show all records if "-all"/"--all" or errpt's "-l" option is present
my $all = $opts =~ /(?:^|\s)(?:-l|--?all\b)/;
$opts =~ s/-+all\s*//;  # strip the "-all" if it's there

my $command = "/usr/bin/errpt -j D1A1AE6F -a $opts";
my $sep = '-' x 75 . "\n\n";

my ($data, @data);  # XXX - Poor $data gets badly overloaded...
my ($output, $count, $problem);

# Lookup table for SIM/MIM codes
# From IBM 3590 Hardware Reference, GA32-0331-01
my %lkeys  = ( "20" => "SIM Message Code", "24" => "Exception Message Code",
               "25" => "Service Message Code",
               "26" => "Service Message Severity Code" );
my %lookup = ( "20" => {
                 ( "00" => "No Message",
                   "41" => "Device Degraded - Call for Service",
                   "42" => "Device Hardware Failure - Call for Service",
                   "43" => "Service Circuits Failed, Operations Not Affected"
                           . " - Call for Service",
                   "55" => "Drive Needs Cleaning:  Load Cleaning Cartridge",
                   "57" => "Drive Has Been Cleaned" ) },
               "24" => {
                 ( "1"  => "Effect of Failure is Unknown",
                   "2"  => "Device Exception - No Performance Impact",
                   "3"  => "Exception on SCSI Interface (see bytes 28-29)",
                   "4"  => "Device Exception on ACF",
                   "5"  => "Device Exception on Operator Panel",
                   "6"  => "Device Exception on Tape Path",
                   "7"  => "Device Exception in Drive",
                   "8"  => "Cleaning Required",
                   "9"  => "Cleaning Done" ) },
               "25" => {
                 ( "1"  => "Repair Impact is Unknown",
                   "7"  => "Repair Will Disable Access to Device",
                   "9"  => "Clean Device",
                   "A"  => "Device Cleaned" ) },
               "26" => {
                 ( "0"  => 'SIM severity code "Service"',
                   "1"  => 'SIM severity code "Moderate"',
                   "2"  => 'SIM severity code "Serious"',
                   "3"  => 'SIM severity code "Acute"' ) } );

# Read the system error log
open(ERRPT, "$command |") || die "Can't run '$command': $!\n";

while(<ERRPT>) {
  # Skip some common lines we know we won't want
  next if /^[- ]*$/;

  $output .= $_, next if s/^(Date.Time:)\s*(.*)/$1                      $2/;
  $output .= $_, next if s/^(Sequence Number:)\s*(.*)/$1                $2/;
  $output .= $_, next if s/^(Resource Name:)\s*(.*)/$1                  $2/;
  if (/.*Type and (Model)\.*0*(\w+)/) {
    $output .= " " x 8 . "$1:" . " " x 20 . "$2\n";
    next;
  }
  if (/.*(Serial Number)\.*0*(\w+)/) {
    $output .= " " x 8 . "$1:" . " " x 12 . "$2\n";
    next;
  }
  if (/.*Device Specific\.\(FW\)\.*(\S+)/) {
    $output .= " " x 8 . "Firmware Level:" . " " x 11 . "$1\n";
    next;
  }

  # Get to the meat of the data, and chew it up
  if (/^DIAGNOSTIC EXPLANATION/) {
    undef $data;
    while(<ERRPT>) {
      last if /^-+$/;
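      chomp;  $data .= "$_ ";  # keep groups on adjacent lines separate
    }
    $data =~ s/(\w\w)(\w\w)/$1 $2/g;  # split 4-digit groups into bytes
    @data = split(' ', $data);         # one hex byte per element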

    # Now print it nice and pretty
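    foreach my $byte (sort keys %lkeys) {
      # Each code is stored as an ASCII character; the SIM message code
      # at byte 20 is two characters long (bytes 20-21)
      $data = chr(hex($data[$byte]));
      $data .= chr(hex($data[$byte+1])) if ($byte == 20);
      #print "byte: $byte  data(hex): $data[$byte]  data(asc): $data\n";
      # Exception codes "8" (0x38) and "9" (0x39) are the cleaning notices;
      # anything else marks the record as a problem worth showing
      $problem++ if (($byte == 24) && ($data[$byte] !~ /^3[89]$/));
      $data = $lookup{$byte}{$data} || "Reserved or Unknown";
      $output .= sprintf("%-31s %s\n", $lkeys{$byte} . ":", $data);
    }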

    if ($problem || $all) {
      print("$sep$output\n");
      $count++;
    }
    $problem = 0;  $output = "";
  }
}
print "$sep  $count record(s) printed\n\n" if $count;

close(ERRPT) || die "Error running '$command': $!\n";