ADSM-L

Re: Yikes LTO2 problem!?!

2003-06-25 14:22:31
Subject: Re: Yikes LTO2 problem!?!
From: Matthew Glanville <matthew.glanville AT KODAK DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Wed, 25 Jun 2003 14:22:04 -0400
Hmm,
  There are a lot of IBMtape SENSE DATA messages...

I couldn't find anyone in support who knew where to send this problem, so I
downloaded the IBMtape manuals and the SCSI reference manual and tried to
tackle it myself.
I turned the IBMtape trace value up to 2.

Wow, those SENSE DATA messages that were occurring now report an additional
line, which is a bit easier to read and correlate with the codes in the SCSI
reference manual:
chk_sns: cmd 0xa(write), key/asc/asq 0x4/4b/0, defer 0, retry 301, rc 5
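
For anyone who wants to pick these lines apart with a script, here is a rough
sketch in Python. The regular expression is modelled only on the single sample
line above, so other IBMtape levels may format the line differently. The
function name parse_chk_sns and the field names are my own, not anything from
IBMtape. The sense key names are the standard SCSI ones.

```python
import re

# Pattern modelled on the one sample line in this post (assumption: other
# IBMtape versions may lay the line out differently).
CHK_SNS = re.compile(
    r"chk_sns: cmd 0x(?P<cmd>[0-9a-f]+)\((?P<name>\w+)\), "
    r"key/asc/asq 0x(?P<key>[0-9a-f]+)/(?P<asc>[0-9a-f]+)/(?P<ascq>[0-9a-f]+), "
    r"defer (?P<defer>\d+), retry (?P<retry>\d+), rc (?P<rc>\d+)"
)

# Standard SCSI sense key names (subset).
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
}


def parse_chk_sns(line):
    """Return the fields of a chk_sns trace line as a dict, or None."""
    m = CHK_SNS.search(line)
    if m is None:
        return None
    key = int(m["key"], 16)
    return {
        "cmd": int(m["cmd"], 16),
        "name": m["name"],
        "key": key,
        "key_name": SENSE_KEYS.get(key, "UNKNOWN"),
        "asc": int(m["asc"], 16),
        "ascq": int(m["ascq"], 16),
        "defer": int(m["defer"]),
        "retry": int(m["retry"]),
        "rc": int(m["rc"]),
    }


sample = "chk_sns: cmd 0xa(write), key/asc/asq 0x4/4b/0, defer 0, retry 301, rc 5"
fields = parse_chk_sns(sample)
print(fields["name"], fields["key_name"], hex(fields["asc"]), fields["retry"])
```

Sense key 0x4 is HARDWARE ERROR in the SCSI standard; the ASC/ASCQ pair still
has to be looked up in the drive's SCSI reference manual.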

Ugh, all those obscure messages that come out at trace level 0: summarize 'em!
write_error, read_error, etc... ugh.

Also, I believe the problem errors are ones that TSM doesn't catch as
errors.... 0x4/4b/0 seems to be the culprit...

A breakdown of the various messages is as follows:
13490 total SENSE DATA messages
  13216 of 0xa(write) 0x4/4b/0, which are some sort of write error (not
  being reported back to TSM):
      7817 on drive 2 (out of 8 drives)
      5395 on drive 1
         4 on drive 3 (I disabled this drive early in the testing because
           of write errors reported back to TSM; it is waiting to be
           checked)
  6 of 0xa(write) 0x4/44/0, which correlate to write errors reported back
  to TSM, many on drives 3 and 2.
  The rest are read errors on various drives, which correlate to the read
  errors occurring in TSM.
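
A tally like the one above could be produced with something along these lines.
This is only a sketch: the pattern is again based on the one chk_sns sample
line, and the function name tally is made up for illustration. The sample line
does not show which drive it came from, so a per-drive split would need the
device name from whatever surrounds the line in the real log, which I have not
tried to guess here.

```python
import re
from collections import Counter

# Captures the command name and the key/asc/ascq triple from a chk_sns line
# (assumption: the layout matches the single sample shown in this post).
SIG = re.compile(
    r"cmd 0x[0-9a-f]+\((\w+)\), key/asc/asq (0x[0-9a-f]+/[0-9a-f]+/[0-9a-f]+)"
)


def tally(lines):
    """Count chk_sns lines per 'command key/asc/ascq' signature."""
    counts = Counter()
    for line in lines:
        m = SIG.search(line)
        if m:
            counts[f"{m.group(1)} {m.group(2)}"] += 1
    return counts


# Tiny made-up input, just to show the shape of the result.
demo = [
    "chk_sns: cmd 0xa(write), key/asc/asq 0x4/4b/0, defer 0, retry 301, rc 5",
    "chk_sns: cmd 0xa(write), key/asc/asq 0x4/4b/0, defer 0, retry 301, rc 5",
    "chk_sns: cmd 0xa(write), key/asc/asq 0x4/44/0, defer 0, retry 301, rc 5",
    "a line that is not a chk_sns message",
]
for signature, count in tally(demo).most_common():
    print(count, signature)
```

As a cross-check on the figures above: 7817 + 5395 + 4 = 13216, and
13490 - 13216 - 6 leaves 268 messages for the read errors.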

I am going to do a bit more correlation and try to figure out which
drives are reporting the errors; it seems to be just 1, 2 and 3...  hmmm.

Also I was eventually able to get IBM support to create a ticket and the
sense errors were sent to them.
I will find out what they say.

Matthew Glanville


Dave Frost wrote:

      Have you had any interesting records in /var/adm/messages that you can
      match up to when one or more of the tapes was mounted?  (ANR8468I volume
      <x> dismounted is a good search key).

      We have only seen this on san-attached devices, and then only when a RSCN
      has occurred on the fabric.  During reads or writes a block will be
      silently dropped.  Reads are recoverable...
