ADSM-L

Re: TapeAlerts on LTO2 drives, tapes

2006-06-28 14:24:10
Subject: Re: TapeAlerts on LTO2 drives, tapes
From: Richard Sims <rbs AT BU DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Wed, 28 Jun 2006 14:23:33 -0400
Robin -

Don't forget to formulate a tracking Subject title for postings.  I
invented an indicative name for this thread.

We haven't heard from LTO customers on this, so I'll throw in some
hypotheticals...

Drives and tapes deteriorate over time, so some of what you're seeing
can be expected; but where problems are frequent, something's amiss.

Cartridge Memory chips, as in LTO, are written by close-proximity
radio frequency.  Being electronic and non-contact, that should last
"forever" and not yield the problems you are seeing...except where
the chips were of dubious quality as manufactured, or where they were
poorly secured in the cartridges and might work loose over time and
thus be out of position for the recording process.  If the latter
were true, I'd expect corresponding Read errors upon cartridge load;
but if a load operation is not getting that far (is a loose chip
physically interfering with that?), then who knows.  You might have
someone carefully take apart one of the hopeless, problem cartridges
and see what's going on inside.

Another factor is the old bugaboo, drive microcode.  That can change
the way operations occur.  The open LTO standard is supposed to
guard against incompatibilities, so such things should not occur.
You might experiment with one of your drives, boosting it to the
latest microcode level (if not already there) to see whether matters
improve.  (Search on "drivecode level" in the IBM database to find
recent microcode changes.  Keep in mind that changes might be made
without being publicized.)

Ultimately, you have to work with the drive vendor for relief, within
the terms of your contract.  Again, you have the benefit of the LTO
standard to insist upon compatibility across vendor drives and media.

   Richard Sims

On Jun 28, 2006, at 12:18 PM, Robin Sharpe wrote:

Hello TSMers,

I've been getting messages like the following on several drives and
for several cartridges:

06/28/06 10:16:58 ANR8948S Device /dev/rmt/16m, volume 205975 has
issued the following Critical TapeAlert: The tape just unloaded could
not write its system area successfully: 1. Copy data to another tape
cartridge. 2. Discard the old cartridge. (SESSION: 4514)
06/28/06 10:16:58 ANR8948S Device /dev/rmt/16m, volume 205975 has
issued the following Critical TapeAlert: The operation has failed
because the media cannot be loaded and threaded. 1. Remove the
cartridge, inspect it as specified in the product manual, and retry
the operation. 2. If the problem persists, call the tape drive
supplier help line. (SESSION: 4514)

We have been having more and more drive and tape problems over the
last six months or so, after a couple years of solid performance.  I'm
trying to figure out the root cause or causes.  I suspected drives,
because I also saw write errors on specific drives... we swapped them
out, and it seemed better for a while... now I don't see the write
errors, but I see the messages above.

Some clues... these seem to be occurring only on recently purchased,
new tapes.  We did change vendors about a year ago to get better
pricing.  I have to collect some stats on how many tapes have
problems, but my guess is only a dozen or so among the 4000 tapes we
purchased over the past year from this vendor.  The only other thing I
can think of is changing the library to autolabel.  Oh, and we
upgraded from TSM 5.2 to 5.3 in April '06.
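
One way the stats mentioned above might be collected is to tally the
ANR8948S messages per drive and per volume from saved activity log
output.  The following is only a minimal sketch, assuming the log has
been exported to plain text with each message on a single line; the
function name and the SAMPLE list are hypothetical, and the sample
lines are simply the two messages quoted earlier.

```python
import re
from collections import Counter

# Matches the device and volume fields of an ANR8948S TapeAlert message.
# Assumption: each message occupies one line of the exported log text.
ALERT_RE = re.compile(
    r"ANR8948S Device (?P<device>[^,\s]+), volume (?P<volume>\S+) has issued"
)


def tally_tapealerts(lines):
    """Count ANR8948S messages per device and per volume."""
    by_device = Counter()
    by_volume = Counter()
    for line in lines:
        match = ALERT_RE.search(line)
        if match:
            by_device[match["device"]] += 1
            by_volume[match["volume"]] += 1
    return by_device, by_volume


# The two messages from this posting, each joined onto one line.
SAMPLE = [
    "06/28/06 10:16:58 ANR8948S Device /dev/rmt/16m, volume 205975 has "
    "issued the following Critical TapeAlert: The tape just unloaded could "
    "not write its system area successfully: 1. Copy data to another tape "
    "cartridge. 2. Discard the old cartridge. (SESSION: 4514)",
    "06/28/06 10:16:58 ANR8948S Device /dev/rmt/16m, volume 205975 has "
    "issued the following Critical TapeAlert: The operation has failed "
    "because the media cannot be loaded and threaded. 1. Remove the "
    "cartridge, inspect it as specified in the product manual, and retry "
    "the operation. 2. If the problem persists, call the tape drive "
    "supplier help line. (SESSION: 4514)",
]

if __name__ == "__main__":
    devices, volumes = tally_tapealerts(SAMPLE)
    for name, count in devices.most_common():
        print("device", name, count)
    for name, count in volumes.most_common():
        print("volume", name, count)
```

With a longer log, volumes that recur across several drives would
point toward the media, while drives that recur across many volumes
would point toward the drive.  For scale, a dozen problem tapes among
4000 is about 0.3%.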

Is this a tape problem (bad batch or low quality), or a drive or
library problem?

My environment:
TSM 5.3.2.0 on HP-UX 11i.  Big server (HP rp7410, 8 CPU, 12GB RAM, HP
XP512 RAID5 disk)
Five TSM servers on this box -- one library manager, four data
handlers.
The errors above are in the library manager's activity log.
STK L700 library w/ six DLT7000 (not being used) and 14 IBM LTO2.
618 slots, library is usually full.  We have about 3000 tapes onsite.
Library is doing tape cleaning, one universal cleaning tape in the
library.
Sometimes I see Tapealerts saying the cleaning tape is not data
grade... was TSM trying to write to it?  Should we let TSM do the
cleaning instead?

TIA
Robin Sharpe
Berlex Labs
