ADSM-L

Re: 3592 tape read performance

2006-07-30 10:45:40
Subject: Re: 3592 tape read performance
From: "Darby, Mark" <Mark.Darby AT HQ.DOE DOT GOV>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Sun, 30 Jul 2006 10:44:31 -0400
In addition to Richard Sims' consistently excellent and insightful
response, one should also ensure that the microcode/firmware levels are
up to date - but "if it ain't broke, don't fix it" applies as well.

We have noticed significant behavioral differences between
microcode/firmware levels, and have experienced problems like the ones
you mention and resolved them with microcode/firmware updates.

We have had several media that appeared to become completely unreadable
(in 4-5 independent attempts on different drives) but which, following a
firmware/microcode update, exhibited no further problems.

We have also experienced extreme performance degradation (dropping to
KB/sec rates) on a brand new 3592 drive (a maintenance replacement)
that had to be immediately replaced (again) - and this problem was never
explained.


-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Thomas Denier
Sent: Friday, July 28, 2006 4:15 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: 3592 tape read performance


We are seeing increasingly frequent problems reading data from 3592
tapes. TSM sometimes spends as much as a couple of hours reading a
single file with a size of a few hundred megabytes. In some cases,
TSM reports a hardware or media error at the end of that time. In
other cases TSM eventually reads the file successfully. In the
latter case there are, as far as we can tell, no error indications
at all: no TSM messages, nothing logged by the OS, and no indicators
on the front panel of the tape drive. In some cases the same tape
volume suffers this type of problem repeatedly. The problems seem
to be spread roughly evenly over our whole population of 3592 drives.

We have just removed one 3592 volume from service because of
recurrent read problems, and are about to remove a second volume
from service. We only have about 120 3592 volumes, and losing two
of them within a week is disturbing, to put it mildly. The
possibility that the volumes with non-recurring (so far) problems
will eventually need replacement is even more disturbing.

Our TSM server is at 5.2.6.0, running under mainframe Linux. The
3592 tape drives are all the J1A model.
Does anyone have any suggestions for getting to the bottom of this?
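
For a rough sense of scale, the read times described above can be turned
into an average transfer rate. The sketch below is illustrative only: the
300 MB file size and 2 hour elapsed time are assumed stand-ins for "a few
hundred megabytes" and "a couple of hours", and the function name is
hypothetical rather than anything provided by TSM.

```python
def effective_rate_kb_per_sec(size_mb: float, elapsed_hours: float) -> float:
    """Return the average read rate in KB/sec, taking 1 MB = 1024 KB."""
    if elapsed_hours <= 0:
        raise ValueError("elapsed_hours must be positive")
    return (size_mb * 1024) / (elapsed_hours * 3600)


# Assumed example figures, not measurements from the thread.
rate = effective_rate_kb_per_sec(size_mb=300, elapsed_hours=2)
print(f"{rate:.1f} KB/sec")  # prints "42.7 KB/sec"
```

Under those assumptions the drive is averaging only tens of KB/sec, which
is the same order of magnitude as the degraded rates mentioned in the
reply.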
