ADSM-L

Re: Media Fault

2003-06-30 12:38:39
Subject: Re: Media Fault
From: Roger Deschner <rogerd AT UIC DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Mon, 30 Jun 2003 11:38:17 -0500
I am having a pattern of this as well. I think that some tapes in my
library are being used much more heavily than others, and they are
simply wearing out.

I started out in December 2002 with all new SDLT tapes and a brand-new
library with new drives. Since then 3 of the new tapes have had this
"media fault" condition. As I am researching this, I am finding that
they are among the most heavily used tapes.

ITSM does not give us any decent statistics on this. Usage numbers are
kept for a "volume", not for a "libvol", which is a serious shortcoming.
This means that these usage and error statistics are lost each time a
tape is reclaimed and returned to the scratch pool.

What I am finding on these failed tapes, by scanning the activity log,
is that they are used and reclaimed very frequently. This is due to the
nature of the clients being backed up to that storage pool tree - they
are email servers, so there is a very high turnover of files on those
clients. Email-server clients would appear to wear out the tapes used to
back them up.
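
The activity-log scan described above can be sketched roughly as follows.
This is only an illustration, not the procedure actually used: it assumes
the activity log has been saved as plain text, and that each tape mount
appears as a line like "ANR8337I LTO volume 000024 mounted in drive DRIVE2
(mt2.0.0.5)." The message number, the line layout, and the function name
count_mounts are all assumptions to adjust against your own log.

```python
import re
from collections import Counter

# Assumed shape of a mount message in the exported activity log, e.g.
#   06/29/2003 02:14:07 ANR8337I LTO volume 000024 mounted in drive DRIVE2 (mt2.0.0.5).
# Both the message number and the layout are assumptions; check your own log.
MOUNT_RE = re.compile(r"ANR8337I\s+\S+\s+volume\s+(\S+)\s+mounted in drive")


def count_mounts(lines):
    """Return a Counter mapping tape label -> number of mount messages seen."""
    counts = Counter()
    for line in lines:
        match = MOUNT_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "06/28/2003 01:00:00 ANR8337I LTO volume 000024 mounted in drive DRIVE2 (mt2.0.0.5).",
        "06/28/2003 03:00:00 ANR8337I LTO volume 000031 mounted in drive DRIVE1 (mt1.0.0.5).",
        "06/29/2003 01:00:00 ANR8337I LTO volume 000024 mounted in drive DRIVE3 (mt3.0.0.5).",
    ]
    # List the most heavily mounted tapes first.
    for label, mounts in count_mounts(sample).most_common():
        print(label, mounts)
```

Because the counts are keyed by the tape label rather than by the storage
pool volume, they survive a tape's trips through the scratch pool, which is
the per-libvol view the server itself does not keep.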

I'm also suspicious that my tape-to-tape reclamation causes less than
optimal tape movement, by interfering with streaming. My planned
addition of a reclamation disk storage pool may actually help extend
tape media life, by allowing the drive to stream on both input and
output.

I'm actually glad to hear this happens with LTO as well - I was
beginning to suspect my choice of SDLT over LTO was a mistake.

Roger Deschner      University of Illinois at Chicago     rogerd AT uic DOT edu


On Sun, 29 Jun 2003, Anwer Adil wrote:

>I am getting the following error:
>
>ANR8302E I/O error on drive DRIVE2 (mt2.0.0.5) (OP=LOCATE, Error Number=23,
>CC=0, KEY=03, ASC=14, ASCQ=00,
>SENSE=70.00.03.00.00.00.00.1C.00.00.00.00.14.00.06.00.20.80.00.00.00.00.00.00.
>-
>00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.
>-
>00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.
>-
>00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00,
>Description=An undetermined error has occurred).  Refer to Appendix D in
>the 'Messages' manual for recommended action.
>ANR8359E Media fault detected on LTO volume 000024 in drive DRIVE2
>(mt2.0.0.5) of library IBM3583.
>
>Over the period of one month, TSM has reported media faults on 5 different
>tapes. Every week, a different tape reports an error. IBM CEs checked the
>hardware to ensure that the tape drives and the library have the latest
>firmware. They also ran a diagnostic on the library and didn't find a
>problem. The tape drives appear to be fine, since the same tape reports
>errors on different drives. I end up restoring the bad tape from the
>second copy every time I see this problem.
>
>I am using an IBM3583 library with 3 LTO drives. The TSM version is 5.1.1.0.
>
>Please help as I am clueless at this point.
>
>Anwer Adil
>
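
For reference, the KEY/ASC/ASCQ values in the quoted ANR8302E message can be
read straight out of the SENSE bytes. The sketch below assumes the dotted
hex string is standard SCSI fixed-format sense data (response code 70 or
71), where byte 2 carries the sense key and bytes 12 and 13 carry ASC and
ASCQ. The function name and the small lookup tables are made up for this
illustration and cover only a few codes. Under the SCSI definitions, key 03
is MEDIUM ERROR and ASC/ASCQ 14/00 is "recorded entity not found".

```python
# Partial lookup tables, for illustration only (names follow the SCSI standard).
SENSE_KEYS = {
    0x00: "NO SENSE",
    0x01: "RECOVERED ERROR",
    0x02: "NOT READY",
    0x03: "MEDIUM ERROR",
    0x04: "HARDWARE ERROR",
}
ASC_ASCQ = {
    (0x14, 0x00): "RECORDED ENTITY NOT FOUND",
}


def decode_fixed_sense(dotted):
    """Decode key, ASC and ASCQ from dotted-hex fixed-format sense data."""
    data = bytes(int(part, 16) for part in dotted.split(".") if part)
    if len(data) < 14:
        raise ValueError("need at least 14 sense bytes")
    if (data[0] & 0x7F) not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    key = data[2] & 0x0F          # low nibble of byte 2 is the sense key
    asc, ascq = data[12], data[13]
    return {
        "key": key,
        "key_name": SENSE_KEYS.get(key, "unknown"),
        "asc": asc,
        "ascq": ascq,
        "asc_name": ASC_ASCQ.get((asc, ascq), "unknown"),
    }


if __name__ == "__main__":
    # First 18 bytes of the SENSE string from the quoted error message.
    print(decode_fixed_sense("70.00.03.00.00.00.00.1C.00.00.00.00.14.00.06.00.20.80"))
```

A medium error on a LOCATE is consistent with the media fault that the
server reports, but it does not by itself say whether the tape or the drive
is at fault.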
