ADSM-L

Re: seek problems on LTO2 tapes

2004-09-13 18:06:19
Subject: Re: seek problems on LTO2 tapes
From: Dave Canan <ddcanan AT ATTGLOBAL DOT NET>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Mon, 13 Sep 2004 15:06:39 -0700
        Thanks Tim for passing this along. I have an additional question
for you. Are there any messages issued after a tape cartridge has been
corrupted but then remounted? Something like "directory being rebuilt?"
That would be great to know.
        Also, I have to do more research on this. I am not all that
familiar with TapeAlert. Do you need any additional software for this, or
do you just enable it with the SET TAPEALERTMSG ON command? The ANR8950W
messages come out to the activity log, but are there more detailed messages
displayed anywhere? Thanks for this information.


At 04:15 PM 9/13/2004 -0500, you wrote:
We had this issue. One way to identify the corrupted tapes, if you have a
TSM 5.2 or higher (or is it 5.2.2?) server with TapeAlert turned on, is to
run the checkin libv command with the checklabel=yes parameter.

TSM would then generate a message such as the following when it encountered
the bad tape:

ANR8950W Device x.x.x., volume LTXXXL2 has issued the following Warning
TapeAlert: The tape directory on the tape cartridge just unloaded has been
corrupted.  File search performance will be degraded.  The tape directory
can be rebuilt by reading all the data.

Granted, it takes a while to check in all of your tapes while reading the
labels, but this was much better than a user doing a restore and encountering
this issue (hours and hours and hours to restore a 10K file!).
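
Once the check-in has run, the affected volumes can be pulled out of an
exported copy of the activity log. The following is only a minimal sketch: the
pattern is based on the ANR8950W text quoted above, the exact layout of a real
log line is an assumption, and the function name, device names, volume names
and the second alert text in the sample are hypothetical.

```python
import re

# Pattern derived from the ANR8950W message quoted in this thread.
# The layout of a real activity log export is an assumption.
ANR8950W = re.compile(
    r"ANR8950W Device (?P<device>\S+), volume (?P<volume>\S+) has issued "
    r"the following (?P<severity>\w+) TapeAlert: (?P<text>.*)"
)


def corrupted_directory_volumes(log_text):
    """Return sorted volume names whose TapeAlert reports a corrupted tape directory."""
    volumes = set()
    # Split before every ANRnnnnX message id so each chunk holds one message.
    for chunk in re.split(r"(?=ANR\d{4}[IWES] )", log_text):
        # Collapse whitespace so a message wrapped over several lines still matches.
        flat = " ".join(chunk.split())
        match = ANR8950W.search(flat)
        if not match:
            continue
        text = match.group("text")
        if "tape directory" in text and "corrupted" in text:
            volumes.add(match.group("volume"))
    return sorted(volumes)


# Hypothetical sample input; the second alert text is a placeholder.
sample = """ANR8950W Device mt0.1.0.3, volume LT0001L2 has issued the following Warning
TapeAlert: The tape directory on the tape cartridge just unloaded has been
corrupted.  File search performance will be degraded.
ANR8950W Device mt0.1.0.3, volume LT0002L2 has issued the following Warning
TapeAlert: some other alert text.
"""

if __name__ == "__main__":
    print(corrupted_directory_volumes(sample))
```

The resulting list is just a starting point for deciding which tapes to
re-read; it does not touch the server or the library.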

Tim Rushforth
City of Winnipeg

-----Original Message-----
From: Dave Canan [mailto:ddcanan AT ATTGLOBAL DOT NET]
Sent: Monday, September 13, 2004 3:41 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: seek problems on LTO2 tapes

I would like to post this to the listserv. This is based on a note I
recently sent to a customer who was experiencing LTO performance issues:

Recently, an issue was discovered with LTO1 and LTO2 firmware where the CM
index could be corrupted on an LTO tape. The CM (cartridge memory) is a chip
in the tape cartridge itself that contains an index (or table of contents)
to the location of the files that have been written to the tape. The LTO
architecture is designed to automatically rebuild this index if it should
become corrupted. However, when a corrupted index is detected, performance
is slow while the index is rebuilt, because the tape must be re-read from
beginning to end. We have seen cases where an index was corrupted, fixed
the next time the tape was used, and then corrupted again at a later time.
The corruption occurs only in certain circumstances, and does not happen
every time a tape is used.

It is important to note that this is not a loss-of-data situation, only a
performance issue. Customers usually see it as a TSM performance problem
when reading data or copying it from one tape to another. One indication
that a tape is suspect is a slow reclamation or backup storage pool
process.

Fixing this problem is a two-part process. First, the firmware for the LTO
drives needs to be upgraded to the latest level (IBM is currently
recommending at least 4770). This prevents the tape CM indexes from being
corrupted again. Second, the corrupted CM indexes on the tapes must be
rebuilt. A customer can either let the indexes be rebuilt automatically
when the tapes are next re-read, or use a utility provided as part of the
LTO drivers (named tapeutil) to re-read each tape and rebuild its index.

The basic process to use this utility (I will go through the detailed
process with you over the phone) is as follows:

1. Start the tapeutil utility. With an NT TSM server, the utility is named
ntutil.exe
2. Select a tape drive device to use.
3. Move a tape to that drive.
4. Open the tape as read-only.
5. Use the "Space to End of Data" option to re-read the tape to the end and
rebuild the index.
6. Dismount the tape.
7. Repeat with the next tape.

It is hard to say how long a tape will take, because the more data that is
on the tape, the longer it will take to rebuild the index. Also, the number
of files on a tape is a factor as well. A tape having many small files will
take longer than a tape with a few larger files.


This outlines the issue. First, we must upgrade the firmware. Second, we
must fix the tapes that may have a corrupted index. It is hard to know
which tapes may be affected (although you know it when you see it). You
can let it take its course and let the tapes fix themselves over time,
or you can use the utility above to fix the tapes. This is your choice.


At 03:34 PM 9/13/2004 +0200, you wrote:
>Hi Bill!
>You say you are running the latest firmware on the drives, but did you see
>that a new version was released on September 7th?
>You can find it at ftp://ftp.storsys.ibm.com/358x/3583/4772L2S.zip
>I don't know what has been fixed in this release however. For some reason
>IBM always delivers a fixlist file with device drivers (like Atape)
>releases, but not for firmware releases...
>Kindest regards,
>Eric van Loon
>KLM Royal Dutch Airlines
>
>-----Original Message-----
>From: Bill Boyer [mailto:bill.boyer AT VERIZON DOT NET]
>Sent: Monday, September 13, 2004 15:21
>To: ADSM-L AT VM.MARIST DOT EDU
>Subject: seek problems on LTO2 tapes
>
>
>A continuation on a problem I posted earlier..we're not up, but it appears
>that when TSM issues the seek to a location on the tape, It's now taking
>almost 30-minutes for that to happen. In the mean time, the library/channel
>seems to hang.
>
>Question is....if the library can't do the seek, will it rewind the tape and
>read the data until the point of the seek? This is becoming very painful
>and has only started happening last Friday. Right now we have a complete new
>Windows2003 server, latest Ultrium drivers, TSM 5.2.3.0, a new 3583 library
>and drives..latest firmware on library and drives.. Everything except the
>tapes is completely new. Is there anything on the tape that would cause seek
>issues?
>
>Bill Boyer
>"An Optimist is just a pessimist with no job experience."  - Scott Adams
>
>

Dave Canan
TSM Performance
IBM Advanced Technical Support
ddcanan AT us.ibm DOT com
