ADSM-L

TSM inventory of empty slots becomes inconsistent with IBM3584 library.

2006-11-24 12:50:16
Subject: TSM inventory of empty slots becomes inconsistent with IBM3584 library.
From: "Schneider, John" <schnjd AT STLO.MERCY DOT NET>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Fri, 24 Nov 2006 11:49:30 -0600
Greetings,
        We are running TSM 5.3.2.0 under AIX 5.3ML1.  We have 6 TSM
instances on two AIX hosts, sharing an IBM3584 library with 14 3592 drives,
using TSM tape library sharing.   This is an environment I inherited about
3 months ago, so I don't know if my problem has existed in the past, or if
it has been inconspicuous until now.
        Recently we have begun to have a problem where TSM is checking in
scratch tapes coming back from offsite, and putting them in slots where
other tapes belong.  In other words, a tape gets mounted in a drive, and
therefore its slot becomes temporarily empty.  Later, TSM checks in scratch
tapes, and puts some of them into those slots that are temporarily empty.
At least, this is what I think is happening.  This is open under PMR # 42313
Br, by the way.  The result is:

11/22/06 17:23:39     ANR8300E I/O error on library IBM3584 (OP=00006C03,
                       CC=315, KEY=05, ASC=3B, ASCQ=0D, SENSE=70.00.05.00.00.00
                       .00.0A.00.00.00.00.3B.0D.00.C0.00.06., Description=The
                       destination slot or drive was full in an attempt to move
                       a volume).  Refer to Appendix D in the 'Messages' manual
                       for recommended action. (SESSION: 60475)

11/22/06 17:23:39     ANR8469E Dismount of 3592 volume 040031 from drive
                       DRIVE257 (/dev/rmt14) in library IBM3584 failed.
                       (SESSION: 60475)

11/22/06 17:23:44     ANR8358E Audit operation is required for library IBM3584.
                       (SESSION: 60606)
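
For reference, the KEY, ASC and ASCQ values in the ANR8300E message can be read straight out of the SENSE bytes. The sketch below assumes the bytes are fixed-format SCSI sense data (response code 70h), where the sense key is the low nibble of byte 2 and ASC/ASCQ are bytes 12 and 13. The function name `decode_sense` and the two lookup tables are illustrative only, and the tables cover just the codes seen in this log.

```python
# Sketch: decode the fixed-format SCSI sense bytes quoted in ANR8300E.
# The lookup tables are deliberately partial (only the codes seen here).
SENSE_KEYS = {0x05: "ILLEGAL REQUEST"}
ASC_ASCQ = {(0x3B, 0x0D): "MEDIUM DESTINATION ELEMENT FULL"}


def decode_sense(dotted):
    """Decode a dotted hex string such as '70.00.05...' into key/ASC/ASCQ."""
    # The log prints a trailing dot, so skip empty fragments.
    data = bytes(int(part, 16) for part in dotted.split(".") if part)
    if len(data) < 14 or (data[0] & 0x7F) not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    key = data[2] & 0x0F            # sense key: low nibble of byte 2
    asc, ascq = data[12], data[13]  # additional sense code and qualifier
    return {
        "key": key,
        "asc": asc,
        "ascq": ascq,
        "key_text": SENSE_KEYS.get(key, "unknown"),
        "asc_text": ASC_ASCQ.get((asc, ascq), "unknown"),
    }


sense = "70.00.05.00.00.00.00.0A.00.00.00.00.3B.0D.00.C0.00.06."
result = decode_sense(sense)
print(result["key_text"], "/", result["asc_text"])
```

The decoded values agree with the Description text in the message: the library refused the move because the destination element was already occupied.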


So the dismount fails.  I don't know what the tape library does with the
tape at that point, but apparently it puts the tape in some other slot, and
TSM loses track of where it is.
This happens to several tapes over the course of time, and then we start
getting:

11/22/06 18:05:46     ANR8356E Incorrect volume 040478 was mounted instead of
                       volume 020513 in library IBM3584. (SESSION: 60755)

11/22/06 18:06:10     ANR8381E 3592 volume 020513 could not be mounted in drive
                       DRIVE263 (/dev/rmt7). (SESSION: 60755)

11/22/06 18:06:10     ANR9790W Request to mount volume 020513 for library client
                       MDCTSM01 failed. (SESSION: 60755)
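
To see how many tapes have been displaced, one option is to pull every ANR8356E out of a saved copy of the activity log. This is only a sketch: it assumes the log text has the wording shown above, and `find_mismounts` is a hypothetical helper name, not part of TSM.

```python
import re

# Matches the text of ANR8356E once wrapped lines have been joined.
ANR8356E = re.compile(
    r"ANR8356E Incorrect volume (\S+) was mounted instead of\s+volume (\S+) "
    r"in library (\S+?)\."
)


def find_mismounts(log_text):
    """Return (found, expected, library) for each ANR8356E in the log."""
    joined = " ".join(log_text.split())  # undo the activity log's line wrapping
    return ANR8356E.findall(joined)


sample = """
11/22/06 18:05:46     ANR8356E Incorrect volume 040478 was mounted instead of
                       volume 020513 in library IBM3584. (SESSION: 60755)
"""
mismounts = find_mismounts(sample)
for found, expected, library in mismounts:
    print(f"{library}: expected {expected}, found {found}")
```

Each pair names a slot whose recorded occupant differs from the tape actually sitting in it.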


When we first started to see this problem we were at library microcode level
5500 (very old), so we have subsequently upgraded to microcode 6480, which
IBM recommended.  It has not fixed the problem.  IBM is running out of
suggestions, I think.

We can successfully do an Audit Library (after dismounting all tapes from
the drives first), then check in (in search mode) any scratch and private tapes
that TSM's inventory didn't know about.  Then the inventory will be OK for a
few days, and then the problem comes back again.  It is particularly
annoying because when we do our daily checkout of offsite tapes, sometimes
TSM checks out the wrong tapes, I guess because they are in the wrong slots.
Then we have to check them back in again, and check out the tapes that were
supposed to check out to begin with.

One thing I thought of is deleting all the paths and the library completely,
and redefining the library from scratch.  My thinking is that maybe TSM's
map of element numbers somehow doesn't match what is really in the library,
or has become corrupted.  Redefining it would correct that.
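
One way to test that theory before redefining anything would be to compare TSM's recorded home element for each volume with what the library itself reports for that element. The sketch below works on two plain dictionaries. How they get populated is left open (for example from QUERY LIBVOLUME output on the TSM side and the library's own inventory on the other). The function name `slot_conflicts` and the element numbers 1025 and 1026 are made up for illustration.

```python
def slot_conflicts(tsm_home, library_slots):
    """Compare TSM's home-element map with what the library reports.

    tsm_home:      {volume: element} as TSM believes (hypothetical input)
    library_slots: {element: volume} as the library reports (hypothetical input)
    Returns (volume, element, occupant) for every disagreement; occupant is
    None when the slot is empty, which may simply mean the tape is mounted.
    """
    conflicts = []
    for volume, element in sorted(tsm_home.items()):
        occupant = library_slots.get(element)
        if occupant != volume:
            conflicts.append((volume, element, occupant))
    return conflicts


# Element numbers below are invented; only the volume names come from the log.
tsm_home = {"020513": 1025, "040031": 1026}
library_slots = {1025: "040478"}

conflicts = slot_conflicts(tsm_home, library_slots)
for volume, element, occupant in conflicts:
    print(f"element {element}: TSM expects {volume}, library has {occupant}")
```

A slot reported as holding a different volume is the interesting case. An empty slot needs a check against the list of currently mounted tapes before it counts as a real mismatch.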

Managing this problem is taking considerable time.   Any suggestions would
be appreciated.

Best Regards,

John D. Schneider
Sr. System Administrator - Storage
Sisters of Mercy Health System
3637 South Geyer Road
St. Louis, MO.  63127
Email:  schnjd AT stlo.mercy DOT net
Office: 314-364-3150, Cell:  314-486-2359
