ADSM-L

Re: 3584 Tape Library issue (maybe)

2005-01-19 07:01:33
Subject: Re: 3584 Tape Library issue (maybe)
From: "Lepre, James" <JLEPRE AT NECA DOT ORG>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Wed, 19 Jan 2005 07:00:50 -0500
How do you have the drives set up inside of TSM? Do you autodetect, or
do you put the serial numbers in? If you use serial numbers, then check
to see if the drives are in the right order.

lsdev -Cc tape

Then do an lscfg -vl rmtX to get the serial number, and match it up
with the correct order of drives.
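
A rough sketch of those two steps as a script, in case it saves some
typing with 27 drives per library. It assumes Atape-style rmtX device
names and the usual "Serial Number......<value>" line in the lscfg
output; the function name serial_from_lscfg is made up for this example.
Compare the resulting list with the serial numbers TSM holds for each
drive.

```shell
#!/bin/sh
# Pull the serial number out of `lscfg -vl rmtX` output read on stdin.
# Assumption: the VPD line looks like "Serial Number......<value>".
serial_from_lscfg() {
    sed -n 's/^[[:space:]]*Serial Number\.*//p' | head -n 1
}

# Only query devices when the AIX commands are actually present.
if command -v lsdev >/dev/null 2>&1 && command -v lscfg >/dev/null 2>&1; then
    # Keep only the rmtX entries from the tape class listing.
    for dev in $(lsdev -Cc tape | awk '/^rmt/ {print $1}'); do
        printf '%s %s\n' "$dev" "$(lscfg -vl "$dev" | serial_from_lscfg)"
    done
fi
```

Each output line is a device name followed by its serial number, so the
list can be read side by side with the drive definitions in TSM.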

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Nathan Reiss
Sent: Tuesday, January 18, 2005 5:17 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: 3584 Tape Library issue (maybe)

I've got a weird problem.  This is a shot in the dark, but I'm going to
ask all the kind folks on the ADSM mailing list a weird question.  Maybe
someone out there will have some ideas.

Please bear in mind that we do have a PMR open with TSM support, a
service call with IBM Hardware CEs, and one with Brocade.

It appears that we've tracked the problem to the 3584s, but we're not
positive about that yet.

About every 2-4 days we have to power cycle all the drives, or at least
a good subset of them, in our 3584 libraries.  Both of the 3584s are
four frames each, and each has 27 LTO2 drives.  The tape SAN consists of
two Brocade M14s.  Each drive is plugged into an M14 directly (not into
another edge switch).  We thought that there might be a bad connection
between the two M14s, but last week we disabled the system boards in
both that were giving us some issues.  Then today the issue happened
again.

The TSM servers (library managers) that run the two libraries are both
at v5.2.4, on AIX 5.2 ML4.  There are about 15 TSM library clients that
talk to their respective library managers, plus somewhere around 50
storage agents.  All are at current TSM levels.  We are now at the
latest firmware on all the drives and the libraries as well.  We are at
Atape 8.4.9.0.

Today, when the problem was happening, we also couldn't eject tapes from
the tape drives in frame four of one of the libraries to empty slots.
(This was through the 3584's web interface, so TSM was not involved.)
It told me that there weren't any empty slots to put the tape into.  But
I could move the tape from that drive to drive 1 in frame 1, and then it
would eject it to an empty slot like normal.  It did this with five
tapes.

The two things we have done to temporarily fix the issue have been:

1. Restarting the TSM library managers.
2A. Power cycling the drives.  Sometimes just the two drives that are
the control paths into the library,
2B. and sometimes it appears to need every drive power cycled.

Since I was having trouble with ejecting tapes earlier and TSM was not
involved in that scenario, I am inclined to think that TSM really isn't
part of the root problem, but that it somehow gets confused and needs to
be restarted at times in order to, shall we say, clear its head.
Because it seems to affect the library even when not talking to TSM or
AIX, I don't think upgrading the Atape driver to whatever 9.X.X.X
version is out there would fix the problem.  But I'm open to arguments
that say I'm full of it there.

Does anybody out there have any ideas?

Thank you,

David N. Reiss
Unix/TSM System Engineer
Caterpillar, Inc.
(309)/494-3749
