ADSM-L

Re: [ADSM-L] TS3500 PROBLEM

2010-06-04 12:08:57
Subject: Re: [ADSM-L] TS3500 PROBLEM
From: "Prather, Wanda" <wPrather AT ICFI DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Fri, 4 Jun 2010 11:05:58 -0500
You don't say what type of drives; but check that when the library firmware was 
upgraded, the drive firmware was upgraded to a compatible level.  Rare, but 
I've seen cases where mismatched library/drive firmware will cause weird 
behavior.


-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
John D. Schneider
Sent: Friday, June 04, 2010 12:01 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] TS3500 PROBLEM

You don't say if ALL tape drives are refusing to mount, or if only
certain ones are experiencing the problem.

The "Reservation conflict" is the interesting part.  Rick talked about
deleting all the devices, and recreating everything, but it may just
turn around and happen again if you don't get to the root of the
Reservation conflict problem.

In my main support environment, we have 14 TSM instances and
Lan-free servers all sharing the same libraries.  Sometimes we have had
a similar problem because we have shut down TSM instances while they had
tapes mounted.  In an ideal world, you should stop all tape processes
and dismount all tape drives that are mounted by the Library Clients
before shutting any of them down.  But there are times when some sort of
problem arises and you just have to restart a TSM instance or Lan-free
agent, and can't afford the proper procedure.

When this happens, I have found that the TSM Library Client or Lan-free
agent sometimes loses track of what tape mounts it had before it was
shut down, so when it comes back up, it does not go through the normal
dismount process.  The normal dismount process will clear the SCSI
Reserve that the Library Client put on the tape drive to prevent other
servers from using it.  So although the TSM Library Master and Client
agree that the Client is done with the drive, when the Library Master
goes to perform a mount on it the next time, it can't open it because
the SCSI Reserve is still set on the drive.

There are two ways to clear it:

1) Power-cycle the tape drive, which will clear the SCSI reserve.  (This
might cause the drive to disappear from the SAN long enough to cause TSM
instances to lose track of it, so you might have to rediscover it (or
run cfgmgr).)  

2) Log on to the Library Client or Lan-free agent that failed to
dismount it properly, and clear the SCSI Reserve.  You will have to go
through the TSM activity logs, but the log of the TSM Library Master
will usually tell you what Library Client it was talking to when trying
to complete the dismount of the drive.
   Once you know the Library Client that failed to dismount it properly,
you can use the "tapeutil" utility (in AIX) or "ntutil" utility (in
Windows) to clear the SCSI Reserve.  In "tapeutil", you first select the
option to Open the device, and enter the device name.  If it opens
correctly, you select the Option to clear the Reserve.  Then you close
the device.  On Windows, you just Open the device, then Close it; there
is no option to clear the Reserve.  Apparently the Close takes care of
it.  
    If you try to Open the device on a Library Client and the Open
fails, it probably means you have selected the wrong Library Client, and
it is being stopped by the SCSI Reserve set by a different Client.  
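
[Editor's sketch] The AIX half of option 2 could be scripted roughly as
follows. This is only a sketch under assumptions: it assumes your Atape
level provides the non-interactive form "tapeutil -f <device> release"
(the thread itself only describes the interactive menu), and the
function names and the device name /dev/rmt3 are hypothetical. A
non-zero exit is treated the way the thread treats a failed Open: the
reserve is probably held by a different Library Client.

```python
import subprocess


def build_release_command(device):
    # ASSUMPTION: command-line mode of tapeutil with a "release"
    # subcommand; verify against your Atape documentation before use.
    return ["tapeutil", "-f", device, "release"]


def clear_reserve(device, runner=subprocess.run):
    """Try to release the SCSI reserve on `device` from this host.

    `runner` is injectable so the logic can be exercised without a
    tape drive; by default it is subprocess.run.
    """
    cmd = build_release_command(device)
    result = runner(cmd, capture_output=True, text=True)
    if result.returncode == 0:
        return "released"
    # Per the thread: a failed open usually means this is the wrong
    # Library Client and another host still holds the reserve.
    return "failed: reserve is probably held by a different Library Client"
```

Run it on the Library Client or Lan-free agent named in the Library
Master's activity log, for example clear_reserve("/dev/rmt3").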

This has solved Reserve conflict problems for us in the past, and it is
far easier than deleting and recreating all the tape drives everywhere.




Best Regards,

John D. Schneider
The Computer Coaching Community, LLC
Office: (314) 635-5424 / Toll Free: (866) 796-9226
Cell: (314) 750-8721



-------- Original Message --------
Subject: Re: [ADSM-L] TS3500 PROBLEM
From: Richard Rhodes <rrhodes AT FIRSTENERGYCORP DOT COM>
Date: Fri, June 04, 2010 7:23 am
To: ADSM-L AT VM.MARIST DOT EDU

We've had some similar type problems, although not that bad!

At times it's as though someone (aix, san, lib/drives) loses track of
things and starts fighting itself. It showed up as mount failures,
reserve conflicts, and lib manager hanging. We never did figure out what
happened or why, but I believe it was AIX/TSM getting confused. The
"solution" that seemed to "work" was to completely dismantle the tape
subsystem and recreate it. By "dismantle" I mean completely blow it
away: tsm drives/paths, aix rmt/smc devices, fscsi/fcs adapters. And,
not just drop them to a defined state - delete them (clean out the ODM).
Then, cfgmgr it all back in, set your atape multi-pathing, define new
drives/paths to TSM. After doing this the problem finally went away.
We've done this enough that I put together some scripts to generate the
TSM commands for drive/path deletion and creation.
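
[Editor's sketch] Rick's scripts are not shown in the thread. A minimal
generator along the same lines might look like the following. It is not
his actual script, and the library name, server name, drive names and
device files in the usage note are hypothetical. It emits DELETE PATH
before DELETE DRIVE and DEFINE DRIVE before DEFINE PATH, since a drive
cannot be deleted while a path to it exists.

```python
def drive_path_commands(library, server, drives):
    """Generate TSM admin commands to drop and recreate drives/paths.

    drives: mapping of TSM drive name -> device special file
            (e.g. an AIX /dev/rmtN name) as seen from `server`.
    Returns (delete_cmds, define_cmds) as two lists of strings.
    """
    delete_cmds = []
    define_cmds = []
    for drive, device in sorted(drives.items()):
        # Paths go first on the way down, last on the way up.
        delete_cmds.append(
            f"delete path {server} {drive} srctype=server "
            f"desttype=drive library={library}"
        )
        delete_cmds.append(f"delete drive {library} {drive}")
        define_cmds.append(f"define drive {library} {drive}")
        define_cmds.append(
            f"define path {server} {drive} srctype=server "
            f"desttype=drive library={library} device={device}"
        )
    return delete_cmds, define_cmds
```

For example, drive_path_commands("LIB3584", "TSM1", {"DRIVE01":
"/dev/rmt1"}) returns the two delete commands and the two define
commands for that one drive; the output can be reviewed and then fed to
dsmadmc as a macro. With shared libraries the path commands would need
to be generated once per Library Client as well.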



Rick



I would agree with this, but even further. We've had several instances
where the "fix" that seemed to work was to completely delete the




From: Nick Laflamme <dplaflamme@GMAIL.COM>
Sent by: "ADSM: Dist Stor Manager" <[email protected]>
To: ADSM-L AT VM.MARIST DOT EDU
Date: 06/04/2010 12:15 AM
Subject: Re: TS3500 PROBLEM
Please respond to: "ADSM: Dist Stor Manager" <[email protected]>






Have you tried to verify that the devices your paths point to are still
the same? Or, better yet, deleted all your paths from all the library
clients and regenerated them from scratch?
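
[Editor's sketch] One way to do the "are the devices still the same"
check is to compare the serial number TSM has recorded for each drive
against the serial the operating system currently reports for the
device file in that drive's path. The sketch below covers only the
comparison; gathering the three inputs (for example from the TSM drive
and path queries and from the OS device listing) is left out, and all
names here are hypothetical.

```python
def find_path_mismatches(expected_serials, path_devices, device_serials):
    """Report drives whose path no longer points at the expected drive.

    expected_serials: TSM drive name -> serial number TSM has recorded
    path_devices:     TSM drive name -> device special file in its path
    device_serials:   device special file -> serial the OS reports now
    Returns a list of (drive, device, reason) tuples, empty if all match.
    """
    problems = []
    for drive, device in sorted(path_devices.items()):
        want = expected_serials.get(drive)
        got = device_serials.get(device)
        if got is None:
            # The device file in the path is not visible on this host.
            problems.append((drive, device, "device not present"))
        elif got != want:
            # Device numbering shifted: the path points at another drive.
            problems.append((drive, device, f"serial {got} != expected {want}"))
    return problems
```

An empty result means every path still points at the drive TSM thinks
it does; anything else is a candidate for deleting and redefining that
path.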

We haven't run with a real 3584 in a while, but whenever I get weird
errors with library managers and shared libraries, that's where I go
first. You probably have, but you haven't said so, only that you've
worked with the SANDISCOVERY settings.

Just a thought,
Nick

On Jun 3, 2010, at 11:03 PM, Fred Johanson wrote:

> About 6 weeks ago, our hardware guy upgraded the code on the TS3500 and
> ATape to the latest levels and made some hardware upgrades (details on
> request). Within days we began to have assorted tape mount problems.
> Support's initial response was to upgrade the TSM level to 5.5.4.2 to
> avoid a known problem with SANDISCOVERY. So we upgrade to the latest
> V5R5 level, but we still see problems. So we turn off SANDISCOVERY, and
> things get quiet; the telltale AIX message "RESERVATION CONFLICT".
> Support asks us to turn on SANDISCOVERY on various Library clients,
> with no effect until last Friday, when the Library Manager goes crazy.
> So turn off SANDISCOVERY on the LM and all goes quiet.
>
> Yesterday the CE upgraded the TS3500 to the very latest, and within
> minutes the Library begins refusing to mount tapes, with total
> disregard to the presence or absence of SANDISCOVERY and potentially
> disastrous effect on LANFREE backups. As I see it, from my TSM seat,
> the common thread here is the AIX message of "Reservation Conflict",
> which points to the hardware changes made.
>
> So after hours of looking at logs and mount messages and traces, which
> has left me groggy, the question is: "Has anyone out there seen any
> difficulty with the software combination of the latest version of
> AIX 5, TSM 5.5, and the TS3500?" Jeremiah, that's me, has been saying
> for weeks that the problem lies somewhere in the combination of hba,
> switch, port, and whatever, but management always blames TSM.
>
> Pardon my incoherence, but I've been reading logs, etc., for the last
> 15 hours.


