ADSM-L

Re: [ADSM-L] nightmares with a STK SL500 tape library

2011-04-05 20:42:46
Subject: Re: [ADSM-L] nightmares with a STK SL500 tape library
From: David Longo <David.Longo AT HEALTH-FIRST DOT ORG>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Tue, 5 Apr 2011 20:37:39 -0400
I've never worked with this library, but have some questions/ideas.

1.  Are both libraries "identical" and do they have the same firmware for
library and drives?  How about configuration on the library?

2.  They are both connected to the same TSM server you say.  Do they use the
same HBAs on the TSM server or different ones?  Do all HBAs have up-to-date
firmware?

3.  What about SAN switches?  Do both libraries use the same ones or
different ones?  Also, when the library goes offline, can you get the SAN
guys to show or give you the switch logs and see if there was any glitch
there?

4.  You say it happens about once a month.  So I don't "assume", is this
library in a Data Center and on UPS power?  If so, then one thing that comes
to mind is that Generator tests happen once a month.  Any correlation with
any Data Center wide maintenance?

5.  Why are you concentrating on the Robot?  Did you get errors somewhere
that point to that?

6.  Do you use encryption on either or both libraries?  Do they use the same
"Key Manager"?

7.  I wouldn't just replace this library because of problems.  You may have
the same problems with a new library, of the same or a different model, if
the cause is external!
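
On question 4, one rough way to check for a correlation is to line up the
times the library went offline against the dates of generator tests or other
Data Center maintenance. Below is a minimal Python sketch of that idea. The
timestamps are made up for illustration, and the function name and the
two-hour window are assumptions of mine, not anything that comes from TSM or
the SL500.

```python
from datetime import datetime, timedelta


def events_near_outages(outages, events, window=timedelta(hours=2)):
    """For each outage time, list the events within +/- window of it."""
    return {
        outage: [e for e in events if abs(e - outage) <= window]
        for outage in outages
    }


# Hypothetical timestamps, for illustration only.
outages = [
    datetime(2011, 1, 12, 3, 15),
    datetime(2011, 2, 9, 3, 20),
    datetime(2011, 3, 20, 14, 0),
]
generator_tests = [
    datetime(2011, 1, 12, 3, 0),
    datetime(2011, 2, 9, 3, 0),
    datetime(2011, 3, 9, 3, 0),
]

matches = events_near_outages(outages, generator_tests)
for outage, nearby in matches.items():
    nearby_text = ", ".join(e.isoformat() for e in nearby) or "no nearby event"
    print(outage.isoformat(), "->", nearby_text)
```

If most outages land close to a scheduled event, that points to power or
other external causes rather than the robot. The same comparison could be
run against SAN switch log timestamps from question 3.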

Just a few..
David Longo

>>> "Dury, John C." <JDury AT DUQLIGHT DOT COM> 4/5/2011 6:10 PM >>>
We purchased an STK SL500 tape library with 4 LTO4 drives in it a few years ago 
and we have had nothing but problems with it, almost from the beginning. It is 
fully loaded with LTO4 cartridges (about 160) and seems to randomly just crash 
and take all of the drives offline to TSM. We also have a second SL500 that is 
at a remote site and connected to the same TSM server, and it has no problems 
at all. The remote SL500 has copies (backup stg pool) of the local SL500. We've 
gone round and round with STK/Oracle support and they have actually come onsite 
and physically replaced the entire robot and all of its parts, several times 
and they can never find a reason as to what is causing it to go offline. Keep 
in mind this has been happening about once a month or so for over a year.

My question to all of you is not so much what could be wrong (although if you 
have ideas, that would be great also), but, we are considering a new robot and 
are hoping to be able to use or reuse our existing LTO4 tapes. Right now it has 
about 80 scratches so if we were to go to a second library, I should be able to 
have both defined to TSM and move the data from one to the other after putting 
some of the scratches in the new library and labeling/initializing them until 
all data is in the new library and then I can light the old one on fire (j/k) !
Like most IT departments we are severely budget constrained so we would like to 
reuse the tape drives and the tape cartridges and only purchase a robot that 
can handle 160 slots or so. Suggestions if this is even an option or which 
robots and/or models to look at? Remember, very little budget for this if I 
could even get it approved at all but we really don't know what else to do with 
the bad SL500 at this point and we have a project coming up that is going to 
increase the amount and flow of data to our TSM system significantly within the 
next few years.
Help!
John


