IBM 3584 + LTO4 / LTO5 time out - Win2008R2

Paradox667

Hi all,

We are experiencing LTO4 timeouts in our Windows 2008 R2 server environment using IBM driver IBMTAPEx64_w08_6231 (and 6226). The timeouts cause Windows to drop the HBA port, or TSM to drop connectivity to the library.

We have been through a range of tests and can pinpoint this to whenever we zone in the LTO4 drives.
If we use only LTO5 there is no timeout or problem.

The symptoms are that TSMDLST is very slow to poll (incrementally slower as the number of drives increases) and Windows boot time increases to upwards of 30 minutes; eventually the HBA drops the port and logs this to the event log.
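To put numbers on the slow polling as drives are zoned in, something like the following rough sketch could be used to time a command. The helper name `time_command` and the 30-minute default timeout are made up for illustration, and the commented example assumes `tsmdlst` can be run from the current PATH with no arguments, which may not match your install.

```python
import subprocess
import sys
import time


def time_command(cmd, timeout_s=1800):
    """Run cmd (a list of arguments) and return (elapsed_seconds, return_code).

    return_code is None if the command did not finish within timeout_s.
    """
    start = time.perf_counter()
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
        code = result.returncode
    except subprocess.TimeoutExpired:
        code = None
    return time.perf_counter() - start, code


# Self-check using the Python interpreter itself as the command.
elapsed, code = time_command([sys.executable, "-c", "pass"])
print(f"self-check took {elapsed:.2f}s (exit code {code})")

# Hypothetical use against the device listing (assumes tsmdlst is on PATH):
#   elapsed, code = time_command(["tsmdlst"])
#   print(f"tsmdlst took {elapsed:.1f}s (exit code {code})")
```

Running it once per zoning change (LTO5 only, then with each batch of LTO4 drives added) would give a simple table of poll time against drive count to hand to support.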

To troubleshoot further we attempted to roll back to the previous driver, IBMTAPEx64_6217, which does not install correctly on Windows 2008 R2.

We currently have a hardware (and software) case open for investigation, but I am curious whether anyone else has experienced this issue, before or recently, and has any suggestions.

As we are in the process of migrating from TSM 5.5 to 6.3 on new hardware with new LTO5 drives, it's not an ideal situation to be faced with (due to the size and complexity of our environment, we need to maintain the existing LTO4s until the migration is complete).

The hardware we are running on is as follows:
IBM 3584 firmware: B570
20 x LTO5 drive firmware: C7R2
28 x LTO4 drive firmware: A23D

Attached to HP ProLiant DL380 Gen8s with 16Gb Emulex SN1000E dual-port cards (FW: 1.0.11.110, STORPORT driver: 7.2.70.019).

A rough environment snapshot:
1. 1 x Library master + configuration master
2. 5 x library clients (DB ranging from 10GB to 445GB; yes you read that right, yes that's scary, yes we want away ASAP)
3. Fibre connected to IBM 3584 + LTO4 drives only
4. Combination of Windows 2003 / 2008 R2 server hosts
5. Approx 500 - 600 nodes

Any help much appreciated - even if it's just a suggestion we haven't thought of :)
 
How do you connect the LTO4 drives back to the server? Is this through just one SAN switch wherein 2 tape drives = 1 HBA port?

As a test, have you tried direct cabling bypassing the SAN switch?
 
Moon-buddy, we connect via two SAN switches (an odd/even fabric), and you are correct: 2 drives per HBA, for a total of 20 drives presented to a single server.

We haven't tried a direct connection at this point, as our existing environment uses the same SAN switches flawlessly, but only with LTO4 drives and a much older driver, 6207. IBM support have informed us that this driver likely will not support the LTO5 correctly, which is why we looked at a newer driver on the new hardware.

I'll put forward the idea of doing a direct connect to one of the LTO4 drives and seeing if the problem still exists (expectation is that it would - but as another troubleshooting step it can't hurt anything but time to run the fibre).
 
A further update on this: IBM advised upgrading the LTO4 firmware, which has now been done.
The timeouts on the driver still occur, but TSMDLST is responsive.

I'm hesitant to say that the firmware has resolved it, as the HBA timeout is still happening and boot time is still approx 20 minutes.
(Without the IBM driver it is approx 5.)

Any other suggestions at this point? I'm waiting on IBM support and have a major project to move forward.
 
Do you use QLogic HBAs?

If I remember right, QLogic HBAs have a utility that can adjust timings on the HBAs. Check it out.
 
Just wanted to let everyone know that we have resolved the issue.
IBM found an interoperability issue between the LTO4 firmware (both the old version we were using and the newest) and Windows driver 6231.

We have rolled to microcode B710 for the LTO4 drives and the issue has been resolved.

Many thanks for the assistance :)
 