ADSM-L

Re: [ADSM-L] Linux & SAN Device Interruptions

2011-03-30 11:58:05
Subject: Re: [ADSM-L] Linux & SAN Device Interruptions
From: Robert Clark <robert.clark7 AT USBANK DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Wed, 30 Mar 2011 08:41:02 -0700
Hi Nick,

I don't claim to have slain that particular dragon yet, but I have
uncovered some background info on the subject:

In our case we're running a mix of RHEL (RHES?) 4 & 5 (on Intel x86_64) on
the TSM servers and a few storage agents  (and moving to SLES at some
point).

The first item I've found is that the in-kernel (or would that be
in-distro?) Emulex driver defaults to running multiple discovery threads
when scanning for devices at boot time. (16 threads, IIRC.) This pretty
much guarantees that devices won't be discovered in the same order from
one boot to the next. According to the Emulex doc, this can be changed
with a boot time setting to the old behaviour of one discovery thread.
(Discovery would take longer.)

The second item is that CDL/EDL emulating 3584 doesn't look much like a
3584 at the WWNN level. On a 3584, each drive has a unique WWNN that
incorporates both the library serial number and information about where
the drive is in the library. (Control path drives have a second LUN for
communicating with the library.)  On CDL/EDL all virtual tape drives in a
given VTL show up as different LUN numbers on the virtual library's one
WWNN. (Not optimal for the default udev rules on RHEL?)

The third item is that RHEL ships with a udev rule already populated that
could be used to make the CDL/EDL tape drives persistent, by using
something other than boot-time-enumerated mt#/rmt# for the naming
convention. (Name the drive after some part of the WWNN & LUN number?)
Personally I like this approach better than the
try-to-update-it-after-the-fact-on-the-library-manager scripts we use now.
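
To make the naming idea concrete, here is a minimal sketch in Python (not
an actual udev rule). It only illustrates deriving a device name from the
WWNN and LUN so that the name does not depend on discovery order. The
"tape-" prefix, the function names, and the WWNN value are all made up
for illustration; none of them come from RHEL, udev, or the CDL/EDL.

```python
import string


def persistent_tape_name(wwnn, lun):
    """Build a stable name from a WWNN and LUN (hypothetical naming scheme)."""
    # Normalise the WWNN: lower-case it and drop separators and any 0x prefix.
    digits = wwnn.lower().replace(":", "").replace("-", "")
    if digits.startswith("0x"):
        digits = digits[2:]
    # A WWNN is 64 bits, i.e. 16 hex digits.
    if len(digits) != 16 or any(c not in string.hexdigits for c in digits):
        raise ValueError("expected a 64-bit WWNN, got %r" % (wwnn,))
    if lun < 0:
        raise ValueError("LUN must be non-negative, got %r" % (lun,))
    return "tape-%s-lun%d" % (digits, lun)


def name_drives(discovered):
    """Map each (wwnn, lun) pair to its name, whatever order it was found in."""
    return {(wwnn, lun): persistent_tape_name(wwnn, lun)
            for wwnn, lun in discovered}


# Made-up example: three virtual drives behind one virtual library WWNN,
# as described for CDL/EDL above.
drives = [
    ("50:06:01:60:AA:BB:CC:DD", 1),
    ("50:06:01:60:AA:BB:CC:DD", 2),
    ("50:06:01:60:AA:BB:CC:DD", 3),
]

if __name__ == "__main__":
    for key, name in sorted(name_drives(drives).items()):
        print(key, "->", name)
```

Because each name is a pure function of (WWNN, LUN), scanning the same
drives in a different order gives the same names, which is the property
the mt#/rmt# enumeration lacks.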

When IBM allowed copy-on-write to go into Linux, I wish they had also
donated cfgmgr. I don't think any distro would take ODM though, so a
ported cfgmgr would likely be useless.

[RC]



From:   Nick Laflamme <dplaflamme AT GMAIL DOT COM>
To:     ADSM-L AT VM.MARIST DOT EDU
Date:   03/29/2011 04:28 PM
Subject:        [ADSM-L] Linux & SAN Device Interruptions
Sent by:        "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>



How are those of you who run TSM servers or storage agents on Linux on
Intel doing with disruptions with SAN-attached tape devices or the SAN
fabric itself?

In my current shop, we run TSM servers on AIX (and MVS, but that's another
story), but we have storage agents on AIX, Windows, and Red Hat Linux on
Intel. The Linux storage agents are relatively new; they were first
deployed about two years ago. AIX and Windows storage agents have been
there a bit longer, although I can't say how much longer; I, too, have
been there less than two years.

One problem that we've never been able to overcome with our Linux storage
agents has been that if a virtual tape library is rebooted or if the SAN
fabric gets massively unzoned (it happened about a month ago to us, sigh),
the Linux storage agents don't notice the return of the SAN-attached tape
devices until we reboot the Linux server. (We never had the Linux servers
zoned to real 3584s and real LTO tape devices; they've only ever been
zoned up to EMC CLARiiON Disk Libraries and then Data Domains with VTL
cards in them.) This has persisted across updates to LINtape, CDL code
levels, Data Domain code levels, and TSM storage agent levels. Needless to
say, the application teams are rather steamed with us about this.

We have at times had cases open simultaneously with EMC, Red Hat, and IBM,
to no avail.

If you have Linux TSM servers or storage agents that gracefully recover
from disruptions on your tape SAN, can you share with me (and the rest of
the list, if you want) RHEL level, device driver levels, HBA
configuration, and whatever else you think might be relevant?

Thanks,
Nick
