ADSM-L

Re: [ADSM-L] Looking for SAN/tape experts assistance

2010-10-04 09:03:27
Subject: Re: [ADSM-L] Looking for SAN/tape experts assistance
From: Zoltan Forray/AC/VCU <zforray AT VCU DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Mon, 4 Oct 2010 09:02:27 -0400
Thanks for the info and examples.

However, I am at a loss to understand why I need this now, especially
when two identical (well, I guess something is different ;--) servers are
acting differently.  I have never had to do this with *ANY* of my other
servers (now 7 in total).  Unless hardware changed, a reboot would not
change the order of the tape drives.
Zoltan Forray
TSM Software & Hardware Administrator
Virginia Commonwealth University
UCC/Office of Technology Services
zforray AT vcu DOT edu - 804-828-4807
Don't be a phishing victim - VCU and other reputable organizations will
never use email to request that you reply with your password, social
security number or confidential personal information. For more details
visit http://infosecurity.vcu.edu/phishing.html



From: "Sergio O. Fuentes" <sfuentes AT UMD DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: 09/27/2010 01:43 PM
Subject: Re: [ADSM-L] Looking for SAN/tape experts assistance
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>



I'm doing this work right now for a big project here.  It's my first
endeavor into Linux.



The lin_tape drivers for TSM 6.2 will require a .rules file in
/etc/udev/rules.d (or wherever your udev stuff lives), mainly because of
the instance owner/group requirements to run 6.2 dsmserv processes. Unless
you can alter your default udev rules for EVERYTHING, you'll need the
.rules file to assign ownership and mode parameters for the tape devices.



Mine, so far, looks like this:



# cat /etc/udev/rules.d/98-lin_tape.rules

KERNEL=="IBMchanger*", SYSFS{primary_path}=="Primary", SYSFS{serial_num}=="0000078150090402", OWNER="tsminst1", MODE="0600", SYMLINK="lin_tape/IBMchanger137B"

KERNEL=="IBMchanger*", SYSFS{primary_path}=="Alternate", SYSFS{serial_num}=="0000078150090402", OWNER="tsminst1", MODE="0600", SYMLINK="lin_tape/IBMchanger138A"

KERNEL=="IBMtape*[0-9]", SYSFS{ww_port_name}=="0x5005076300549127", OWNER="tsminst1", MODE="0600", SYMLINK="lin_tape/IBMtape137"

KERNEL=="IBMtape*[0-9]", SYSFS{ww_port_name}=="0x5005076300549128", OWNER="tsminst1", MODE="0600", SYMLINK="lin_tape/IBMtape138"

KERNEL=="IBMtape*[0-9]", SYSFS{ww_port_name}=="0x5005076300549129", OWNER="tsminst1", MODE="0600", SYMLINK="lin_tape/IBMtape139"

KERNEL=="IBMtape*n", OWNER="tsminst1", MODE="0600"
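
As a quick sanity check of the two IBMtape KERNEL patterns in the rules above, the same glob syntax can be tried in a shell, since udev documents its matching as shell-style patterns (*, ? and []).  This is only a sketch: classify_node is a made-up helper name, and a shell case statement stands in for udev's own matcher, so treat the result as an approximation rather than proof of what udev will do.

```shell
#!/bin/sh
# Sketch: report which of the two IBMtape KERNEL patterns a device name
# would fall under.  classify_node is a hypothetical helper, and the shell's
# case globbing is used as a stand-in for udev's pattern matching.
# The two patterns cannot both match one name (it ends in a digit or in "n"),
# so first-match semantics in case do not hide anything here.
classify_node() {
    case $1 in
        IBMtape*[0-9]) echo "digit-suffix" ;;   # rules that add a SYMLINK
        IBMtape*n)     echo "n-suffix" ;;       # rule that only sets OWNER/MODE
        *)             echo "neither" ;;        # left at udev defaults
    esac
}

for name in IBMtape IBMtape0 IBMtape0n IBMtape1 IBMtape1n IBMchanger0; do
    printf '%-12s %s\n' "$name" "$(classify_node "$name")"
done
```

The bare IBMtape node matches neither pattern, which is consistent with it staying owned by root in the listing further down.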





There are a lot of gotchas with this method that I'm running into.  I'm
not sure if they are kernel bugs or driver issues, but not much of this is
documented anywhere.  Bullet list (so far):



-    If you have alternate pathing or data path failover, lin_taped needs
to be installed and running.  The problem is getting persistent binding to
work with this.  There's a race condition: once modprobe lin_tape is run,
the device files are created according to the udev rules, but the
SYSFS{primary_path} key isn't defined correctly until lin_taped is running,
and lin_taped can't run until lin_tape is loaded.  So by the time lin_taped
is up, udev has already processed the lin_tape rules.

o    My workaround will be to create an init script that runs
lin_taped and then udevtrigger.  It seems to work, but udevtrigger once
crashed the system.

-    Sometimes when lin_tape is loaded, the mode is incorrect for devices.
 The fix is again "udevtrigger".

-    KERNEL=="IBMtape*" doesn't work for renaming, because it also matches
names such as IBMtape1n, so sometimes the symlink points to IBMtape1n
instead of IBMtape1.  That is why I end the pattern with a character
class: "IBMtape*[0-9]"
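
The ordering described in the first bullet (load lin_tape, start lin_taped, then re-run the rules with udevtrigger) could be sketched roughly as below.  The function names, the DRY_RUN switch and the SETTLE_SECONDS pause are all assumptions added for illustration; in particular, the pause is a guess at giving lin_taped time to set primary_path and is not something documented for the driver.  Given the crash mentioned above, trying it with DRY_RUN=1 first seems prudent.

```shell
#!/bin/sh
# Sketch of an init-time sequence for the lin_taped/udev race.
# Hypothetical names: run, start_tape_stack, DRY_RUN, SETTLE_SECONDS.

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

start_tape_stack() {
    # 1. The driver must be loaded before the daemon can start.
    run modprobe lin_tape || return 1
    # 2. Start the daemon that handles alternate paths / failover.
    run lin_taped || return 1
    # 3. Assumption: wait briefly so the sysfs attributes are populated.
    run sleep "${SETTLE_SECONDS:-5}"
    # 4. Ask udev to process the rules again for existing devices.
    run udevtrigger
}

# Usage (nothing is executed unless the function is called):
#   DRY_RUN=1; start_tape_stack     # print the steps only
#   start_tape_stack                # run them for real, as root
```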



Here's the output of the ls commands on /dev from when I believe things
are configured correctly.  Caveat:  I haven't even tested reading/writing to
these devices yet, let alone defining the devices to TSM.



# ls -l /dev/IBMtape*
crw-r--r-- 1 root     root 250, 3071 Sep 27 11:43 /dev/IBMtape
crw------- 1 tsminst1 root 250,    0 Sep 27 11:43 /dev/IBMtape0
crw------- 1 tsminst1 root 250, 1024 Sep 27 11:43 /dev/IBMtape0n
crw------- 1 tsminst1 root 250,    1 Sep 27 11:43 /dev/IBMtape1
crw------- 1 tsminst1 root 250, 1025 Sep 27 11:43 /dev/IBMtape1n
crw------- 1 tsminst1 root 250,    2 Sep 27 11:43 /dev/IBMtape2
crw------- 1 tsminst1 root 250, 1026 Sep 27 11:43 /dev/IBMtape2n



# ls -l /dev/lin_tape
total 0
lrwxrwxrwx 1 root root 14 Sep 27 11:43 IBMchanger137B -> ../IBMchanger0
lrwxrwxrwx 1 root root 14 Sep 27 11:43 IBMchanger138A -> ../IBMchanger1
lrwxrwxrwx 1 root root 11 Sep 27 11:43 IBMtape137 -> ../IBMtape2
lrwxrwxrwx 1 root root 11 Sep 27 11:43 IBMtape138 -> ../IBMtape0
lrwxrwxrwx 1 root root 11 Sep 27 11:43 IBMtape139 -> ../IBMtape1
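
A listing like the one above can also be checked mechanically after each reboot.  The following is a minimal sketch that walks a directory of symlinks and flags any whose target is missing; check_links is a hypothetical helper name, and it only confirms that each link resolves, not that it points at the intended drive.

```shell
#!/bin/sh
# Sketch: report whether every symlink in a directory resolves to an
# existing target.  check_links is a hypothetical helper name.
check_links() {
    dir=$1
    status=0
    for link in "$dir"/*; do
        # Skip anything that is not a symlink (including an unmatched glob).
        [ -L "$link" ] || continue
        if [ -e "$link" ]; then
            echo "ok: $(basename "$link") -> $(readlink "$link")"
        else
            echo "BROKEN: $(basename "$link") -> $(readlink "$link")"
            status=1
        fi
    done
    return $status
}

# Usage, assuming the symlink directory from the rules file:
#   check_links /dev/lin_tape
```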





HTH,



Sergio



-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Zoltan Forray/AC/VCU
Sent: Wednesday, September 22, 2010 2:53 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: [ADSM-L] Looking for SAN/tape experts assistance



I have mentioned in previous posts that we are putting up two new RH Linux
based TSM servers.  These are the first of my Linux servers (5 already
exist) to use EMC SAN storage.

With every new adventure, we get new problems.  This one is driving
everyone crazy, and I hope someone out there can point us in the right
direction.

We have seen posts in ADSM-L that sorta talk about it, but nothing that
explains what is going on with us or how to resolve it.

Both new servers have been configured identically when it comes to the OS
(RedHat Linux 5.5, kernel 2.6.18-194.11.3.el5) and other hardware
supporting software (EMC Powerpath and IBM lin_tape drivers 1.41.1 for
the TS1120/1130 drives).

The problem is this.

Every time we reboot one of the new servers, the values in
/proc/scsi/IBMtape are different in the assignment of /dev numbers to the
drives.  It seems to find the tape drives in a different order each time.
Neither my 5 production servers nor the other new TSM server has this
problem (I have rebooted the 2nd new server 4 times and the /dev/IBMtape?
values stay the same).
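
One way to confirm whether the assignments really change between reboots is to keep a copy of the listing and compare it after each boot.  The sketch below does that for any readable file; compare_snapshot and the saved-copy path in the usage line are hypothetical, and the format of /proc/scsi/IBMtape is not interpreted, only compared byte for byte.

```shell
#!/bin/sh
# Sketch: compare a device listing against the copy saved on a previous run.
# compare_snapshot is a hypothetical helper name.
compare_snapshot() {
    current=$1   # file to watch, e.g. /proc/scsi/IBMtape
    saved=$2     # where the previous copy is kept (hypothetical path)
    if [ ! -f "$saved" ]; then
        cat "$current" > "$saved"
        echo "first snapshot saved"
        return 0
    fi
    if cmp -s "$current" "$saved"; then
        echo "unchanged"
        return 0
    fi
    echo "changed"
    diff "$saved" "$current"      # show what moved since the last run
    cat "$current" > "$saved"     # keep the new state for the next comparison
    return 1
}

# Usage, run once after each reboot:
#   compare_snapshot /proc/scsi/IBMtape /var/tmp/IBMtape.last
```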



When looking through the "fixlist" for lin_tape (usually
engineering-speak), we saw this interesting entry at the 1.37 level:

Removed persistent naming script in favor of new method

Questions come to mind, such as "what naming script?", "what new method?"
and "could this possibly be related to what we are experiencing?"

We have spent all day trying to figure this wrinkle out.  Any suggestions
are greatly appreciated.

Zoltan Forray
TSM Software & Hardware Administrator
Virginia Commonwealth University
UCC/Office of Technology Services
zforray AT vcu DOT edu - 804-828-4807
