[Veritas-bu] Re: SCSI reset errors and downed drives

2001-09-05 13:30:15
Subject: [Veritas-bu] Re: SCSI reset errors and downed drives
From: anthony.guzzi AT storability DOT com (anthony.guzzi AT storability DOT com)
Date: Wed, 5 Sep 2001 13:30:15 -0400
I've got one phrase for you:     persistent binding

I'm wondering if your bridges are being "discovered"/recognized in a
different order than when NBU was installed.  If you are using a fabric,
then remember that for most OSes, unless you specify otherwise, the first
fibre device found by the OS will be assigned target 0 off the HBA, the
second one will get target 1, etc.  Keep in mind that there's no guarantee
the devices will be found in the same order each time.  And should a fibre
switch reboot, you run the risk (though very slim) that the targets may
change.  But with persistent binding, you'll be binding each fibre
device's world-wide name (WWN) to a specific SCSI target off the HBA.  That
way, no matter what order the system sees the devices in, they'll always
get the same SCSI target.
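
To make the difference concrete, here is a small Python sketch of the two
assignment schemes.  It is only a model of the idea, not how any HBA driver
is actually configured; the WWNs, the function names, and the binding table
are all made up for illustration.

```python
# Model only: real persistent binding is set in the HBA driver's own
# configuration, not in Python.  All WWNs below are hypothetical.

def assign_by_discovery(discovered_wwns):
    """Default behaviour: targets are handed out in discovery order."""
    return {wwn: target for target, wwn in enumerate(discovered_wwns)}

def assign_with_binding(discovered_wwns, bindings):
    """Persistent binding: each known WWN always gets its bound target."""
    return {wwn: bindings[wwn] for wwn in discovered_wwns if wwn in bindings}

# The same three devices, seen in a different order on two boots.
boot1 = ["50:00:00:00:00:00:00:0a",
         "50:00:00:00:00:00:00:0b",
         "50:00:00:00:00:00:00:0c"]
boot2 = [boot1[1], boot1[0], boot1[2]]

# Hypothetical binding table: WWN -> fixed SCSI target.
bindings = {boot1[0]: 0, boot1[1]: 1, boot1[2]: 2}

unbound_changed = assign_by_discovery(boot1) != assign_by_discovery(boot2)
bound_stable = (assign_with_binding(boot1, bindings)
                == assign_with_binding(boot2, bindings))
print("targets change without binding:", unbound_changed)
print("targets stable with binding:   ", bound_stable)
```

Without the binding table the first two devices swap targets between the
two boots; with it, each WWN keeps its target regardless of order.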

I recently had to work on an L-700 with 12 fibre-native STK 9840 tape
drives.  Each drive was connected directly to a Brocade switch, as was the
master server.  Every time the server rebooted, we would get downed
drives.  This was a result of some of the tape drives being
'discovered'/recognized by the system in a different order than when NBU
was set up.  As such, they were being given different SCSI target numbers.
This 're-arrangement' of the tape drives really messed up NBU.  The end
result was that the master server was instructing the robot to put a tape
in one drive and then accessing (via the SCSI target) another drive.  As
would be expected, it failed to see the tape there and so downed the drive.
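
If you keep a record of which WWN sat at which target when NBU was
configured, spotting this kind of shuffle after a reboot is just a
comparison of two maps.  The sketch below assumes you have gathered both
maps yourself (for example from your HBA or switch tools); the function name
and the sample data are hypothetical, and nothing here talks to NBU or the
hardware.

```python
# Sketch: compare the WWN -> SCSI target map recorded at NBU setup time
# with the map seen after a reboot.  Sample data is invented.

def find_moved_drives(configured, current):
    """Return {wwn: (configured_target, current_target)} for every drive
    whose target changed; current_target is None if the drive is missing."""
    moved = {}
    for wwn, old_target in configured.items():
        new_target = current.get(wwn)
        if new_target != old_target:
            moved[wwn] = (old_target, new_target)
    return moved

configured = {"drive-wwn-A": 0, "drive-wwn-B": 1, "drive-wwn-C": 2}
after_reboot = {"drive-wwn-A": 0, "drive-wwn-B": 2, "drive-wwn-C": 1}

moved = find_moved_drives(configured, after_reboot)
for wwn, (old_target, new_target) in sorted(moved.items()):
    print(f"{wwn}: was target {old_target}, now target {new_target}")
```

Any drive that shows up in the result is one where the robot and the SCSI
path no longer agree on which physical drive is which.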

Check your fibre HBA vendor's documentation for instructions on how to
enable persistent binding for the HBA's driver under HP-UX (the procedure
differs by vendor, driver, and OS).

-- Tony Guzzi
Sr. Solutions Engineer, AssuredRestore team
Storability, Inc.






To: veritas-bu AT mailman.eng.auburn DOT edu
Date: Wed, 05 Sep 2001 08:36:07 -0500
From: "dayal singh" <dayalsd AT lycos DOT com>
Reply-To: dayalsd AT lycos DOT com
Organization: Lycos Mail  (http://mail.lycos.com:80)
Subject: [Veritas-bu] SCSI reset errors and downed drives

NBU GURUs,
We are continuously experiencing SCSI reset errors, and some of the drives
are being downed.  Most of the time it happens on the same specific
drives; sometimes it affects other drives as well.  I am running NBU
DataCenter 3.4 on HP-UX 11.0 on an N-class machine, and I have twenty
Quantum DLT8000 drives in a SureStore L700 tape library, connected through
fiber over HP fiber-SCSI bridges.  The bridges are at firmware 4040.

Has anyone seen these errors?  Are there any fixes (patches, etc.)?

Your response is greatly appreciated.

TIA

Dayal




