Veritas-bu

[Veritas-bu] LTO drives not visible to a SSO host after a tape drive power cycle

2002-10-11 17:34:53
Subject: [Veritas-bu] LTO drives not visible to a SSO host after a tape drive power cycle
From: Brian.Boone AT telus DOT com (Brian Boone-TM)
Date: Fri, 11 Oct 2002 14:34:53 -0700
Not sure if anyone has seen this one, but it has been plaguing us for quite
some time.

In our SAN dedicated to tape backups we have our SSO host HBAs on a Brocade
12000 domain.  This is ISL'd to a Brocade 6400 where we have our FC-AL IBM
3580 LTO drives.

What we have observed is that after a drive is power cycled (during
maintenance, firmware upgrades, or a library reboot) Solaris can no longer
see the LTO drive(s).  If NetBackup touches these devices, the Media
Management processes hang until the device comes back.

Brocade has identified this as their problem. 

After looking at the log files, trace files and the details, Brocade has
determined that there is a defect within the v4.0.0x code stream which
matches a newly discovered known issue.

This issue is related to the way in which frames are handled as a result of
the Brocade Frame Filtering technology.  In essence, the scenario is as
follows:

A host performs ADISC (Address Discovery) to the target before doing a PLOGI
(Port Login). In your scenario, the target (tape drive) is responding to the
ADISC with an unsolicited LOGO (different OXID). This is the correct and
expected behaviour. The 12000 switch using version 4.0.0x is not handling
these LOGOs appropriately. Basically, it is dropping them. This results in
the host retrying with ADISC again. This goes on forever, causing the hosts
to lose the devices.  The fix is to not drop these LOGOs and ensure they are
forwarded to the host correctly.
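
To make the loop above easier to picture, here is a toy Python sketch of
the exchange.  It is not Brocade's code or a real Fibre Channel stack.  The
function names are made up for illustration, the retry cap exists only so
the example terminates (the real host retries forever), and the assumption
that the host follows a received LOGO with a PLOGI is mine, inferred from
the description above.

```python
from typing import List, Optional


def target_reply(request: str) -> str:
    """Power-cycled drive: answers ADISC with LOGO, accepts a fresh PLOGI."""
    return "LOGO" if request == "ADISC" else "ACC"


def switch_forward(frame: str, drops_logo: bool) -> Optional[str]:
    """Toy switch: a defective one silently drops LOGO frames."""
    if frame == "LOGO" and drops_logo:
        return None
    return frame


def host_rediscover(drops_logo: bool, max_attempts: int = 5) -> List[str]:
    """Return the list of steps the host takes while trying to find the drive."""
    log: List[str] = []
    for _ in range(max_attempts):
        log.append("ADISC")
        reply = switch_forward(target_reply("ADISC"), drops_logo)
        if reply == "LOGO":
            # Assumed healthy path: the host sees the LOGO and logs in again.
            log.append("PLOGI")
            if switch_forward(target_reply("PLOGI"), drops_logo) == "ACC":
                log.append("LOGGED_IN")
                return log
        # No reply reached the host, so it simply sends ADISC again.
    log.append("GAVE_UP")
    return log


if __name__ == "__main__":
    print("defective switch:", host_rediscover(drops_logo=True, max_attempts=3))
    print("fixed switch:    ", host_rediscover(drops_logo=False))
```

With drops_logo=True the host never gets past ADISC, which is the
"device disappears" symptom.  With drops_logo=False it reaches the PLOGI
step on the first attempt.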

Brocade has identified the problem and has created a fix for this defect.
The current patch fix is found in version v4.0.2_rc1.6.  This will be rolled
into v4.0.2a which is in the process of being released.  The date for
v4.0.2a is forthcoming.

In the interim, they are creating a v4.0.0x code stream fix.  The method
that we have had success with in restoring visibility is to run
portdisable/portenable on the switch port that the HBA is using.  This
USUALLY brings back all of the fubar'd targets.
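
For anyone scripting the workaround, a minimal Python sketch that only
builds the two switch commands for a given HBA port.  The slot/port values
are hypothetical, the helper name is made up, and the slot/port argument
form is what I would expect on a bladed switch like the 12000, so check it
against your Fabric OS version.  How the commands get to the switch
(telnet, console) is left out.

```python
from typing import List


def port_bounce_commands(slot: int, port: int) -> List[str]:
    """Build the portdisable/portenable pair for one switch port.

    slot and port are illustrative; use the port your HBA is plugged into.
    """
    if slot < 0 or port < 0:
        raise ValueError("slot and port must be non-negative")
    target = f"{slot}/{port}"
    # Disable first, then re-enable, so the HBA logs in to the fabric again.
    return [f"portdisable {target}", f"portenable {target}"]


if __name__ == "__main__":
    for command in port_bounce_commands(slot=1, port=4):
        print(command)
```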

Hope this helps somebody.  
 
Brian Boone
Storage Area Network Specialist
Systems Operations
TELUS Mobility
Brian.Boone AT telus DOT com
 
