[Veritas-bu] HP LTO3 FC drives "locking up" [C1]

From: pkeating at bank-banque-canada.ca (Paul Keating)
Date: Tue, 6 Mar 2007 16:32:21 -0500
If the switch does not see the drive, then it's not a driver/OS/SSO issue.

The drive, its optics, the switch SFP, or something along those lines is faulty.

What happens if you just unplug the fiber and plug it in again? Do you see the
drive log in to the switch?
Is it only this one drive?
Have you tried moving it to a different switch port?
If none of that helps, get STK to swap it out.

If it's multiple drives (all of them?), it could be the library controller.
The STK libraries intercept the drive to augment the WWN and AL_PAs, so the
fault could be in the library and not in the drive.
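
The elimination steps above can be written down as a small decision function.
This is only a rough sketch of the reasoning, not a diagnostic tool: the
Observation fields and the triage() name are made up for illustration, and two
branches go beyond what is stated above (that a drive still visible to the
switch keeps the host side in play, and that a drive which works on another
port points at the original port or SFP).

```python
from dataclasses import dataclass


@dataclass
class Observation:
    """Hypothetical summary of what was seen at the switch."""
    switch_sees_drive: bool      # does the drive log in to the switch?
    affected_drives: int         # how many drives show the symptom
    works_on_other_port: bool    # does moving to another switch port help?


def triage(obs: Observation) -> str:
    """Return the most likely suspect, following the elimination order above."""
    if obs.switch_sees_drive:
        # Assumption: the thread only states the converse of this case.
        return "host side: driver/OS/SSO still in play"
    if obs.affected_drives > 1:
        return "library controller"
    if obs.works_on_other_port:
        # Assumption: inferred from the port-swap test, not stated outright.
        return "switch port or SFP"
    return "drive or its optics: ask STK to swap it"


if __name__ == "__main__":
    # The case described in the thread: one drive, invisible to the switch.
    print(triage(Observation(False, 1, False)))
```

The order of the checks matters: the switch-side check comes first because it
rules out the whole host stack in one step.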

I've seen weird things happen with libraries. I've had a drive bite it when a
tech put an LTO2 code load tape into an LTO3 drive.

Paul
-- 


> -----Original Message-----
> From: veritas-bu-bounces at mailman.eng.auburn.edu 
> [mailto:veritas-bu-bounces at mailman.eng.auburn.edu] On Behalf 
> Of misha.pavlov at sgcib.com
> Sent: March 6, 2007 4:10 PM
> To: veritas-bu at mailman.eng.auburn.edu
> Subject: [Veritas-bu] HP LTO3 FC drives "locking up" [C1]
> 
> 
> Folks,
> 
> Has anyone noticed a problem with HP LTO3 drives in an SSO configuration,
> running NBU 5.1 on Solaris 8?
> 
> Once or twice a week, the bptm debug logs report the following in the middle
> of a write:
> 
> 22:03:41.746 [19156] <2> send_brm_msg: MEDIA NOT READY
> 22:03:41.746 [19156] <2> write_data: attempting write error recovery, err = 5
> 22:03:41.746 [19156] <2> tape_error_rec: error recovery to block 1485323 requested
> 22:03:41.746 [19156] <2> tape_error_rec: attempting error recovery, delay 3 minutes before next attempt, tries left = 5
> 22:06:41.739 [19156] <2> io_ioctl: command (0)MTWEOF 0 from (overwrite.c.488) on drive index 43
> 22:06:41.739 [19156] <2> io_ioctl: MTWEOF failed during error recovery, I/O error
> 22:08:40.745 [19156] <2> tape_error_rec: cannot read position for error recovery, scsi_determine_bt ret -1 CDB 0x12 SK 0x0 ASC 0x0 ASCQ 0x0
> 
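
To line these bptm entries up against /var/adm/messages, it can help to split
each one into fields. The following is a minimal sketch, assuming every entry
sits on one line in the form "HH:MM:SS.mmm [pid] <level> function: text"; the
field names and the parse_bptm() helper are illustrative, not part of
NetBackup.

```python
import re
from datetime import datetime

# Assumed layout: time, [pid], <level>, function name, then free text.
BPTM_LINE = re.compile(r"^(\d\d:\d\d:\d\d\.\d{3}) \[(\d+)\] <(\d+)> (\w+): (.*)$")


def parse_bptm(line):
    """Split one bptm debug line into fields; return None if it does not match."""
    m = BPTM_LINE.match(line.strip())
    if m is None:
        return None
    stamp, pid, level, func, text = m.groups()
    return {
        "time": datetime.strptime(stamp, "%H:%M:%S.%f"),
        "pid": int(pid),
        "level": int(level),
        "func": func,
        "text": text,
    }


def seconds_between(first, second):
    """Elapsed seconds between two parsed entries (no midnight rollover handling)."""
    return (second["time"] - first["time"]).total_seconds()


if __name__ == "__main__":
    requested = parse_bptm(
        "22:03:41.746 [19156] <2> tape_error_rec: attempting error recovery, "
        "delay 3 minutes before next attempt, tries left = 5"
    )
    retried = parse_bptm(
        "22:06:41.739 [19156] <2> io_ioctl: MTWEOF failed during error recovery, I/O error"
    )
    print(seconds_between(requested, retried))
```

On the two entries shown, the gap comes out just under 180 seconds, which
matches the "delay 3 minutes" in the log. The pid field is the same 19156
that appears as bptm[19156] in the syslog entries.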
> and immediately after, in /var/adm/messages, I see:
> 
> Mar  5 22:03:41 vepanyup03 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/SUNW,emlxs@2,1/fp@0,0/st@w500104f0005ddda2,0 (st297):
> Mar  5 22:03:41 vepanyup03  SCSI transport failed: reason 'timeout': giving up
> Mar  5 22:07:40 vepanyup03 bptm[19156]: [ID 832037 daemon.error] scsi command failed, may be timeout, scsi_pkt.us_reason = 6
> Mar  5 22:07:47 vepanyup03 fctl: [ID 517869 kern.warning] WARNING: 3654=>fp(1)::GPN_ID for D_ID=150500 failed
> Mar  5 22:07:47 vepanyup03 fctl: [ID 517869 kern.warning] WARNING: 3655=>fp(1)::N_x Port with D_ID=150500, PWWN=500104f0005ddda2 disappeared from fabric
> Mar  5 22:08:40 vepanyup03 bptm[19156]: [ID 832037 daemon.error] scsi command failed, may be timeout, scsi_pkt.us_reason = 6
> Mar  5 22:09:01 vepanyup03 scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,emlxs@2,1/fp@0,0 (fcp1):
> Mar  5 22:09:01 vepanyup03  offlining lun=0 (trace=0), target=150500 (trace=2800004)
> Mar  5 22:11:40 vepanyup03 bptm[19156]: [ID 498531 daemon.error] user scsi ioctl() failed, may be timeout, errno = 5, I/O error
> 
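
With several drives in the library, a quick way to tell whether one drive or
many are dropping out is to pull the "disappeared from fabric" and
"reappeared in fabric" events out of /var/adm/messages and group them by
PWWN. A rough sketch, assuming each syslog entry is on a single line as it
would be in the file itself; fabric_events() and its field names are made up
for illustration.

```python
import re

# Matches the fctl messages quoted in this thread.
EVENT = re.compile(
    r"N_x Port with D_ID=(\w+), PWWN=(\w+) (disappeared from|reappeared in) fabric"
)
# Syslog-style prefix such as "Mar  5 22:07:47"; optional, since a pasted
# fragment may not carry it.
STAMP = re.compile(r"[A-Z][a-z]{2}\s+\d+\s+\d\d:\d\d:\d\d")


def fabric_events(lines):
    """Return one dict per port disappear/reappear message found in lines."""
    events = []
    for line in lines:
        m = EVENT.search(line)
        if m is None:
            continue
        stamp = STAMP.match(line)
        events.append({
            "time": stamp.group(0) if stamp else None,
            "d_id": m.group(1),
            "pwwn": m.group(2),
            "state": "gone" if m.group(3).startswith("disappeared") else "back",
        })
    return events


if __name__ == "__main__":
    sample = [
        "Mar  5 22:07:47 vepanyup03 fctl: [ID 517869 kern.warning] WARNING: "
        "3655=>fp(1)::N_x Port with D_ID=150500, PWWN=500104f0005ddda2 "
        "disappeared from fabric",
        "fctl: [ID 517869 kern.warning] WARNING: 3799=>fp(1)::N_x Port with "
        "D_ID=150500, PWWN=500104f0005ddda2 reappeared in fabric",
    ]
    for event in fabric_events(sample):
        print(event)
```

If every event carries the same PWWN, the problem is confined to one drive;
events spread across many PWWNs would point more toward the library or the
fabric.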
> The drives "lock up" and become unresponsive.
> The front panel shows no sign of a problem: the light stays solid green.
> Pressing the eject button does nothing.
> The Brocade 4100 switch port no longer "sees" the drive and shows
> "In_Sync" instead of "Online".
> 
> The only way to bring a drive back is to power-cycle it.
> A minute or two after the power cycle I can eject the tape, and I see
> fctl: [ID 517869 kern.warning] WARNING: 3799=>fp(1)::N_x Port with D_ID=150500, PWWN=500104f0005ddda2 reappeared in fabric
> in /var/adm/messages.
> 
> The drives and the library are at the latest f/w revision.
> Sun and STK are clueless, but have been looking into it for the past week.
> 
> --
> Misha Pavlov
> Société Générale
> desk: (212) 278-6096
> cell: (646) 346-9341
> 
> This message uses only 100% recycled electrons.
> 