Veritas-bu

[Veritas-bu] HP LTO3 FC drives "locking up" [NC]

2007-03-26 15:51:13
Subject: [Veritas-bu] HP LTO3 FC drives "locking up" [NC]
From: backupicici at gmail.com (Veritas Netbackup)
Date: Tue, 27 Mar 2007 01:21:13 +0530
We already did a firmware upgrade in December; all drives are on 58W now.

Every time a drive stops responding, I need to reboot the entire library.
Whatever the problem, rebooting the library solves it most of the time.

It seems like an internal communication issue within the library.

I have sent support tickets to HP and asked them for long-term
advice, but it seems they are happy resolving the problem with a reboot.

Regards,
BIJU

On 3/24/07, misha.pavlov at sgcib.com <misha.pavlov at sgcib.com> wrote:
>
> BIJU,
>
> > The robtest works and i'm able to move media in and out.
> If I understood you correctly and you are able to load / unload tapes, then
> it is a different scenario.
>
> Otherwise ...
> My drives lock up hard. They go offline, disappear from the fabric and
> do not respond to a hard reset (holding the front panel button for
> more than 15 secs).
>
> Check the version of the f/w you have and make sure to load the latest one
> (I believe it is currently L58S/009.822).
>
> If you have direct support from HP, ask them when the new LTO3 microcode
> is scheduled for release and plan to upgrade. It may address this problem.
>
>
>
> --
> Misha Pavlov
> Société Générale
> desk: (212) 278-6096
> cell: (646) 346-9341
>
> This message uses only 100% recycled electrons.
>
> "Veritas Netbackup" <backupicici at gmail.com>
> 03/09/2007 02:42 PM
> To: Misha PAVLOV/us/socgen at socgen
> cc: veritas-bu at mailman.eng.auburn.edu
> Subject: Re: [Veritas-bu] HP LTO3 FC drives "locking up" [C1]
>
> Hi Misha,
>
> We have an HP ESL library and we face the same problem at least once a
> month.
>
> NBU 5.1 MP4
>
> HP ESL 712 E
>
> The drives go dizzy and fail to respond, while the management software and
> the LCD panel both show thumbs up. The robtest works and I'm able to move
> media in and out. "mt" just hangs and NetBackup is not able to read the
> label. It seems like an internal communication issue in the library.
>
> Replacing the drives also does not help.  The library has to be rebooted
> to solve the problem. In my case even the switch port shows "Online".
>
> The HP support team worldwide has never been able to crack the mystery.
>
> Regards,
> BIJU
>
> On 3/7/07, misha.pavlov at sgcib.com <misha.pavlov at sgcib.com> wrote:
> Folks,
>
> Has anyone noticed a problem with HP LTO3 drives in an SSO configuration,
> running NBU 5.1 on Solaris 8?
>
> Once or twice a week the bptm debug log reports this in the middle of a write:
>
> 22:03:41.746 [19156] <2> send_brm_msg: MEDIA NOT READY
> 22:03:41.746 [19156] <2> write_data: attempting write error recovery, err = 5
> 22:03:41.746 [19156] <2> tape_error_rec: error recovery to block 1485323 requested
> 22:03:41.746 [19156] <2> tape_error_rec: attempting error recovery, delay 3 minutes before next attempt, tries left = 5
> 22:06:41.739 [19156] <2> io_ioctl: command (0)MTWEOF 0 from (overwrite.c.488) on drive index 43
> 22:06:41.739 [19156] <2> io_ioctl: MTWEOF failed during error recovery, I/O error
> 22:08:40.745 [19156] <2> tape_error_rec: cannot read position for error recovery, scsi_determine_bt ret -1 CDB 0x12 SK 0x0 ASC 0x0 ASCQ 0x0
>
> and immediately afterwards in /var/adm/messages I see
>
> Mar  5 22:03:41 vepanyup03 scsi: [ID 107833 kern.warning] WARNING: /pci@1d,700000/SUNW,emlxs@2,1/fp@0,0/st@w500104f0005ddda2,0 (st297):
> Mar  5 22:03:41 vepanyup03  SCSI transport failed: reason 'timeout': giving up
> Mar  5 22:07:40 vepanyup03 bptm[19156]: [ID 832037 daemon.error] scsi command failed, may be timeout, scsi_pkt.us_reason = 6
> Mar  5 22:07:47 vepanyup03 fctl: [ID 517869 kern.warning] WARNING: 3654=>fp(1)::GPN_ID for D_ID=150500 failed
> Mar  5 22:07:47 vepanyup03 fctl: [ID 517869 kern.warning] WARNING: 3655=>fp(1)::N_x Port with D_ID=150500, PWWN=500104f0005ddda2 disappeared from fabric
> Mar  5 22:08:40 vepanyup03 bptm[19156]: [ID 832037 daemon.error] scsi command failed, may be timeout, scsi_pkt.us_reason = 6
> Mar  5 22:09:01 vepanyup03 scsi: [ID 243001 kern.info] /pci@1d,700000/SUNW,emlxs@2,1/fp@0,0 (fcp1):
> Mar  5 22:09:01 vepanyup03  offlining lun=0 (trace=0), target=150500 (trace=2800004)
> Mar  5 22:11:40 vepanyup03 bptm[19156]: [ID 498531 daemon.error] user scsi ioctl() failed, may be timeout, errno = 5, I/O error
>
> The drives "lock up" and become unresponsive.
> The front panel shows no sign of a problem: the light stays solid green.
> Pressing the eject button does not do anything.
> The Brocade 4100 switch port does not "see" the drive anymore and shows
> "In_Sync" instead of "Online".
>
> The only way to bring a drive back is to power-cycle it.
> A minute or two after the power cycle I can eject the tape, and I see
> fctl: [ID 517869 kern.warning] WARNING: 3799=>fp(1)::N_x Port with D_ID=150500, PWWN=500104f0005ddda2 reappeared in fabric
> in /var/adm/messages.
>
> The drives and the library are at the latest f/w revision.
> Sun and STK are clueless, but have still been looking into it for the last week.
>
> --
> Misha Pavlov
> Société Générale
> desk: (212) 278-6096
> cell: (646) 346-9341
>
> This message uses only 100% recycled electrons.
>
> *************************************************************************
> This message and any attachments (the "message") are confidential and
> intended solely for the addressees.
> Any unauthorised use or dissemination is prohibited.
> E-mails are susceptible to alteration.
> Neither SOCIETE GENERALE nor any of its subsidiaries or affiliates
> shall be liable for the message if altered, changed or falsified.
>
> *************************************************************************
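
For anyone trying to track how often this happens, the fctl lines in the /var/adm/messages excerpt quoted above are easy to scan for. What follows is only a minimal sketch, not anything from Sun, HP or Symantec. It assumes Python 3 is available on the host and that the real log file has each entry on a single line (the wrapping in the quoted excerpt comes from the mail client). The function names fabric_events and missing_ports are made up for illustration.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Leading syslog timestamp such as "Mar  5 22:07:47" (may be absent).
STAMP_RE = re.compile(r"^\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}")

# The fctl message format as it appears in the quoted log excerpt.
EVENT_RE = re.compile(
    r"D_ID=([0-9a-fA-F]+),\s*PWWN=([0-9a-fA-F]+)\s+"
    r"(disappeared from|reappeared in)\s+fabric"
)


def fabric_events(lines: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Return (timestamp, pwwn, 'disappeared' | 'reappeared') per matching line."""
    events = []
    for line in lines:
        match = EVENT_RE.search(line)
        if match is None:
            continue
        stamp = STAMP_RE.match(line)
        kind = "disappeared" if match.group(3).startswith("dis") else "reappeared"
        events.append((stamp.group(0) if stamp else "", match.group(2).lower(), kind))
    return events


def missing_ports(events: Iterable[Tuple[str, str, str]]) -> Dict[str, str]:
    """Map each PWWN whose most recent event is 'disappeared' to that timestamp."""
    last = {}
    for stamp, pwwn, kind in events:
        last[pwwn] = (kind, stamp)
    return {pwwn: stamp for pwwn, (kind, stamp) in last.items() if kind == "disappeared"}
```

Feeding it the open file, for example missing_ports(fabric_events(open("/var/adm/messages"))), would list the drive ports that dropped off the fabric and have not come back yet. Counting the "disappeared" events per PWWN over a few weeks may also help show support whether it is always the same drives.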
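
Since "mt" simply hangs when a drive is in this state, a status query wrapped in a timeout can act as an early warning before a backup job runs into the drive. The following is a rough sketch under a few assumptions: Python 3 on the media server, `mt -f <device> status` as the status query, and a site-specific device path (the /dev/rmt/0 used below is only a placeholder). The names probe and probe_drive are hypothetical.

```python
import subprocess
from typing import Sequence


def probe(cmd: Sequence[str], timeout_s: float = 60.0) -> str:
    """Run one status command and return 'ok', 'error' or 'hung'."""
    try:
        proc = subprocess.Popen(
            cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
        )
    except OSError:
        # The command could not be started at all (for example, not installed).
        return "error"
    try:
        returncode = proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Best effort only: a process stuck inside a kernel SCSI ioctl may not
        # die right away, so do not wait for it after the kill.
        proc.kill()
        return "hung"
    return "ok" if returncode == 0 else "error"


def probe_drive(device: str, timeout_s: float = 60.0) -> str:
    """Probe one tape device; the device path is whatever the site uses."""
    return probe(["mt", "-f", device, "status"], timeout_s)
```

A cron job could call probe_drive("/dev/rmt/0") for each drive and send mail on "hung". An "error" result is ambiguous, because mt also exits non-zero for harmless reasons such as an empty drive, so only the timeout case is a strong hint of the lock-up described in this thread. Probing a drive that is in the middle of a backup is probably not a good idea.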