Re: [Veritas-bu] scsi_pkt.us_reason = 3

2007-09-14 11:33:01
Subject: Re: [Veritas-bu] scsi_pkt.us_reason = 3
From: Andrew Sydelko <andrew AT sydelko DOT org>
To: veritas-bu AT mailman.eng.auburn DOT edu
Date: Fri, 14 Sep 2007 11:15:13 -0400
On Fri, 14 Sep 2007 17:21:56 +0400
Vladimir Taleiko <taleiko AT jet.msk DOT su> wrote:

> Hi, all
> 
> I have a Solaris 9 server with the latest microcode/patches installed
> NetBackup 6.0 MP4
> L180 (IBM LTO2 - 5AT0)
> Brocade 3800 (v3.2.1b, dual fabric)
> 
> c4                             fc-fabric    connected    configured   unknown
> c4::500104f0006e4931,0         med-changer  connected    configured   unknown
> c4::5005076300615978,0         tape         connected    configured   unknown
> c4::500507630061605e,0         tape         connected    configured   unknown
> c5                             fc-fabric    connected    configured   unknown
> c5::5005076300615b41,0         tape         connected    configured   unknown
> c5::5005076300615d9b,0         tape         connected    configured   unknown
> c8                             fc-fabric    connected    configured   unknown
> c8::500507630061556a,0         tape         connected    configured   unknown
> c8::5005076300615aaa,0         tape         connected    configured   unknown
> c9                             fc-fabric    connected    configured   unknown
> c9::5005076300615937,0         tape         connected    configured   unknown
> c9::5005076300615ba3,0         tape         connected    configured   unknown
> 
> and the following errors:
> 
> Jul 28 03:13:04 ni5nrp2 bptm[26639]: [ID 832037 daemon.error] scsi command failed, may be timeout, scsi_pkt.us_reason = 3
> Jul 28 07:11:25 ni5nrp2 bptm[27999]: [ID 832037 daemon.error] scsi command failed, may be timeout, scsi_pkt.us_reason = 3
> Jul 28 07:13:32 ni5nrp2 bptm[27983]: [ID 832037 daemon.error] scsi command failed, may be timeout, scsi_pkt.us_reason = 3
> Jul 28 08:42:36 ni5nrp2 bptm[15693]: [ID 832037 daemon.error] scsi command failed, may be timeout, scsi_pkt.us_reason = 3
> Jul 29 09:21:34 ni5nrp2 bptm[18406]: [ID 832037 daemon.error] scsi command failed, may be timeout, scsi_pkt.us_reason = 3
> Jul 29 10:51:12 ni5nrp2 bptm[5955]: [ID 832037 daemon.error] scsi command failed, may be timeout, scsi_pkt.us_reason = 3
> Jul 30 00:42:07 ni5nrp2 bptm[27613]: [ID 832037 daemon.error] scsi command failed, may be timeout, scsi_pkt.us_reason = 3
> Jul 30 00:52:03 ni5nrp2 bptm[27614]: [ID 832037 daemon.error] scsi command failed, may be timeout, scsi_pkt.us_reason = 3
> Jul 30 01:11:55 ni5nrp2 bptm[27620]: [ID 832037 daemon.error] scsi command failed, may be timeout, scsi_pkt.us_reason = 3
> 
> The bptm logs don't show any issues.
> 
> I've opened cases with both Sun and Symantec.
> Sun guys said "There is no errors from scsi layer, call Symantec"
> Symantec support said "these errors are being received by the bptm 
> process, and not being generated by it, please involve SUN"
> 
> I'm slightly confused. What is the cause of these errors?

The one time we saw these errors was on an ADIC Scalar 10K. After we pushed both
ADIC and Symantec to work on the problem, ADIC finally came back and said the
library was misconfigured. The option we ended up changing was "eject drive on
move command", which had been set to "software issues eject". Basically, the
drive occasionally didn't receive the eject command from the software, and
there was no retry. Now the library itself issues an eject if the tape is still
in the drive when it is asked to move the tape. It has worked fine ever since.
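
To check whether the errors in the quoted log cluster around particular hours
(for example, when tapes are being unloaded and moved), a rough tally can help.
The sketch below is only an illustration in Python: the regular expression is
an assumption based on the sample lines quoted above, not a documented
NetBackup or syslog format, and the names LINE_RE, SAMPLE and tally_by_hour
are made up for this example.

```python
import re
from collections import Counter

# Pattern inferred from the quoted sample lines (an assumption, not a spec):
# "<Mon> <day> <hh:mm:ss> <host> bptm[<pid>]: ... scsi_pkt.us_reason = <n>"
LINE_RE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<hour>\d{2}):\d{2}:\d{2}\s+"
    r"\S+\s+bptm\[\d+\]:.*scsi_pkt\.us_reason = (?P<reason>\d+)"
)

# A few lines copied from the quoted message, plus one unrelated line
# (hypothetical) to show that non-matching entries are skipped.
SAMPLE = [
    "Jul 28 03:13:04 ni5nrp2 bptm[26639]: [ID 832037 daemon.error] scsi "
    "command failed, may be timeout, scsi_pkt.us_reason = 3",
    "Jul 28 07:11:25 ni5nrp2 bptm[27999]: [ID 832037 daemon.error] scsi "
    "command failed, may be timeout, scsi_pkt.us_reason = 3",
    "Jul 28 07:13:32 ni5nrp2 bptm[27983]: [ID 832037 daemon.error] scsi "
    "command failed, may be timeout, scsi_pkt.us_reason = 3",
    "Jul 28 07:15:00 ni5nrp2 some other message",
]


def tally_by_hour(lines):
    """Count matching bptm SCSI failures per (month, day, hour)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m:
            key = (m.group("month"), int(m.group("day")), int(m.group("hour")))
            counts[key] += 1
    return counts


if __name__ == "__main__":
    for (month, day, hour), n in sorted(tally_by_hour(SAMPLE).items()):
        print(f"{month} {day:2d} {hour:02d}:00  {n} failure(s)")
```

To run it against a real log, pass the lines of the syslog file (opened with
open() and iterated line by line) to tally_by_hour instead of SAMPLE, then
compare the busy hours with the times the library was moving tapes.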

--andy.

--------------
Andrew Sydelko
Engineering Computer Network
Purdue University
_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
