Networker

Re: [Networker] SCSI bus resets????!!!

2005-12-23 11:21:54
Subject: Re: [Networker] SCSI bus resets????!!!
From: Carl Bergmann <carl.bergmann AT RISOE DOT DK>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Fri, 23 Dec 2005 17:06:17 +0100
Hello George,
I've struggled with SCSI bus resets for months on my PowerEdge 2850 with
an Adaptec SCSI adapter. Open cases with Red Hat and Dell Gold support
did not give me any solution. After upgrading my Red Hat ES v.3 Linux to
2.4.21-32.0.1.ELsmp, I downloaded the Adaptec source driver vers. 6.3.9
and compiled it into the kernel, and all my problems went away. I've now
been running for half a year and have not seen one SCSI bus reset since.
Hope this gives you a hint.
Carl Bergmann
Risoe Nat. Lab, DK



-----Original Message-----
From: Legato NetWorker discussion [mailto:NETWORKER AT LISTSERV.TEMPLE DOT EDU]
On Behalf Of George Sinclair
Sent: Friday, December 23, 2005 4:38 PM
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Subject: [Networker] SCSI bus resets????!!!

Hi,

First, if someone can suggest a better site that I could get help on 
this issue from, please let me know. LSI, Storagetek, Legato, and 
Quantum have been unable to resolve this thus far. Everyone seems to 
think the problem is not on their end.

Our Linux Red Hat storage node host (Dell PowerEdge 6600) generates SCSI
bus reset messages in the system log almost daily (at different times)
for the library picker, or more specifically for the SCSI ID being used
by the robot. Is it possible that too many groups running could cause
this, or maybe something isn't configured properly in Legato? From the
looks of the messages, this just doesn't appear to be a Legato NetWorker
problem.

I was labeling some tapes last night, nothing else going on, and
suddenly it just hangs. Sure enough, another SCSI bus reset reported for
the picker. We've had Storagetek replace the robotic controller card in
the library and the internal SCSI cable, and it's still occurring. We're
using LVD tape libraries (STL L80 w/ 4 LTO-1 drives and P1000 SDLT w/ 2
SDLT-1 drives).

We tried replacing cards, SCSI cables, moving cards to different PCI
slots in the hosts, different servers, different drivers, etc. Still,
same problem. We were having a similar problem with Adaptec 39160s so we
switched to LSI.

We're using LSI-22320-R dual channel 320 cards. I've seen this occur
sometimes also on our Quantum library. Quantum says that RAID controller
cards can cause problems for pickers, but LSI says this card should work
fine. It has RAID 0 and 1 (striping and mirroring) capabilities, but it
should otherwise function just like any LVD/SE card. We've not
configured these cards to do any RAID stuff. There is no option,
however, to specifically turn it off. There are various speeds that can
be set on all the devices on each channel. The default (highest) is 320,
but you can go down to lower speeds, e.g. 160, 80, 40, 20, all the way
down to 'ASYNC'. I've tried different speeds on the picker, down to 80
so far. I would think the devices would auto-negotiate, anyway. The
picker is on its own channel at ID 0. The drives are on their own
channels at IDs 2, 3, 4 and 5. Should the picker be at a lower speed,
maybe even ASYNC? Maybe I should try changing the picker to a higher ID
like 6 or 8?

Dec 23 15:01:31 santana kernel: scsi : aborting command due to timeout : pid 78695, scsi2, channel 0, id 0, lun 0 Move medium/play audio(12) 00 00 00 01 f5 04 1c 00 00 00 00
Dec 23 15:01:31 snode1 kernel: mptscsih: ioc0: id=0 OldAbort: scheduling ABORT SCSI IO (sc=f6111200)
Dec 23 15:01:31 snode1 kernel: mptbase: Initiating ioc0 recovery
Dec 23 15:01:32 snode1 kernel: SCSI host 2 abort (pid 78695) timed out - resetting
Dec 23 15:01:32 snode1 kernel: SCSI bus is being reset for host 2 channel 0.


Thanks.

George
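
[Editorial sketch, not part of the original message] The kernel log
excerpt above can be read a little more closely. The timed-out command
is a 12-byte MOVE MEDIUM sent to the changer, and the bytes after the
command name look like the rest of the CDB. The Python below is a
minimal sketch under two assumptions: that the kernel printed CDB bytes
1 to 11 with the opcode (0xA5) left out, and that the standard MOVE
MEDIUM layout applies (transport element in bytes 2-3, source in bytes
4-5, destination in bytes 6-7, invert flag in bit 0 of byte 10). The
function names are hypothetical. Which slot or drive each element
address refers to depends on the library's own element map.

```python
import re

# Bytes printed after "Move medium/play audio(12)" in the first log line.
# Assumption: these are CDB bytes 1..11; the opcode 0xA5 was not printed.
LOGGED_BYTES = "00 00 00 01 f5 04 1c 00 00 00 00"


def decode_move_medium(logged):
    """Decode a 12-byte MOVE MEDIUM CDB whose opcode byte was omitted."""
    cdb = [0xA5] + [int(b, 16) for b in logged.split()]
    if len(cdb) != 12:
        raise ValueError("expected 11 bytes after the opcode")
    return {
        "transport_element": (cdb[2] << 8) | cdb[3],
        "source_element": (cdb[4] << 8) | cdb[5],
        "destination_element": (cdb[6] << 8) | cdb[7],
        "invert": bool(cdb[10] & 0x01),
    }


# Matches the final message of the excerpt (one log entry per line).
RESET_RE = re.compile(r"SCSI bus is being reset for host (\d+)\s+channel (\d+)")


def find_bus_resets(lines):
    """Return (host, channel) pairs for every bus-reset message found."""
    hits = []
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            hits.append((int(match.group(1)), int(match.group(2))))
    return hits


if __name__ == "__main__":
    print(decode_move_medium(LOGGED_BYTES))
    sample = [
        "Dec 23 15:01:32 snode1 kernel: SCSI bus is being reset for host 2 channel 0.",
    ]
    print(find_bus_resets(sample))
```

Under those assumptions the command asks the picker to move a cartridge
from element address 0x01f5 to element address 0x041c, and the reset
that follows is on host 2, channel 0, the same bus and ID the picker is
on. Running find_bus_resets over a full system log would give a count
of resets per channel to compare against backup and labeling activity.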

To sign off this list, send email to listserv AT listserv.temple DOT edu and
type "signoff networker" in the body of the email. Please write to
networker-request AT listserv.temple DOT edu if you have any problems
with this list. You can access the archives at
http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER
