Networker

Re: [Networker] SCSI problems -- How many drives to a bus?

2004-01-09 15:01:27
From: George Sinclair <George.Sinclair AT NOAA DOT GOV>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Fri, 9 Jan 2004 15:02:14 -0500
That's exactly what's happening. At some point the picker is no longer
seen by the software. It seems random; we might go several days before
seeing it again. We get these "aic7xxx_abort returns 0x2002" messages.
They don't always cause a failure, but when a failure does occur, you
can bet that one of these SCSI resets shows up in /var/log/messages at
or shortly before that time.
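
To line these messages up against the times the picker disappears, something like the following could pull them out of the log. This is only a rough sketch: it assumes the traditional syslog layout ("Mon DD HH:MM:SS host ..."), and the function name find_abort_lines is made up for illustration. Only the "aic7xxx_abort returns" text comes from the messages described above.

```python
import re
import sys

# Assumed traditional syslog layout: "Jan  9 14:55:02 host kernel: <message>".
# Only the "aic7xxx_abort returns 0x..." part is taken from the thread.
ABORT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s.*"
    r"aic7xxx_abort returns (?P<code>0x[0-9a-fA-F]+)"
)


def find_abort_lines(lines):
    """Return (timestamp, return_code) pairs for every abort message found."""
    hits = []
    for line in lines:
        match = ABORT_RE.search(line)
        if match:
            hits.append((match.group("stamp"), match.group("code")))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_aborts.py /var/log/messages
    with open(sys.argv[1], errors="replace") as log:
        for stamp, code in find_abort_lines(log):
            print(stamp, code)
```

The timestamps it prints can then be compared by hand with the times the "read open error" shows up in nwadmin.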

How do we determine if the arm has the highest priority? How do we give
it the highest priority?
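
For reference, on a parallel SCSI bus the arbitration priority follows the SCSI ID: 7 is highest, then 6 down to 0, and on a wide (16-bit) bus 15 down to 8 come after that. The host adapter normally sits at ID 7, so the highest priority a target can get is ID 6. A small sketch of that ordering, with made-up device names and IDs (not the ones on this host):

```python
# Arbitration priority on a parallel SCSI bus, highest first:
# IDs 7 down to 0, then (wide buses only) 15 down to 8.
PRIORITY_ORDER = [7, 6, 5, 4, 3, 2, 1, 0, 15, 14, 13, 12, 11, 10, 9, 8]


def priority_rank(scsi_id):
    """Return 0 for the highest-priority ID, 15 for the lowest."""
    return PRIORITY_ORDER.index(scsi_id)


def sort_by_priority(devices):
    """devices maps a name to its SCSI ID; return names, highest priority first."""
    return sorted(devices, key=lambda name: priority_rank(devices[name]))


# Hypothetical bus layout, for illustration only.
example = {"drive1": 1, "picker": 6, "hba": 7, "drive2": 2}
print(sort_by_priority(example))  # ['hba', 'picker', 'drive2', 'drive1']
```

How the ID is actually set on the arm (front panel, jumpers, etc.) depends on the library, so that part is not covered here.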

We had it daisy-chained to drives 1 and 2, which were using channel A on
the Adaptec dual-channel card, while drives 3 and 4 were daisy-chained on
channel B. I have since put the arm on its own bus, so it now connects to
channel A on its own card, and the drives still use channels A and B on
the other card as before. I terminated the port on the back of the
library where the picker had been daisy-chained. Here's the current
output from inquire:

[email protected]:SEAGATE ULTRIUM06242-XXX1522|Tape, /dev/nst0
[email protected]:SEAGATE ULTRIUM06242-XXX1522|Tape, /dev/nst1
[email protected]:SEAGATE ULTRIUM06242-XXX1522|Tape, /dev/nst2
[email protected]:SEAGATE ULTRIUM06242-XXX1522|Tape, /dev/nst3
[email protected]:ATL     P1000    62200502.23|Autochanger (Jukebox), /dev/sg4
[email protected]:QUANTUM SuperDLT1       2323|Tape, /dev/nst4
[email protected]:QUANTUM SuperDLT1       2323|Tape, /dev/nst5
[email protected]:STK     L80             0212|Autochanger (Jukebox), /dev/sg7
[email protected]:MegaRAIDLD 0 RAID1  139G1.92|Disk, /dev/sg8
[email protected]:PE/PV   1x8 SCSI BP     1.1 |Processor, /dev/sg9
[email protected]:HL-DT-STRW/DVD GCC-4240ND110|CD-ROM, /dev/sgk
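
To check quickly whether the picker is still in this list (for example from a cron job), the output could be parsed along these lines. This is a sketch that assumes each entry has the shape "address:inquiry string|type, device path" as in the listing above; the names Device, parse_inquire and picker_visible are made up for illustration, and the address is simply taken as whatever precedes the first colon.

```python
from collections import namedtuple

Device = namedtuple("Device", "address inquiry kind path")


def parse_inquire(text):
    """Parse inquire-style output into Device records."""
    merged = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        # A line without "|" is treated as the wrapped tail of the previous entry.
        if "|" not in line and merged:
            merged[-1] += " " + line
        else:
            merged.append(line)

    devices = []
    for line in merged:
        if "|" not in line:
            continue  # not a device entry
        left, right = line.split("|", 1)
        address, _, inquiry = left.partition(":")
        kind, _, path = right.rpartition(", ")
        devices.append(Device(address, inquiry.strip(), kind.strip(), path.strip()))
    return devices


def picker_visible(devices, model):
    """True if an autochanger whose inquiry string contains model is listed."""
    return any(
        d.kind.startswith("Autochanger") and model in d.inquiry for d in devices
    )
```

Feeding it the saved output of inquire and calling picker_visible(devices, "L80") would show whether the L80 arm was seen on that run.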

George



plangfor AT ab.bluecross DOT ca wrote:
>
> We have a StorageTek L80 on a Unix box (actually two L80s, each on a
> different host), but each drive has its own SCSI port, with the arm sharing
> the connection with one drive.
>
> The thing to watch out for is that the arm must have the highest priority;
> otherwise, when the drives are busy, the picker may not be seen by the software.
>
> Hope this helps
>
> Paul Langford
>
> -----Original Message-----
> From: George Sinclair [mailto:George.Sinclair AT NOAA DOT GOV]
> Sent: Friday, January 09, 2004 10:36 AM
> To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
> Subject: [Networker] SCSI problems -- How many drives to a bus?
>
> Hi,
>
> We have a Storagetek L80 tape library with 4 LTO drives. We've been
> seeing a lot of SCSI problems on the host. Host is a storage node
> running RedHat Linux. I end up rebooting this host about once a week
> because the /etc/LGTOuscsi/inquire utility fails to see the picker
> device. This is really annoying. We finally moved the storage node to
> another, more powerful Linux box with more buses, etc. Same problem
> there!!! The first clue is the "read open error, Device or resource
> busy" message that appears next to the affected device in the devices
> section of the nwadmin window. Often, a backup will be running when the
> host loses communication to the picker.
>
> We have the robot on its own separate bus and all 4 drives share a bus.
> Max sessions per device is set to 5. We're running 6.1.1 with a Solaris
> primary server. I should also note that we have an ATL SDLT tape
> library running on there, too. Its picker and two drives all share the same
> bus, but this bus is its own and does not share anything with the L80.
> So, we have three Adaptec cards: one for the ATL, one for the L80 picker and
> one for the L80 LTO drives (dual-channel Adaptec cards).
>
> I'm wondering if we have too many LTO drives on the bus? Could this
> cause these SCSI problems? Maybe better to have no more than two drives
> per bus? Someone suggested that we get the picker on its own bus which
> we recently did but that didn't fix it. I'm beginning to think that
> there's something wrong with the Storage Tek library and maybe it's time
> to have Storage Tek come look at it. Maybe we should get a temp license
> for another storage node and move the ATL over there so we only have one
> library on this host? Guess it would be easier to troubleshoot, but
> seems silly to have to do that. NO reason we should not be able to run
> two libraries, and the thing is that the ATL library never gives us
> any problems. I never see these "read open error ..." messages on there.
> Hmm ....
>
> Any thoughts?
>
> Thanks.
>
> George
>
> --
> Note: To sign off this list, send a "signoff networker" command via email
> to listserv AT listmail.temple DOT edu or visit the list's Web site at
> http://listmail.temple.edu/archives/networker.html where you can
> also view and post messages to the list.
> =*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
