ADSM-L

Re: LTO 3583 drive problem.

2003-02-12 03:10:37
Subject: Re: LTO 3583 drive problem.
From: Jozef Zatko <zatko AT LOGIN DOT SK>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Wed, 12 Feb 2003 09:07:07 +0100
In my environment I have the following:
IBM 3534 F08 FC switch, Atape driver 7.1.5.0, 2 SCSI drives with code 25D4,
AIX 5.1 ML 02, TSM 5.1.5.2, library 3583
with firmware 2.80.0410, SAN Data Gateway code 4.20.05.
Each device is on a separate SCSI channel. The library and the SAN Data Gateway
do not report any errors.
I will have to take a closer look at the SAN switch.

Humberto, can you provide more detailed info about your configuration?


Ing. Jozef Zatko
Login a.s.
Dlha 2, Stupava
tel.: (421) (2) 60252618



David Longo <David.Longo@HEALTH-FIRST.ORG>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
12.02.2003 05:17
Please respond to "ADSM: Dist Stor Manager"

To:      ADSM-L AT VM.MARIST DOT EDU
cc:
Subject: Re: LTO 3583 drive problem.

I have seen some messages here recently about 3583 drive errors.
I have a 3584 with (8) Fibre drives going through McData switches,
with TSM Server 4.2.2.10 on AIX 4.3.3 ML 10 and Atape driver
7.1.1.0.  3584 library code 2460 and drive code 25D4.
Have had the 3584 about a year.  Have had intermittent
errors maybe 2-3 times a month, but noticed that they were only
happening on DRIVE1, which has the control path.  Moved the control
path to DRIVE5 and sure enough the errors were then happening
on that drive.

I have been working with our IBM CE on this for several months. IBM Tucson
initially said that the errors we were getting were media errors.
But I could reuse the same tape and it would have no more problems.
In some instances, the tape would not be dismounted from the drive
and the drive had to be reset before it would eject.

Finally in late December my IBM CE called and said that IBM Tucson
had seen this problem all over. It was happening on the Fibre drives, on the
drive that the control path was on, and going through either IBM
2109 or McData switches.  These were the common factors.  IBM is
working on a code fix (I assume for drives and/or library).

I suspect the problem could be happening on the 3583 also.

David Longo

>>> zatko AT LOGIN DOT SK 02/11/03 11:10AM >>>
Hi,
I have exactly the same problem, except that I have the TSM server on AIX 5.1.
It is very strange, because it happens randomly, but always while unmounting
the tape.
It is always the same drive - /dev/rmt0.
The tape remains in the drive. I can unload the tape using the operator panel
without any problem. But if I try to use the tapeutil utility, I get the
following error when I try to open device /dev/rmt0:

Operation failed with errno 79: Connection refused

I have to reset the SAN Data Gateway, and then the drive is again accessible
with tapeutil and also from the TSM server.

Does somebody know what the error message "Operation failed with errno
79: Connection refused"
means? Where can I find information about this error number? I looked
in the Atape driver documentation, but this message is not mentioned there.
In the output from errpt there is no error from the fibre adapter or from the
tape drive.
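
A note on the errno question above: the text "Connection refused" is the standard message for the symbolic error ECONNREFUSED, and errno numbers are specific to the operating system, so the value 79 has to be looked up on the AIX host itself (for example in the system errno.h header). A minimal Python sketch of that lookup follows. It assumes a Python interpreter is available on the host, which the post does not mention, and `describe_errno` is a hypothetical helper name. It only names the error; it does not explain why the driver returned it.

```python
import errno
import os


def describe_errno(number):
    """Return (symbolic name, message text) for an errno value on THIS platform."""
    name = errno.errorcode.get(number, "UNKNOWN")
    try:
        message = os.strerror(number)
    except ValueError:
        # Some platforms raise instead of returning "Unknown error N".
        message = "no message available"
    return name, message


if __name__ == "__main__":
    # 79 is the number from the tapeutil message; its meaning depends on the OS
    # this script runs on, so run it on the AIX server, not on a workstation.
    print("errno 79 on this host:", describe_errno(79))
    # Reverse check: which number does this host use for ECONNREFUSED?
    print("ECONNREFUSED on this host:", errno.ECONNREFUSED)
```

If the first line printed on the AIX server shows ECONNREFUSED, the message is a generic operating system error string reported when opening the device, not an Atape-specific code, which would explain why the Atape documentation does not list it.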

Here is a copy of the event log from the SAN Data Gateway:

Sequence   Time                   Code   Description
    0031   FEB 11 2003 08:03:45   110    Health Check 1: Link Status Has Changed on FCAL 2: linkFail: 0/0, syncLost: 0/0, sigLost: 0/0, primSpErr: 0/0, invWord: 12/0, invCrc: 0/0
    0029   FEB 11 2003 08:03:03   70     NOTICE: Reboot Complete
    0028   FEB 11 2003 08:02:59   29     Mapping 1: Target Device Added: index 3, handle 0x091ac308
    0027   FEB 11 2003 08:02:59   29     Mapping 1: Target Device Added: index 2, handle 0x09e68408
    0023   FEB 11 2003 08:02:57   29     Mapping 1: Target Device Added: index 1, handle 0x09ffbe08
    0022   FEB 11 2003 08:02:54   28     USCSI 3: Bus RESET
    0021   FEB 11 2003 08:02:53   28     USCSI 2: Bus RESET
    0016   FEB 11 2003 08:02:52   28     USCSI 1: Bus RESET
    0014   FEB 11 2003 08:02:51   29     Mapping 1: Target Device Added: index 0, handle 0x091b8700
    0011   FEB 11 2003 08:02:51   28     USCSI 4: Bus RESET
    0010   FEB 11 2003 08:02:51   28     USCSI 3: Bus RESET
    0009   FEB 11 2003 08:02:51   28     USCSI 2: Bus RESET
    0008   FEB 11 2003 08:02:51   28     USCSI 1: Bus RESET
    0057   FEB 11 2003 08:02:20   13     NOTICE: System Shutting Down
    0031   FEB 10 2003 03:22:02   110    Health Check 1: Link Status Has Changed on FCAL 2: linkFail: 0/0, syncLost: 0/0, sigLost: 0/0, primSpErr: 0/0, invWord: 12/0, invCrc: 0/0
    0029   FEB 10 2003 03:21:20   70     NOTICE: Reboot Complete
    0028   FEB 10 2003 03:21:16   29     Mapping 1: Target Device Added: index 3, handle 0x091ac408
    0027   FEB 10 2003 03:21:16   29     Mapping 1: Target Device Added: index 2, handle 0x09e68408
    0023   FEB 10 2003 03:21:14   29     Mapping 1: Target Device Added: index 1, handle 0x09ffbe08
    0022   FEB 10 2003 03:21:11   28     USCSI 3: Bus RESET

I am not a specialist, but in my opinion these are only messages generated
while the device was rebooting.
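
One way to check that impression is to parse the event log and compare timestamps against the shutdown and reboot markers. The sketch below is only an illustration: the sample entries are copied from the log above with wrapped lines rejoined, the function names are hypothetical, and the column layout (4-digit sequence, timestamp, numeric code, description) is inferred from this one excerpt, not from gateway documentation. The meaning of the two numbers in each link counter (for example `invWord: 12/0`) is not stated in the log, so the code only reports which counters are non-zero.

```python
import re
from datetime import datetime

# Entries copied from the event log above, with wrapped lines rejoined.
SAMPLE_LOG = """\
0031   FEB 11 2003 08:03:45   110   Health Check 1: Link Status Has Changed on FCAL 2: linkFail: 0/0, syncLost: 0/0, sigLost: 0/0, primSpErr: 0/0, invWord: 12/0, invCrc: 0/0
0029   FEB 11 2003 08:03:03   70    NOTICE: Reboot Complete
0022   FEB 11 2003 08:02:54   28    USCSI 3: Bus RESET
0014   FEB 11 2003 08:02:51   29    Mapping 1: Target Device Added: index 0, handle 0x091b8700
0057   FEB 11 2003 08:02:20   13    NOTICE: System Shutting Down
"""

# Month names are mapped by hand so parsing does not depend on the locale.
MONTHS = {name: number for number, name in enumerate(
    "JAN FEB MAR APR MAY JUN JUL AUG SEP OCT NOV DEC".split(), start=1)}

# Groups: sequence, month, day, year, hh, mm, ss, event code, description.
ENTRY = re.compile(
    r"^\s*(\d{4})\s+([A-Z]{3})\s+(\d{1,2})\s+(\d{4})\s+"
    r"(\d{2}):(\d{2}):(\d{2})\s+(\d+)\s+(.*)$"
)


def parse_events(log_text):
    """Turn log lines into dicts; lines that do not look like entries are skipped."""
    events = []
    for line in log_text.splitlines():
        match = ENTRY.match(line)
        if match is None or match.group(2) not in MONTHS:
            continue
        seq, month, day, year, hh, mm, ss, code, text = match.groups()
        events.append({
            "seq": int(seq),
            "when": datetime(int(year), MONTHS[month], int(day),
                             int(hh), int(mm), int(ss)),
            "code": int(code),
            "text": text.strip(),
        })
    return events


def reboot_duration_seconds(events):
    """Seconds between the first shutdown notice and the first reboot-complete notice."""
    down = next(e for e in events if "System Shutting Down" in e["text"])
    up = next(e for e in events if "Reboot Complete" in e["text"])
    return (up["when"] - down["when"]).total_seconds()


def nonzero_counters(text):
    """Return the 'name: a/b' counters in a health-check line that are not 0/0."""
    found = re.findall(r"(\w+):\s*(\d+)/(\d+)", text)
    return {name: (int(a), int(b)) for name, a, b in found if int(a) or int(b)}


if __name__ == "__main__":
    events = parse_events(SAMPLE_LOG)
    print("reboot took", reboot_duration_seconds(events), "seconds")
    for event in events:
        counters = nonzero_counters(event["text"])
        if counters:
            print(event["when"], "non-zero link counters:", counters)
```

On this sample, the only entry that is not a shutdown, bus reset, or device mapping message is the health check logged after the reboot completed, and its only non-zero counter is invWord. Whether that counter matters is a question for the gateway documentation or IBM support.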

Any help appreciated


Ing. Jozef Zatko
Login a.s.
Dlha 2, Stupava
tel.: (421) (2) 60252618



Humberto Gomez Lopez <hugomez@UCAB.EDU.VE>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
11.02.2003 14:54
Please respond to "ADSM: Dist Stor Manager"

To:      ADSM-L AT VM.MARIST DOT EDU
cc:
Subject: LTO 3583 drive problem.

Hi,

  I have installed TSM 5.1.6 Server on W2k. After configuring the
library and the drives and checking in some tapes successfully,
I keep getting this error when I submit an audit library command:


02/11/2003 09:37:08   ANR8304E Time out error on drive MT0.12.0.3 (mt0.12.0.3)
                       in library LB0.7.0.3.
02/11/2003 09:37:08   ANR8912E Unable to verify the label of volume from
                       slot-element 16 in drive MT0.12.0.3 (mt0.12.0.3) in
                       library LB0.7.0.3.
02/11/2003 09:37:08   ANR8302E I/O error on drive MT0.12.0.3 (mt0.12.0.3)
                       (OP=OFFL, Error Number=21, CC=0, KEY=02, ASC=3A, ASCQ=00,
                       SENSE=70.00.02.00.00.00.00.1C.00.00.00.00.3A.00.00.00.10-
                       .13.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.04.0-
                       0.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.-
                       00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00-
                       .00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.00.0-
                       0.00.00.00.00, Description=An undetermined error has
                       occurred).  Refer to Appendix D in the 'Messages' manual
                       for recommended action.
02/11/2003 09:37:10   ANR8942E Could not move volume NOT KNOWN from slot-element
                       256 to slot-element 16.
02/11/2003 09:37:10   ANR9999D mmsscsi.c(9702): ThreadId<34> Volume may still be
                       in the drive MT0.12.0.3 (mt0.12.0.3).
02/11/2003 09:37:10   ANR8446I Manual intervention required for library
                       LB0.7.0.3.

It always happens on MT0.12.0.3 (the lowest drive). Does anyone know what
could be happening?

  Thanks in advance.
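
The SENSE bytes in the ANR8302E message above can be decoded by hand using the standard SCSI fixed-format sense layout: byte 0 is the response code (70h or 71h), the low four bits of byte 2 are the sense key, byte 7 is the additional sense length, and bytes 12 and 13 are the ASC and ASCQ. A minimal Python sketch under those assumptions follows. The function name and the small lookup tables are illustrative only and cover just the values needed here, not the full SCSI code lists.

```python
# Partial tables of standard SCSI values; only a handful are listed here.
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
}
ASC_ASCQ = {
    (0x3A, 0x00): "MEDIUM NOT PRESENT",
}


def decode_fixed_sense(dotted):
    """Decode dot-separated hex sense bytes, as TSM prints them, into a dict."""
    data = bytes(int(part, 16) for part in dotted.split("."))
    if len(data) < 14:
        raise ValueError("need at least 14 sense bytes to read ASC/ASCQ")
    response_code = data[0] & 0x7F
    if response_code not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    key = data[2] & 0x0F
    asc, ascq = data[12], data[13]
    return {
        "response_code": response_code,
        "sense_key": SENSE_KEYS.get(key, f"key {key:#x} (not in table)"),
        "additional_length": data[7],
        "asc": asc,
        "ascq": ascq,
        "additional_sense": ASC_ASCQ.get((asc, ascq), "not in table"),
    }


if __name__ == "__main__":
    # First 14 bytes of the SENSE= field from the ANR8302E message.
    first_bytes = "70.00.02.00.00.00.00.1C.00.00.00.00.3A.00"
    for field, value in decode_fixed_sense(first_bytes).items():
        print(f"{field}: {value}")
```

For the bytes in the log this gives sense key NOT READY with additional sense MEDIUM NOT PRESENT, which agrees with the KEY=02, ASC=3A, ASCQ=00 values TSM already printed. The decode only restates what the drive reported; it does not by itself say why the drive reported it during the OFFL operation.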
