[ADSM-L] FW: Errno = 23, 1167 and stuck tapes

2009-07-21 09:39:05
Subject: [ADSM-L] FW: Errno = 23, 1167 and stuck tapes
From: Henrik Vahlstedt <SHWL AT STATOILHYDRO DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Tue, 21 Jul 2009 15:37:20 +0200
Hello,

Does anyone have experience with errno = 23, errno = 1167 and stuck tapes, or a 
suggestion as to what might cause the errors and how to solve them?
I get the errors in all kinds of data movement processes (D2T, T2T). The errors 
occur on all drives at random, but not all the time.
That is, I can have anywhere from 1 to 30 mounts before I get an error on a 
drive, and the faulty tape then mounts OK in another drive.
The OS, switch and TSM logs do not provide any helpful information.
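
To check whether the errors really are spread evenly over the drives, the activity log can be tallied per drive. The following is only a minimal Python sketch: the function name `errors_per_drive` and the list `ERROR_IDS` are made up for illustration, the message IDs are simply the ones that show up in the excerpts in this post, and the input is assumed to be activity log text saved in the same wrapped layout as shown here.

```python
import re
from collections import Counter

# Message IDs taken from the log excerpts in this post (an assumption about
# which messages are worth counting, not a complete list).
ERROR_IDS = ("ANR8311E", "ANR8944E", "ANR8469E")

def errors_per_drive(actlog_text):
    """Count error messages per drive name in activity log text."""
    # Collapse wrapped lines so "drive \nMT504" becomes "drive MT504".
    flat = re.sub(r"\s+", " ", actlog_text)
    counts = Counter()
    # Split before each timestamp so that each piece is one log message.
    for entry in re.split(r"(?=\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2} )", flat):
        if any(msg_id in entry for msg_id in ERROR_IDS):
            match = re.search(r"drive (\w+)", entry)
            if match:
                counts[match.group(1)] += 1
    return counts

SAMPLE = """
07/20/2009 05:53:21      ANR8337I LTO volume 4R0245 mounted in drive MT504
                          (mt0.0.0.2). (PROCESS: 59)
07/20/2009 05:53:42      ANR8311E An I/O error occurred while accessing drive 
MT504
                          (mt0.0.0.2) for WEOF operation, errno = 1167. 
07/20/2009 05:53:42      ANR8311E An I/O error occurred while accessing drive 
MT504
                          (mt0.0.0.2) for OFFL operation, errno = 1167. 
"""

print(errors_per_drive(SAMPLE))
```

Run over a longer stretch of the log, a clearly uneven count would point at a single drive or path, while an even spread would fit the random pattern described above.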


W2k3 x64 SP2, TSM 5.5.2.1
5 LTO-4 drives, SL500, 3 dual-channel HBAs connected to the SAN with one device 
per channel.
Latest firmware, drivers, etc.


First, errno = 1167: space reclamation mounts a tape and the drive disappears. 
Why? However, after some minutes TSM resurrects the drive and continues to use 
it in new processes.
07/20/2009 05:53:21      ANR8337I LTO volume 4R0245 mounted in drive MT504
                          (mt0.0.0.2). (PROCESS: 59)
07/20/2009 05:53:42      ANR8311E An I/O error occurred while accessing drive MT504
                          (mt0.0.0.2) for WEOF operation, errno = 1167. (PROCESS:
                          59)
07/20/2009 05:53:42      ANR8311E An I/O error occurred while accessing drive MT504
                          (mt0.0.0.2) for OFFL operation, errno = 1167. (PROCESS:
                          59)
07/20/2009 05:53:43      ANR8469E Dismount of LTO volume 4R0245 from drive MT504


C:\>net helpmsg 1167
The device is not connected.


Event Type:     Error
Event Source:   PlugPlayManager
Event Category: None
Event ID:       12
Date:           7/20/2009
Time:           5:53:42 AM
User:           N/A
Computer:
Description:
The device 'IBM ULTRIUM-TD4 SCSI Sequential Device' 
(SCSI\Sequential&Ven_IBM&Prod_ULTRIUM-TD4&Rev_82F0\5&3652500d&0&000000) 
disappeared from the system without first being prepared for removal.
For more information, see Help and Support Center at 
http://go.microsoft.com/fwlink/events.asp.
Data:
0000: 00 00 00 00               ....


                   Volume Name: 4R0051
             Storage Pool Name: LTO4-BCK
             Device Class Name: LTO4
            Estimated Capacity: 1.6 T
       Scaled Capacity Applied:
                      Pct Util: 2.0
                 Volume Status: Filling
                        Access: Read-Only
        Pct. Reclaimable Space: 0.1
               Scratch Volume?: Yes
               In Error State?: No
      Number of Writable Sides: 1
       Number of Times Mounted: 8
             Write Pass Number: 1
     Approx. Date Last Written: 07/20/2009 04:00:47
        Approx. Date Last Read: 07/20/2009 21:26:28
           Date Became Pending:
        Number of Write Errors: 0
         Number of Read Errors: 0
               Volume Location:
Volume is MVS Lanfree Capable : No
Last Update by (administrator):
         Last Update Date/Time: 07/20/2009 05:52:36
          Begin Reclaim Period:
            End Reclaim Period:
  Drive Encryption Key Manager: None



Second, errno = 23: a write error generates errno = 23 and the tape is stuck. 
Neither TSM nor lbtest can remove the tape.
07/21/2009 02:23:29      ANR8337I LTO volume 4R0254 mounted in drive MT501
                          (mt0.0.0.5). (SESSION: 12497, PROCESS: 70)
07/21/2009 02:23:29      ANR1340I Scratch volume 4R0254 is now defined in storage
                          pool LTO4-BCK. (SESSION: 12497, PROCESS: 70)
07/21/2009 02:23:33      ANR0513I Process 70 opened output volume 4R0254. (SESSION:
                          12497, PROCESS: 70)
07/21/2009 02:24:16      ANR8944E Hardware or media error on drive MT501
                          (mt0.0.0.5) with volume 4R0254(OP=WRITE, Error Number=
                          23, CC=0, KEY=03, ASC=52, ASCQ=00,
                          SENSE=71.00.03.00.00.00.00.58.00.00.00.00.52.00.36.00.78-
                          .D1.23.5D, Description=An undetermined error has
                          occurred). Refer to Appendix C in the 'Messages' manual
                          for recommended action. (SESSION: 12497, PROCESS: 70)
07/21/2009 02:24:16      ANR8359E Media fault detected on LTO volume 4R0254 in
                          drive MT501 (mt0.0.0.5) of library SL500. (SESSION:
                          12497, PROCESS: 70)
07/21/2009 02:24:16      ANR1411W Access mode for volume 4R0254 now set to
                          "read-only" due to write error. (SESSION: 12497, PROCESS:
                          70)
07/21/2009 02:24:16      ANR0515I Process 70 closed volume 4R0254. (SESSION: 12497,
                          PROCESS: 70)
07/21/2009 02:24:37      ANR8944E Hardware or media error on drive MT501
                          (mt0.0.0.5) with volume 4R0254(OP=OFFL, Error Number= 23,
                          CC=0, KEY=03, ASC=53, ASCQ=04,
                          SENSE=70.00.03.00.00.00.00.58.00.00.00.00.53.04.36.00.2E-
                          .05.10.06, Description=An undetermined error has
                          occurred). Refer to Appendix C in the 'Messages' manual
                          for recommended action. (SESSION: 12497, PROCESS: 70)
07/21/2009 02:24:37      ANR8950W Device mt0.0.0.5, volume 4R0254 has issued the
                          following Warning TapeAlert: The operation has stopped
                          because an error has occurred while reading or writing
                          data which the drive cannot correct. (SESSION: 12497,
                          PROCESS: 70)
07/21/2009 02:24:37      ANR8948S Device mt0.0.0.5, volume 4R0254 has issued the
                          following Critical TapeAlert: Your data is at risk: 1.
                          Copy any data you require from this tape. 2. Do not use
                          this tape again.  3. Restart the operation with a
                          different tape. (SESSION: 12497, PROCESS: 70)
07/21/2009 02:24:37      ANR8949E Device mt0.0.0.5, volume 4R0254 has issued the
                          following Critical TapeAlert: The tape drive has a
                          hardware fault:  1. Eject the tape or magazine. 2. Reset
                          the drive.  3. Restart the operation. (SESSION: 12497,
                          PROCESS: 70)
07/21/2009 02:24:37      ANR8949E Device mt0.0.0.5, volume 4R0254 has issued the
                          following Critical TapeAlert: The operation has failed:
                          1. Eject the tape or magazine.  2. Restart the operation.
                          (SESSION: 12497, PROCESS: 70)
07/21/2009 02:24:37      ANR8950W Device mt0.0.0.5, volume 4R0254 has issued the
                          following Warning TapeAlert: The tape drive may have a
                          hardware fault.  Run extended diagnostics to verify and
                          diagnose the problem.  Check the tape drive users manual
                          for device specific instruction on running extended
                          diagnostic tests. (SESSION: 12497, PROCESS: 70)
07/21/2009 02:24:37      ANR8951I Device mt0.0.0.5, volume 4R0254 has issued the
                          following Information TapeAlert: The device has
                          encountered TapeAlert 56. (SESSION: 12497, PROCESS: 70)
07/21/2009 02:24:57      ANR8469E Dismount of LTO volume 4R0254 from drive MT501
                          (mt0.0.0.5) in library SL500 failed. (SESSION: 12497,
                          PROCESS: 70)
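
The two SENSE= strings in the ANR8944E messages can be decoded by hand. Below is a minimal Python sketch of that, assuming the data is SCSI fixed-format sense (response code 70h or 71h: sense key in the low nibble of byte 2, ASC in byte 12, ASCQ in byte 13) and that the trailing hyphen is only a line-wrap marker. The function name `decode_fixed_sense` and both lookup tables are made up for illustration, and the tables hold only the codes seen in this post, named as in the standard SCSI ASC/ASCQ assignments.

```python
# Only the codes that occur in this post; anything else is shown as hex.
SENSE_KEYS = {0x03: "MEDIUM ERROR", 0x04: "HARDWARE ERROR"}
ASC_ASCQ = {
    (0x52, 0x00): "CARTRIDGE FAULT",
    (0x53, 0x04): "MEDIUM THREAD OR UNTHREAD FAILURE",
}

def decode_fixed_sense(sense_text):
    """Decode a dotted-hex SENSE string in fixed format (70h/71h)."""
    # Drop whitespace and the hyphen assumed to mark a wrapped line.
    cleaned = "".join(sense_text.split()).replace("-", "")
    data = bytes(int(b, 16) for b in cleaned.split("."))
    response = data[0] & 0x7F
    if response not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    key, asc, ascq = data[2] & 0x0F, data[12], data[13]
    return {
        "deferred": response == 0x71,  # 71h reports a deferred error
        "key": SENSE_KEYS.get(key, hex(key)),
        "asc_ascq": ASC_ASCQ.get((asc, ascq), f"{asc:02X}/{ascq:02X}"),
    }

WRITE_SENSE = "71.00.03.00.00.00.00.58.00.00.00.00.52.00.36.00.78-\n.D1.23.5D"
OFFL_SENSE = "70.00.03.00.00.00.00.58.00.00.00.00.53.04.36.00.2E-\n.05.10.06"

print(decode_fixed_sense(WRITE_SENSE))
print(decode_fixed_sense(OFFL_SENSE))
```

Both decode to the same KEY/ASC/ASCQ values that TSM prints in the message itself. The WRITE failure is flagged as a deferred error (response code 71h), the OFFL failure as a current one (70h).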


C:\>net helpmsg 23
Data error (cyclic redundancy check).


                   Volume Name: 4R0254
             Storage Pool Name: LTO4-BCK
             Device Class Name: LTO4
            Estimated Capacity: 1.6 T
       Scaled Capacity Applied:
                      Pct Util: 0.1
                 Volume Status: Filling
                        Access: Read-Only
        Pct. Reclaimable Space: 0.0
               Scratch Volume?: Yes
               In Error State?: Yes
      Number of Writable Sides: 1
       Number of Times Mounted: 1
             Write Pass Number: 1
     Approx. Date Last Written: 07/21/2009 02:23:47
        Approx. Date Last Read: 07/21/2009 02:23:47
           Date Became Pending:
        Number of Write Errors: 1
         Number of Read Errors: 0
               Volume Location:
Volume is MVS Lanfree Capable : No
Last Update by (administrator):
         Last Update Date/Time: 07/21/2009 02:23:29
          Begin Reclaim Period:
            End Reclaim Period:
  Drive Encryption Key Manager: None



Event Type:     Error
Event Source:   tsmscsi
Event Category: None
Event ID:       3
Date:           7/21/2009
Time:           2:24:57 AM
User:           N/A
Computer:
Description:
A check condition error has occured on device \Device\lb0.0.0.7 during Move 
Medium with completion code DD_CHANGER_FAILURE. Refer to the device's SCSI 
reference for appropriate action.

 Dump Data: byte 0x3E=KEY, byte 0x3D=ASC, byte 0x3C=ASCQ
Data:
0000: 0e 00 18 00 03 00 6c 00   ......l.
0008: 00 00 00 00 03 00 00 e0   .......à
0010: 30 01 00 00 85 01 00 c0   0......À
0018: 00 00 00 00 58 c0 01 84   ....XÀ."
0020: 00 00 00 00 00 00 00 00   ........
0028: 00 00 00 00 02 c4 a5 00   .....Ä¥.
0030: 84 03 00 00 80 16 8c 0b   "...?.O.
0038: 00 00 40 00 00 53 04 70   ..@..S.p

Event Type:     Error
Event Source:   tsmscsi
Event Category: None
Event ID:       3
Date:           7/21/2009
Time:           2:24:57 AM
User:           N/A
Computer:
Description:
A check condition error has occured on device \Device\lb0.0.0.7 during Move 
Medium with completion code DD_HARDWARE_MICROCODE. Refer to the device's SCSI 
reference for appropriate action.

 Dump Data: byte 0x3E=KEY, byte 0x3D=ASC, byte 0x3C=ASCQ
Data:
0000: 0e 00 18 00 03 00 6c 00   ......l.
0008: 00 00 00 00 03 00 00 e0   .......à
0010: d1 00 00 00 85 01 00 c0   Ñ......À
0018: 00 00 00 00 58 c0 01 84   ....XÀ."
0020: 00 00 00 00 00 00 00 00   ........
0028: 00 00 00 00 02 c4 a5 00   .....Ä¥.
0030: 84 03 00 00 80 16 8c 0b   "...?.O.
0038: 00 00 40 00 00 44 04 70   ..@..D.p
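
The KEY/ASC/ASCQ values can be pulled out of the two tsmscsi dumps using the offsets given in the event text (byte 0x3E=KEY, byte 0x3D=ASC, byte 0x3C=ASCQ). This is a minimal Python sketch: the helper names `parse_dump` and `key_asc_ascq` are made up for illustration, and only the hex columns of the dumps are copied in, without the ASCII column.

```python
def parse_dump(dump_text):
    """Turn 'offset: hh hh ...' rows into one bytes object."""
    out = bytearray()
    for line in dump_text.strip().splitlines():
        _, _, rest = line.partition(":")
        # Each row carries eight hex bytes after the offset.
        out += bytes(int(tok, 16) for tok in rest.split()[:8])
    return bytes(out)

def key_asc_ascq(dump_text):
    """Return (KEY, ASC, ASCQ) using the offsets stated in the event text."""
    data = parse_dump(dump_text)
    return data[0x3E], data[0x3D], data[0x3C]

EVENT_1 = """
0000: 0e 00 18 00 03 00 6c 00
0008: 00 00 00 00 03 00 00 e0
0010: 30 01 00 00 85 01 00 c0
0018: 00 00 00 00 58 c0 01 84
0020: 00 00 00 00 00 00 00 00
0028: 00 00 00 00 02 c4 a5 00
0030: 84 03 00 00 80 16 8c 0b
0038: 00 00 40 00 00 53 04 70
"""

EVENT_2 = """
0000: 0e 00 18 00 03 00 6c 00
0008: 00 00 00 00 03 00 00 e0
0010: d1 00 00 00 85 01 00 c0
0018: 00 00 00 00 58 c0 01 84
0020: 00 00 00 00 00 00 00 00
0028: 00 00 00 00 02 c4 a5 00
0030: 84 03 00 00 80 16 8c 0b
0038: 00 00 40 00 00 44 04 70
"""

for name, dump in (("event 1", EVENT_1), ("event 2", EVENT_2)):
    key, asc, ascq = key_asc_ascq(dump)
    print(f"{name}: KEY={key:02X} ASC={asc:02X} ASCQ={ascq:02X}")
```

That gives KEY=04 ASC=53 ASCQ=00 for the first event and KEY=04 ASC=44 ASCQ=00 for the second. In the standard SCSI tables, sense key 04 is HARDWARE ERROR, 53/00 is MEDIA LOAD OR EJECT FAILED and 44/00 is INTERNAL TARGET FAILURE, which is at least consistent with the DD_CHANGER_FAILURE and DD_HARDWARE_MICROCODE completion codes. The library's own SCSI reference may define these more specifically.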


TIA,
Henrik





