[Veritas-bu] Serious problem - DLT7000 tape drives gone "not functional"

Subject: [Veritas-bu] Serious problem - DLT7000 tape drives gone "not functional"
From: mike.m.jackson at ca.mci.com (Mike Jackson)
Date: Mon, 06 Nov 2006 11:45:20 -0500
Hello all,

We're running a NetBackup 5.0 master server on Solaris with a 
SCSI-attached StorageTek L700 library and eight DLT7000 drives.  The 
other night an "event" cleared the L700 configuration and reset all of 
the drive SCSI IDs to Invalid.  I manually reconfigured the SCSI IDs as 
00 through 08 (skipping 07, which we cannot use).  sgscan sees the 
drives, but when I try robtest or run manual backups, NetBackup DOWNs 
all the drives.  I've opened a support ticket with StorageTek, but so 
far they're not sure what the problem could be.  The LCD display on the 
L700 library says "NOT FUNCTIONAL" for all the drives, even after a 
reboot.

Here's some information from sgscan, tpconfig, and robtest:

[nb-master-01:ROOT](~): sgscan
..
/dev/sg/c10t5l0: Changer: "STK     L700"
..
/dev/sg/c2t0l0: Tape (/dev/rmt/0): "QUANTUM DLT7000"
/dev/sg/c2t1l0: Tape (/dev/rmt/1): "QUANTUM DLT7000"
/dev/sg/c4t2l0: Tape (/dev/rmt/2): "QUANTUM DLT7000"
/dev/sg/c4t3l0: Tape (/dev/rmt/3): "QUANTUM DLT7000"
/dev/sg/c6t4l0: Tape (/dev/rmt/4): "QUANTUM DLT7000"
/dev/sg/c6t5l0: Tape (/dev/rmt/5): "QUANTUM DLT7000"
/dev/sg/c8t6l0: Tape (/dev/rmt/6): "QUANTUM DLT7000"
[nb-master-01:ROOT](~):
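
As a cross-check of the listing above, here is a minimal Python sketch 
(not part of NetBackup) that pulls the target number out of each 
/dev/sg path and compares it with the IDs set on the library (0 through 
8, skipping 7).  It assumes the "t" field in the sg path is the drive's 
SCSI ID; the function and variable names are made up for illustration.  
Since the sgscan output above was trimmed with "..", a missing ID may 
just mean the line was cut from the post.

```python
import re

# Lines copied from the sgscan output above; the ".." lines were already
# elided in the original post, so other devices may exist.
SGSCAN_OUTPUT = """\
/dev/sg/c10t5l0: Changer: "STK     L700"
/dev/sg/c2t0l0: Tape (/dev/rmt/0): "QUANTUM DLT7000"
/dev/sg/c2t1l0: Tape (/dev/rmt/1): "QUANTUM DLT7000"
/dev/sg/c4t2l0: Tape (/dev/rmt/2): "QUANTUM DLT7000"
/dev/sg/c4t3l0: Tape (/dev/rmt/3): "QUANTUM DLT7000"
/dev/sg/c6t4l0: Tape (/dev/rmt/4): "QUANTUM DLT7000"
/dev/sg/c6t5l0: Tape (/dev/rmt/5): "QUANTUM DLT7000"
/dev/sg/c8t6l0: Tape (/dev/rmt/6): "QUANTUM DLT7000"
"""

# Matches e.g. /dev/sg/c2t0l0: Tape (/dev/rmt/0): "QUANTUM DLT7000"
TAPE_RE = re.compile(
    r'^/dev/sg/c(\d+)t(\d+)l(\d+): Tape \((/dev/rmt/\d+)\): "([^"]*)"$'
)


def parse_tape_devices(text):
    """Return one dict per tape line; changer lines are skipped."""
    devices = []
    for line in text.splitlines():
        m = TAPE_RE.match(line.strip())
        if m:
            controller, target, lun, rmt, inquiry = m.groups()
            devices.append({
                "controller": int(controller),
                "target": int(target),  # assumed to equal the SCSI ID
                "lun": int(lun),
                "rmt": rmt,
                "inquiry": inquiry,
            })
    return devices


# IDs configured on the library: 0 through 8, skipping 7.
EXPECTED_IDS = [i for i in range(0, 9) if i != 7]

devices = parse_tape_devices(SGSCAN_OUTPUT)
seen_ids = sorted(d["target"] for d in devices)
missing_ids = [i for i in EXPECTED_IDS if i not in seen_ids]

print("tape targets seen:", seen_ids)
print("expected but not listed:", missing_ids)
```

With the lines pasted above, this reports seven tape devices (targets 0 
through 6) and no entry for ID 8.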

[nb-master-01:ROOT](~): tpconfig -l
Device Robot Drive       Robot                    Drive            Device           Second
Type     Num Index  Type DrNum Status  Comment    Name             Path             Device Path
robot      0    -    TLD    -       -  -          -                /dev/sg/c10t5l0
   drive    -    0    dlt    3      UP  -          QUANTUMDLT70003  /dev/rmt/2cbn
   drive    -    1    dlt    4      UP  -          QUANTUMDLT70004  /dev/rmt/3cbn
   drive    -    2    dlt    5      UP  -          QUANTUMDLT70005  /dev/rmt/4cbn
   drive    -    3    dlt    6      UP  -          QUANTUMDLT70006  /dev/rmt/5cbn
   drive    -    4    dlt    7      UP  -          QUANTUMDLT70007  /dev/rmt/6cbn
   drive    -    5    dlt    1      UP  -          QUANTUMDLT70001  /dev/rmt/0cbn
   drive    -    7    dlt    2      UP  -          QUANTUMDLT70002  /dev/rmt/1cbn

Robot selected: TLD(0)   robotic path = /dev/sg/c10t5l0

Invoking robotic test utility:
/usr/openv/volmgr/bin/tldtest -r /dev/sg/c10t5l0 -d1 /dev/rmt/0cbn -d2 
/dev/rmt/1cbn -d3 /dev/rmt/2cbn -d4 /dev/rmt/3cbn -d5 /dev/rmt/4cbn -d6 
/dev/rmt/5cbn -d7 /dev/rmt/6cbn

Opening /dev/sg/c10t5l0
MODE_SENSE complete
Enter tld commands (? returns help information)
s d
drive 1 (addr 500) access = 0 Contains Cartridge = no
Sense code = 0x40, Code qualifier = 0x2
SCSI ID from drive 1 is 0
drive 2 (addr 501) access = 0 Contains Cartridge = no
Sense code = 0x40, Code qualifier = 0x2
SCSI ID from drive 2 is 1
drive 3 (addr 502) access = 0 Contains Cartridge = no
Sense code = 0x40, Code qualifier = 0x2
SCSI ID from drive 3 is 2
drive 4 (addr 503) access = 0 Contains Cartridge = no
Sense code = 0x40, Code qualifier = 0x2
SCSI ID from drive 4 is 3
drive 5 (addr 504) access = 0 Contains Cartridge = no
Sense code = 0x40, Code qualifier = 0x2
SCSI ID from drive 5 is 4
drive 6 (addr 505) access = 0 Contains Cartridge = no
Sense code = 0x40, Code qualifier = 0x2
SCSI ID from drive 6 is 5
drive 7 (addr 506) access = 1 Contains Cartridge = yes
Source address = 1510 (slot 511)
Barcode = 000610
SCSI ID from drive 7 is 6
<< Press return to continue, or q and return to stop >>

drive 8 (addr 507) access = 1 Contains Cartridge = no
SCSI ID from drive 8 is 8
READ_ELEMENT_STATUS complete
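
To make the "s d" output above easier to compare drive by drive, here 
is a rough Python sketch (an illustration only, not a NetBackup tool) 
that collects the access flag, sense code, and SCSI ID reported for each 
drive.  The function name and dictionary keys are my own invention; the 
sample text is copied from the output above with the "Press return" 
prompt removed.

```python
import re

SD_OUTPUT = """\
drive 1 (addr 500) access = 0 Contains Cartridge = no
Sense code = 0x40, Code qualifier = 0x2
SCSI ID from drive 1 is 0
drive 2 (addr 501) access = 0 Contains Cartridge = no
Sense code = 0x40, Code qualifier = 0x2
SCSI ID from drive 2 is 1
drive 3 (addr 502) access = 0 Contains Cartridge = no
Sense code = 0x40, Code qualifier = 0x2
SCSI ID from drive 3 is 2
drive 4 (addr 503) access = 0 Contains Cartridge = no
Sense code = 0x40, Code qualifier = 0x2
SCSI ID from drive 4 is 3
drive 5 (addr 504) access = 0 Contains Cartridge = no
Sense code = 0x40, Code qualifier = 0x2
SCSI ID from drive 5 is 4
drive 6 (addr 505) access = 0 Contains Cartridge = no
Sense code = 0x40, Code qualifier = 0x2
SCSI ID from drive 6 is 5
drive 7 (addr 506) access = 1 Contains Cartridge = yes
Source address = 1510 (slot 511)
Barcode = 000610
SCSI ID from drive 7 is 6
drive 8 (addr 507) access = 1 Contains Cartridge = no
SCSI ID from drive 8 is 8
"""

HEADER_RE = re.compile(
    r"^drive (\d+) \(addr (\d+)\) access = (\d) Contains Cartridge = (yes|no)$"
)
SENSE_RE = re.compile(
    r"^Sense code = (0x[0-9a-fA-F]+), Code qualifier = (0x[0-9a-fA-F]+)$"
)
ID_RE = re.compile(r"^SCSI ID from drive (\d+) is (\d+)$")


def parse_drive_status(text):
    """Map drive number -> status fields found in the 's d' output."""
    drives = {}
    current = None  # drive whose block we are currently reading
    for raw in text.splitlines():
        line = raw.strip()
        m = HEADER_RE.match(line)
        if m:
            current = int(m.group(1))
            drives[current] = {
                "addr": int(m.group(2)),
                "access": int(m.group(3)),
                "loaded": m.group(4) == "yes",
                "sense": None,    # (code, qualifier) if reported
                "scsi_id": None,
            }
            continue
        m = SENSE_RE.match(line)
        if m and current is not None:
            drives[current]["sense"] = (int(m.group(1), 16), int(m.group(2), 16))
            continue
        m = ID_RE.match(line)
        if m and int(m.group(1)) in drives:
            drives[int(m.group(1))]["scsi_id"] = int(m.group(2))
    return drives


drives = parse_drive_status(SD_OUTPUT)
faulted = sorted(n for n, d in drives.items() if d["sense"] is not None)
clean = sorted(n for n, d in drives.items() if d["sense"] is None)

print("drives reporting a sense code:", faulted)
print("drives with no sense code:", clean)
```

On the output above, drives 1 through 6 all report sense code 0x40 with 
qualifier 0x2 and access = 0, while drives 7 and 8 report no sense code.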


Here are the logs from an attempted backup:

Nov  6 11:16:51 nb-master-01 tldcd[7433]: TLD(0) key = 0x4, asc = 0x40, 
ascq = 0x2, UNKNOWN ERROR, KEY: 0x04, ASC: 0x40, ASCQ: 0x02
Nov  6 11:16:51 nb-master-01 tldcd[7433]: TLD(0) Move_medium error
Nov  6 11:16:51 nb-master-01 tldcd[7439]: TLD(0) cannot clear drive 4 
error, drive asc=0x40, ascq=0x2
Nov  6 11:16:51 nb-master-01 tldcd[7441]: TLD(0) cannot clear drive 3 
error, drive asc=0x40, ascq=0x2
Nov  6 11:16:51 nb-master-01 tldcd[7445]: TLD(0) cannot clear 
drive 5 error, drive asc=0x40, ascq=0x2
Nov  6 11:16:51 nb-master-01 tldd[7015]: TLD(0) drive 7 (device 4) is 
being DOWNED, status: Robotic mount failure
Nov  6 11:16:51 nb-master-01 tldd[7015]: Check integrity of the drive, 
drive path, and media
Nov  6 11:16:51 nb-master-01 tldcd[7447]: TLD(0) cannot clear drive 6 
error, drive asc=0x40, ascq=0x2
Nov  6 11:16:51 nb-master-01 tldd[7015]: TLD(0) drive 4 (device 1) is 
being DOWNED, status: Robotic mount failure
Nov  6 11:16:51 nb-master-01 tldd[7015]: Check integrity of the drive, 
drive path, and media
Nov  6 11:16:51 nb-master-01 tldd[7015]: TLD(0) drive 3 (device 0) is 
being DOWNED, status: Robotic mount failure
Nov  6 11:16:51 nb-master-01 tldd[7015]: Check integrity of the drive, 
drive path, and media
Nov  6 11:16:51 nb-master-01 tldd[7015]: TLD(0) drive 5 (device 2) is 
being DOWNED, status: Robotic mount failure
Nov  6 11:16:51 nb-master-01 tldd[7015]: Check integrity of the drive, 
drive path, and media
Nov  6 11:16:51 nb-master-01 tldd[7015]: TLD(0) drive 6 (device 3) is 
being DOWNED, status: Robotic mount failure
Nov  6 11:16:51 nb-master-01 tldd[7015]: Check integrity of the drive, 
drive path, and media
Nov  6 11:16:52 nb-master-01 tldcd[7457]: TLD(0) cannot clear drive 1 
error, drive asc=0x40, ascq=0x2
Nov  6 11:16:52 nb-master-01 tldd[7015]: TLD(0) drive 1 (device 5) is 
being DOWNED, status: Robotic mount failure
Nov  6 11:16:52 nb-master-01 tldd[7015]: Check integrity of the drive, 
drive path, and media
Nov  6 11:16:52 nb-master-01 tldcd[7466]: TLD(0) cannot clear drive 2 
error, drive asc=0x40, ascq=0x2
Nov  6 11:16:52 nb-master-01 ltid[6960]: Request for media ID 000610 is 
being rejected because the media appears to be unmountable
Nov  6 11:16:52 nb-master-01 tldd[7015]: TLD(0) bad media suspected; 
configuring device 4 back UP
Nov  6 11:16:54 nb-master-01 tldcd[7476]: TLD(0) key = 0x5, asc = 0x3a, 
ascq = 0x0, MEDIUM NOT PRESENT
Nov  6 11:16:54 nb-master-01 tldcd[7476]: TLD(0) Move_medium error
Nov  6 11:16:54 nb-master-01 tldd[7015]: TLD(0) drive 7 (device 4) is 
being DOWNED, status: Unable to SCSI unload drive
Nov  6 11:16:54 nb-master-01 tldd[7015]: Check integrity of the drive, 
drive path, and media
Nov  6 11:16:55 nb-master-01 tldcd[7484]: TLD(0) cannot clear drive 2 
error, drive asc=0x40, ascq=0x2
Nov  6 11:16:55 nb-master-01 tldd[7015]: TLD(0) drive 2 (device 7) is 
being DOWNED, status: Robotic mount failure
Nov  6 11:16:55 nb-master-01 tldd[7015]: Check integrity of the drive, 
drive path, and media
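
For anyone reading wrapped syslog excerpts like the one above, here is 
a small Python sketch (illustrative only; the names are hypothetical) 
that rejoins the wrapped lines, extracts the "being DOWNED" events, and 
checks each "drive N (device M)" pair against the Drive Index / DrNum 
columns from the tpconfig -l listing.  The sample holds only a few of 
the entries above, and the index-to-DrNum table is typed in by hand from 
that listing.

```python
import re

# A few entries copied from the log above, wrapped as in the email.
LOG_TEXT = """\
Nov  6 11:16:51 nb-master-01 tldd[7015]: TLD(0) drive 7 (device 4) is
being DOWNED, status: Robotic mount failure
Nov  6 11:16:51 nb-master-01 tldd[7015]: Check integrity of the drive,
drive path, and media
Nov  6 11:16:51 nb-master-01 tldd[7015]: TLD(0) drive 4 (device 1) is
being DOWNED, status: Robotic mount failure
Nov  6 11:16:54 nb-master-01 tldd[7015]: TLD(0) drive 7 (device 4) is
being DOWNED, status: Unable to SCSI unload drive
Nov  6 11:16:55 nb-master-01 tldd[7015]: TLD(0) drive 2 (device 7) is
being DOWNED, status: Robotic mount failure
"""

# Drive Index -> DrNum, typed in from the tpconfig -l listing.
INDEX_TO_DRNUM = {0: 3, 1: 4, 2: 5, 3: 6, 4: 7, 5: 1, 7: 2}

# A new syslog entry starts with a timestamp such as "Nov  6 11:16:51 ".
TIMESTAMP_RE = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2} \d{2}:\d{2}:\d{2} ")
DOWNED_RE = re.compile(
    r"drive (\d+) \(device (\d+)\) is being DOWNED, status: (.+)$"
)


def unwrap(text):
    """Join continuation lines onto the entry that precedes them."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if TIMESTAMP_RE.match(line) or not entries:
            entries.append(line)
        else:
            entries[-1] = entries[-1] + " " + line
    return entries


def downed_events(entries):
    """Return (drive, device, status) for every DOWNED entry."""
    events = []
    for entry in entries:
        m = DOWNED_RE.search(entry)
        if m:
            events.append((int(m.group(1)), int(m.group(2)), m.group(3)))
    return events


entries = unwrap(LOG_TEXT)
events = downed_events(entries)

# A pair is consistent if tpconfig maps that device index to that drive.
mismatches = [
    (drive, device)
    for drive, device, _status in events
    if INDEX_TO_DRNUM.get(device) != drive
]

for drive, device, status in events:
    print(f"drive {drive} (device {device}): {status}")
print("pairs that disagree with tpconfig:", mismatches)
```

For the sample entries, every drive/device pair agrees with the 
tpconfig listing, so the list of mismatches comes back empty.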


Any help would be GREATLY appreciated!

Thanks!

   - Mike