TSM Cannot See Tape Drive - DRIVE03

rizhun


Hi,



Hope someone can help with this - really stuck.



I've got a backup server running on AIX v5.2 at ML 5200-02.

TSM Version 5, Release 3, Level 4.0



The server has a fibre channel connection to a SAN Data Gateway Router.

The library has a SCSI connection to the same SAN-DGR.

Green lights on all indicators.



There was a problem a couple of days ago that caused one of the drives to become unavailable.



The drive (rmt2) has been repaired.



[root@bkpsrv][/]$lsdev -Cc tape
rmt0 Available 10-70-01-0,0 IBM 3580 Ultrium Tape Drive (FCP)
rmt1 Available 10-70-01-1,0 IBM 3580 Ultrium Tape Drive (FCP)
rmt2 Available 10-70-01-2,0 IBM 3580 Ultrium Tape Drive (FCP)
rmt3 Available 10-70-01-3,0 IBM 3580 Ultrium Tape Drive (FCP)
rmt4 Available 10-89-00-5,0 LVD SCSI 4mm Tape Drive
rmt5 Defined   10-70-01     IBM 3590 Tape Drive and Medium Changer (FCP)
smc0 Available 10-70-01-6,0 IBM 3583 Library Medium Changer (FCP)



rmt0, rmt1, rmt2 and rmt3 are the drives in the library.

I'm unsure what rmt5 is and where it's come from, smc0 has always been the library.

As you can see, the four library drives and smc0 are all 'Available'.
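
A quick way to spot tape devices that are not in the Available state (rmt5 in the listing above is only Defined) is to filter the lsdev output. This is a minimal sketch, assuming the usual lsdev column layout with the device name first and the state second; the function name not_available is made up for illustration.

```shell
# List tape-class devices whose state is not "Available".
# Reads `lsdev -Cc tape` output on stdin; assumes column 1 is the
# device name and column 2 is the state (Available / Defined).
not_available() {
    awk 'NF >= 2 && $2 != "Available" { print $1, $2 }'
}

# Intended use on the AIX host (not run here):
#   lsdev -Cc tape | not_available
```

Against the listing above, this would report only rmt5.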



The 'dsmserv' daemon starts fine, and backups will run when using rmt0, rmt1 or rmt3.

/dev/rmt2 (or DRIVE03 to TSM) won't work.



A 'q drive f=d' shows the drive in an 'Unknown' state:



Library Name: ULT101

Drive Name: DRIVE03

Device Type: LTO

On-Line: Yes

Read Formats: ULTRIUMC,ULTRIUM

Write Formats: ULTRIUMC,ULTRIUM

Element: 258

Drive State: UNKNOWN

Volume Name:

Allocated to:

WWN:

Serial Number: 6811295059

Last Update by (administrator): CRAZYDAVE

Last Update Date/Time: 07/08/06 08:28:23

Cleaning Frequency (Gigabytes/ASNEEDED/NONE): NONE



A 'q path' shows the drive as on-line=no:



tsm: BKPSRV>q path

Source Name     Source Type     Destination     Destination     On-Line
                                Name            Type
-----------     -----------     -----------     -----------     -------
BKPSRV          SERVER          ULT101          LIBRARY         Yes
BKPSRV          SERVER          DRIVE01         DRIVE           Yes
BKPSRV          SERVER          DRIVE02         DRIVE           Yes
BKPSRV          SERVER          DRIVE03         DRIVE           No
BKPSRV          SERVER          DRIVE04         DRIVE           Yes



I tried to 'update path' and got:



tsm: BKPSRV>update path BKPSRV DRIVE03 SRCType=SERVER DESTType=DRIVE LIBRary=ULT101 DEVIce=/dev/rmt2 ONLine=Yes

ANR8444E UPDATE PATH: Library ULT101 is currently unavailable.

ANS8001I Return code 15.



When I saw this I checked with a 'q library f=d':



tsm: BKPSRV>q library f=d



Library Name: ULT101

Library Type: SCSI

ACS Id:

Private Category:

Scratch Category:

WORM Scratch Category:

External Manager:

Shared: Yes

LanFree:

ObeyMountRetention:

Primary Library Manager:

WWN: 2001006045162554

Serial Number: IBM7810813

AutoLabel: Yes

Reset Drives: Yes

Last Update by (administrator): CRAZYDAVE

Last Update Date/Time: 10/08/06 12:46:56



As you can see from the previous 'q path' the Library seems to be visible.

I can also run a 'tapeutil -f /dev/smc0 inventory' and get the results back.



After this I looked at the activity log and found an ominous error about the ODM:



15/08/06 22:27:19 ANR8470W Initialization failure on drive DRIVE03 in
                  library ULT101. (PROCESS: 209)
15/08/06 22:27:19 ANR7871W Unable to complete odm query. Error message from
                  odm is 0519-004 libodm: The specified search criteria is
                  incorrectly formed. Make sure the criteria contains only
                  valid descriptor names and the search values are
                  correct. . (PROCESS: 209)
15/08/06 22:27:19 ANR9999D mmsscsi.c(7858): ThreadId<23> Unable to restore
                  library drive state for ULT101. (PROCESS: 209)
15/08/06 22:27:19 ANR9999D ThreadId<23> issued message 9999 from:
                  <-0x1001ba84 outDiagf <-0x10411fdc PerformInit
                  <-0x1040a788 BeginActivity <-0x1041cd40 ScsiMountVolume
                  <-0x1033cb5c MmsMountVolume <-0x1061f8ec LtoOpen
                  <-0x1027fba8 AgentThread <-0x1000e9e0 StartThread
                  <-0xd004b57c _pthread_body (PROCESS: 209)
15/08/06 22:27:19 ANR8441E Initialization failed for SCSI library ULT101.
                  (PROCESS: 209)



Thanks in advance,

Riz.
 
If you are sure that you don't have any 3590 drives, I would delete that entry (rmdev -dl rmt5) and then run cfgmgr to add/update your existing drives. One of three things should happen: rmt2 goes back to where it should be (Available); rmt2 is deconfigured (it'll go to Defined) and a new rmt5 is added that is the old rmt2; or a new rmt5 is simply added without deconfiguring the old rmt2.



You should be able to verify that the old rmt2/new rmt5 device is the same drive by running lscfg -vl rmt5 and looking for the serial number.
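
As a rough sketch of that serial-number check, the helper below pulls the Serial Number field out of lscfg -vl output so it can be compared with the serial that TSM reports in 'q drive f=d'. It assumes the usual VPD layout, where the field name is padded with dots before the value; serial_from_lscfg is a hypothetical helper name, not an AIX command.

```shell
# Extract the "Serial Number" value from `lscfg -vl <device>` output on stdin.
# Assumes a VPD line of the form "Serial Number...............<value>".
serial_from_lscfg() {
    sed -n 's/^[[:space:]]*Serial Number\.*//p' | head -n 1
}

# Intended use on the AIX host (not run here):
#   for d in rmt0 rmt1 rmt2 rmt3; do
#       echo "$d $(lscfg -vl "$d" | serial_from_lscfg)"
#   done
```

The device whose serial matches the one TSM holds for DRIVE03 is the one its path should point at.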



What is your ATape level? A newer ATape version might help as well.



-Aaron
 

Thanks for the reply.



OK, I've rmdev'd rmt5. Definitely no 3590 drives.

When I ran the cfgmgr, I got the following errors:



[root@bkpsrv][/]$rmdev -dl rmt5
rmt5 deleted
[root@bkpsrv][/]$cfgmgr
Method error (/etc/methods/cfgtsmdd -l mt0 ):
        0514-051 Device to be configured does not match the physical
        device at the specified connection location.
Method error (/etc/methods/cfgtsmdd -l mt1 ):
        0514-051 Device to be configured does not match the physical
        device at the specified connection location.
Method error (/etc/methods/cfgtsmdd -l mt2 ):
        0514-051 Device to be configured does not match the physical
        device at the specified connection location.
Method error (/etc/methods/cfgtsmdd -l mt3 ):
        0514-051 Device to be configured does not match the physical
        device at the specified connection location.
[root@bkp101][/]$



What are these mt devices?

A bit of a coincidence that there are four of them?!



My Atape.driver is at level 6.1.6.0.



Also, now that several backups have attempted to run overnight, I should add that none of the tape drives appear to be working.



I have also just seen a Migration process start, only to fail with 'Mount request denied' errors:



tsm: BKPSRV>q proc

 Process     Process Description      Status
  Number
--------     --------------------     -------------------------------------------------
     537     Migration                Disk Storage Pool BACKUPPOOL, Moved Files: 0,
                                       Moved Bytes: 0, Unreadable Files: 0, Unreadable
                                       Bytes: 0. Current Physical File (bytes):
                                       1,310,720 Waiting for mount of output volume
                                       PRO472 (23 seconds).



-- snippet from 'q actlog' process 537--



16/08/06 06:28:32 ANR0984I Process 537 for MIGRATION started in the
                  BACKGROUND at 06:28:32. (PROCESS: 537)
16/08/06 06:28:32 ANR1000I Migration process 537 started for storage pool
                  BACKUPPOOL automatically, highMig=0, lowMig=0,
                  duration=No. (PROCESS: 537)
16/08/06 06:28:59 ANR1401W Mount request denied for volume PRO249 - mount
                  failed. (PROCESS: 537)
16/08/06 06:29:25 ANR8470W Initialization failure on drive DRIVE03 in
                  library ULT101. (PROCESS: 537)
16/08/06 06:29:25 ANR7871W Unable to complete odm query. Error message from
                  odm is 0519-004 libodm: The specified search criteria is
                  incorrectly formed. Make sure the criteria contains only
                  valid descriptor names and the search values are
                  correct. . (PROCESS: 537)
16/08/06 06:29:25 ANR9999D mmsscsi.c(7858): ThreadId<24> Unable to restore
                  library drive state for ULT101. (PROCESS: 537)
16/08/06 06:29:25 ANR9999D ThreadId<24> issued message 9999 from:
                  <-0x1001ba84 outDiagf <-0x10411fdc PerformInit
                  <-0x1040a788 BeginActivity <-0x1041cd40 ScsiMountVolume
                  <-0x1033cb5c MmsMountVolume <-0x1061f8ec LtoOpen
                  <-0x1027fba8 AgentThread <-0x1000e9e0 StartThread
                  <-0xd004b57c _pthread_body (PROCESS: 537)
16/08/06 06:29:25 ANR8441E Initialization failed for SCSI library ULT101.
                  (PROCESS: 537)
16/08/06 06:29:25 ANR1401W Mount request denied for volume PRO472 - mount
                  failed. (PROCESS: 537)



Please help.



Thanks..
 
An mt device is a tape device that AIX doesn't have the drivers for (non-IBM 4mm, 8mm, etc.). It looks like AIX can see a device at the location but thinks it doesn't have the drivers for it (ATape provides the drivers, along with devices.scsi.tape/devices.fcp.tape).



You can also have a TSM tape drive defined as an mt device. With AIX and ATape, you don't need to define the drive as a TSM drive within AIX.



What you can try is to record the physical serial number of each drive, remove all the rmt/mt entries, and then run cfgmgr to add them back. If you don't get any method errors, look at each drive to get its serial number (the order may have changed), update the path with the new rmt device number, and then turn the path online. If you get any more method errors, it's time to call IBM, as something on the AIX system isn't right (it can't read the drives anymore).
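
The steps above can be sketched as a dry run that only prints the commands, so they can be reviewed before anything is actually removed. Treat it as an illustration under assumptions rather than a tested procedure: plan_rediscovery and tsm_path_update are made-up helper names, the input is assumed to be a device listing such as lsdev -Cc tape output, and the update path syntax is simply copied from the command used earlier in this thread.

```shell
# Print (do not run) the AIX commands for the remove-and-rediscover step.
# Reads a device listing on stdin and keeps only names like rmtN or mtN,
# so the medium changer (smc0) is left alone. Note that this matches every
# rmt/mt entry in the input, including drives that are not in the library.
plan_rediscovery() {
    awk '$1 ~ /^r?mt[0-9]+$/ { print "rmdev -dl " $1 }'
    echo "cfgmgr"
}

# Print the TSM command that repoints a drive path and brings it online.
#   $1 = TSM drive name, $2 = AIX device name matched by serial number
tsm_path_update() {
    echo "update path BKPSRV $1 SRCType=SERVER DESTType=DRIVE LIBRary=ULT101 DEVIce=/dev/$2 ONLine=Yes"
}

# Intended use on the AIX host (not run here):
#   lsdev -Cc tape | plan_rediscovery
#   tsm_path_update DRIVE03 rmt2
```

Printing the commands first leaves room to check the serial numbers by hand before touching the ODM entries.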



-Aaron
 

Thanks heda - sorted.

I removed all the mt/rmt devices, updated ATape and ran cfgmgr.

It picked up all the correct drives which let me get all the paths fixed in TSM.



Online and running backups ;)



Thanks again.

Riz.
 