TSM Asking for Checkin of Offsite Tapes

JNewton

Hey all,

I recently upgraded from LTO1 tape drives to LTO3 tape drives. After completion, everything looked great until I tried running Space Reclamation on my Offsite pool (a Copypool that uses DRM to go offsite). Whenever reclamation runs on the offsite pool, it requests that the tape be checked in, then once the 5-minute timeout is reached, the tape goes unavailable. It does this with every tape in the Offsite pool. Prior to upgrading, I was always able to run reclamation on the offsite pool after expiring with no issue. Do I really need to bring all of my offsite tapes back on site just to run space reclamation and get them migrated out?

I tried running an audit library but it fails every time.

I have tried doing a checkin libvol with search set to yes but that checks in 0 volumes.

Does anyone have any thoughts?

I am using a 3584 Library.
 
-------------------- ----------------------------------------------------------
05/04/2009 17:53:54 ANR8941W The volume from slot-element 1096 in drive DRIVE1
(mt0.0.0.2) in library 3584LIB is blank. (SESSION: 25654,
PROCESS: 172)

05/04/2009 17:58:59 ANR8944E Hardware or media error on drive DRIVE4
(mt3.0.0.2) with volume (OP=TESTREADY, Error Number= 23,
CC=0, KEY=03, ASC=53, ASCQ=00,
SENSE=70.00.03.00.00.00.00.58.00.00.00.00.53.00.36.00.2E-
.02.FF.E7.00.02.20.20.20.20.20.20.20.00.00.00.45.98.5F.F-
9.00.00.00.00.00.00.00.00.00.00.00.00.00.00.12.02.00.00.-
00.00.00.00.00.60.00.00.00.00.70.00.03.00.00.00.00.58.00-
.00.00.00.53.00.36.00.2E.02.FF.E7.00.02.20.20.20.20.20.2-
0.00.00.00, Description=An undetermined error has
occurred). Refer to Appendix D in the 'Messages' manual
for recommended action. (SESSION: 25654, PROCESS: 172)

05/04/2009 17:58:59 ANR8304E Time out error on drive DRIVE4 (mt3.0.0.2) in
library 3584LIB. (SESSION: 25654, PROCESS: 172)
05/04/2009 17:58:59 ANR8912E Unable to verify the label of volume from
slot-element 1087 in drive DRIVE4 (mt3.0.0.2) in library
3584LIB. (SESSION: 25654, PROCESS: 172)

05/04/2009 17:58:59 ANR8302E I/O error on drive DRIVE4 (mt3.0.0.2) with volume
(OP=OFFL, Error Number=21, CC=0, KEY=02, ASC=04,
ASCQ=02, SENSE=70.00.02.00.00.00.00.58.00.00.00.00.04.02-
.37.00.10.12.00.00.00.02.20.20.20.20.20.20.20.00.00.00.4-
6.D0.60.06.00.00.00.00.00.00.00.00.00.00.00.00.00.00.12.-
02.00.00.00.00.00.00.00.60.00.00.00.00.70.00.02.00.00.00-
.00.58.00.00.00.00.04.02.37.00.10.12.00.00.00.02.20.20.2-
0.20.20.20.00.00.00, Description=An undetermined error
has occurred). Refer to Appendix D in the 'Messages'
manual for recommended action. (SESSION: 25654, PROCESS:
172)

05/04/2009 17:58:59 ANR8948S Device mt3.0.0.2, volume unknown has issued the
following Critical TapeAlert: Your data is at risk: 1.
Copy any data you require from this tape. 2. Do not use
this tape again. 3. Restart the operation with a
different tape. (SESSION: 25654, PROCESS: 172)

05/04/2009 17:58:59 ANR8949E Device mt3.0.0.2, volume unknown has issued the
following Critical TapeAlert: The operation has failed:
1. Eject the tape or magazine. 2. Restart the operation.
(SESSION: 25654, PROCESS: 172)
05/04/2009 17:58:59 ANR8948S Device mt3.0.0.2, volume unknown has issued the
following Critical TapeAlert: The operation has failed
because the media cannot be loaded and threaded. 1.
Remove the cartridge, inspect it as specified in the
product manual, and retry the operation. 2. If the
problem persists, call the tape drive supplier help line.
(SESSION: 25654, PROCESS: 172)

05/04/2009 17:59:24 ANR8460E AUDIT LIBRARY process for library 3584LIB failed.
(SESSION: 25654, PROCESS: 172)

05/04/2009 17:59:24 ANR0985I Process 172 for AUDIT LIBRARY running in the
BACKGROUND completed with completion state FAILURE at
17:59:24. (SESSION: 25654, PROCESS: 172)
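
The KEY, ASC and ASCQ values in the ANR8944E and ANR8302E messages above are SCSI sense data, so they can be read without the 'Messages' manual. Below is a minimal Python sketch that pulls those three fields out of a message and maps them to the standard SCSI names. The function name `decode_sense` and both lookup tables are illustrative assumptions, not part of TSM; the ASC/ASCQ table only covers the two pairs that appear in this log.

```python
import re

# Standard SCSI sense key names (subset of the SPC table).
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x1: "RECOVERED ERROR",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x4: "HARDWARE ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
    0x7: "DATA PROTECT",
    0x8: "BLANK CHECK",
}

# Only the two ASC/ASCQ pairs seen in this log; NOT a complete table.
ASC_ASCQ = {
    (0x53, 0x00): "MEDIA LOAD OR EJECT FAILED",
    (0x04, 0x02): "LOGICAL UNIT NOT READY, INITIALIZING COMMAND REQUIRED",
}

# \s* lets the pattern match even when the log wraps between fields.
FIELD_RE = re.compile(
    r"KEY=([0-9A-Fa-f]{2}),\s*ASC=([0-9A-Fa-f]{2}),\s*ASCQ=([0-9A-Fa-f]{2})"
)


def decode_sense(message):
    """Return (sense key name, ASC/ASCQ text) for one message, or None."""
    m = FIELD_RE.search(message)
    if m is None:
        return None
    key, asc, ascq = (int(g, 16) for g in m.groups())
    return (
        SENSE_KEYS.get(key, "UNKNOWN"),
        ASC_ASCQ.get((asc, ascq), "not in table"),
    )
```

Read this way, the DRIVE4 errors are a medium error on load followed by a not-ready drive, which points at the cartridge in that slot or at the drive rather than at the TSM database.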
 
Looks as if TSM is having issues reading the internal label. This could be caused by the media or by the tape drive.

Did you tell the audit library command to use barcodes (checklabel=barcode)?
 
I have tried auditing the library using barcodes numerous times, but every time I do, it completes immediately after I enter the command with success. It doesn't go through and check the barcodes. The only way that I can get it to actually attempt to audit is by auditing the library by reading volumes.
 
Is it failing on the first tape it mounts? Are other processes working and mounting tapes?

If all other processes seem to be working and you can identify that the tape in slot-element 1087 is causing the issue, then audit that volume, or mark the tape as destroyed and rebuild it from the offsite volumes.

If all other processes are seeing similar errors and it is failing on the first tape mount, it is most likely a communication issue with the library. Did you update the IBM driver when upgrading TSM? You can delete all paths, drives and the library and redefine them. After you delete the library you will need to check in all volumes again, private volumes first.

Also, power cycling the instance and the library has solved weird issues for me.
 
I believe doing the audit with the barcode option essentially accesses the list from the library, as opposed to actually scanning everything in the library physically (at least that is how it acts with our 3584 library). Perhaps there is a way to prompt the library itself to audit the tapes inside. Although, it sounds as if you need to take a physical look at the tape in that element.
 
It ran from 5:45 to 5:53 (time of first error). So it seems to have gotten through a few tapes before getting the error. I will audit the library again with me up there watching to see if I can see which tape it errors out on, since my library doesn't seem to show which slot element a tape is in. Thanks for the help thus far.
 
You should be able to identify the tape in slot 1087 from the front panel of the library. At least, most libraries let you do that, or provide interface software that allows you to.
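
To avoid standing at the library during the audit, the failing slot-elements can also be collected from the activity log text itself. Below is a minimal Python sketch, assuming the log has been saved as plain text in the wrapped format shown earlier in this thread. The function name `failing_slots` and the choice of message IDs to watch (ANR8912E and ANR8941W) are assumptions for illustration.

```python
import re

# One activity-log message: the ANRnnnnX id, then everything up to the next id.
MSG_RE = re.compile(r"(ANR\d{4}[IWES])\s(.*?)(?=ANR\d{4}[IWES]\s|\Z)", re.S)

# \s+ tolerates the line wraps that dsmadmc inserts inside a message.
SLOT_RE = re.compile(r"slot-element\s+(\d+)\s+in\s+drive\s+(\w+)")


def failing_slots(actlog_text, codes=("ANR8912E", "ANR8941W")):
    """Return (message id, slot-element, drive) for each matching message."""
    results = []
    for code, body in MSG_RE.findall(actlog_text):
        if code not in codes:
            continue
        m = SLOT_RE.search(body)
        if m:
            results.append((code, int(m.group(1)), m.group(2)))
    return results
```

Run against the log posted above, this would report slot-element 1096 (blank volume in DRIVE1) and slot-element 1087 (label could not be verified in DRIVE4), which are the two slots to look up on the library panel.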
 
I watched the audit library run and weeded out all of the tapes that were causing the process to fail. The audit library now completes successfully, but space reclamation on the offsite pool is still requesting that the tapes be checked in. I tried checking in the tape when it is requested, but it still does the same thing:

ANR8431I CHECKIN LIBVOLUME process completed for library 3584LIB; 0 volume(s)
found.
ANR0985I Process 220 for CHECKIN LIBVOLUME running in the BACKGROUND completed
with completion state SUCCESS at 13:13:24.

I know that the tapes are offsite, but I thought you were still supposed to be able to reclaim space while they are gone, so that the tapes can be marked as scratch and start the rotation over again.
 
Are you sure that the tapes being asked to be checked in are offsite tapes and not from the online tape pool?
 
Check your volhist to see the status of the requested tapes and where they should be (offsite/onsite/etc.).


Mike
 
Are you sure your DEVCLASS is correct, after your upgrade from LTO1 to LTO3? Run the following command for one of your tape drives, and see what you get:

q dr your_library_name one_of_your_drives f=d

The output should look like this (if your DEVCLASS is correct for LTO3 drives):

Library Name: your_library_name
Drive Name: one_of_your_drives
Device Type: LTO
On-Line: Yes
Read Formats: ULTRIUM3C,ULTRIUM3,ULTRIUM2C,ULTRIUM2,ULTRIUMC,ULTRIUM
Write Formats: ULTRIUM3C,ULTRIUM3,ULTRIUM2C,ULTRIUM2

Remember, LTO3 tape drives can only READ from LTO1 tapes. They cannot WRITE to them.
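
The read-only generations can be worked out directly from that output by comparing the two format lists. Below is a minimal Python sketch, assuming the `q dr ... f=d` output has been captured as text with the `Read Formats:` and `Write Formats:` labels shown above. The helper names `parse_formats` and `read_only_formats` are hypothetical.

```python
def parse_formats(output, label):
    """Return the comma-separated values that follow 'label:' in the output."""
    for line in output.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == label.lower():
            return [f.strip() for f in value.split(",") if f.strip()]
    return []


def read_only_formats(output):
    """Formats the drive can read but not write, in the order they are listed."""
    writable = set(parse_formats(output, "Write Formats"))
    return [f for f in parse_formats(output, "Read Formats") if f not in writable]
```

For the sample output above, this leaves ULTRIUMC and ULTRIUM, the two LTO1 formats, as read-only.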
 
Do you have any tapes that are UNAVAILABLE?

> q vol access=unavailable

If some of your onsite tapes are unavailable then TSM will need the offsite tapes to reclaim offsite tapes.

I have also seen cases where, for some reason, TSM wants the offsite tape to restore data, and running an audit on the onsite volume sorted the issue out. So query the contents of an offsite volume that is failing reclamation and find the onsite tape that contains that data. Then run an audit volume on the onsite tapes and see if this helps.
 
We currently have no tapes unavailable in our onsite storage pool. We just have the offsite tapes that go unavailable when I try to run reclamation processing.

Dennis: Yes, all of that looks right. I have my LTO1s set as read-only so that the LTO3 drives do not attempt to write to them. TSM will only write to the LTO3 tapes that I put in.

I think I may know what the problem is, however. When I upgraded the drives from LTO1 to LTO3, I deleted all of the paths, drives, etc. and stupidly decided to delete the library too, to make sure nothing would be left over from the LTO1 drives that could cause trouble with my upgrade. I am a bit of an amateur with TSM, admittedly. After bringing everything back online, I had no volumes checked in. So I went through and did a checkin with search=yes and got all of the tapes checked back in, but this only covered the tapes in the library. When I try this with the offsite volumes, the search comes back without finding any volumes, because they are not physically in the library.

So the tapes that are offsite are no longer checked in, and cannot be checked in, because I deleted the library and recreated it, correct?

If this is the case, is there any way possible (such as restoring a volhist file or entering volumes into it) to not have to bring back all of the offsite tapes and check them in all over again?

If I do have to bring them all back and manually check them in again, would that mess up DRM, considering that TSM has the volumes listed as VAULT?

Would I need to assign all of the DRM tapes as onsite and restart the DRM process all over again?
 