Orphaned tapes?

waelti

Hi,
at the moment we have 1210 LTO4 tapes overall. I think this is too many for our backup/archive.
We have two servers, one for backup and one for archiving. We run space reclamation every day.
When I do a "q vol" and a "q drm" on both servers, generate a tape list, and compare it with our list of used tapes, I get 838 tapes that don't appear in "q vol" or "q drm".
Are these orphaned tapes, or is there data on them that I need? Is there a command that shows all tapes known to the database that hold valid data?
Can I use these tapes as scratch again?
If these tapes are unused, how can this happen?

Kind regards
Walter Reis
 
To verify that the 838 tapes are really orphaned, get the list of all tapes per stgpool and tally it against the results from 'q vol' and 'q drm'.

If you are 100% sure that none of the 838 tapes shows up in any stgpool tape listing, 'q vol' or 'q drm', then they are all orphaned and can be used as scratch.
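
The tally itself is easy to script once the lists are exported. Here is a minimal Python sketch, assuming each list is already reduced to plain volume names; the function name and the sample volumes are made up for illustration:

```python
def find_orphan_candidates(inventory, *known_lists):
    """Return inventory volumes that appear in none of the known lists."""
    known = set()
    for volumes in known_lists:
        # Normalise so 'a00003 ' and 'A00003' compare equal.
        known.update(v.strip().upper() for v in volumes if v.strip())
    candidates = {v.strip().upper() for v in inventory if v.strip()}
    return sorted(candidates - known)


# Hypothetical sample data: physical inventory vs. 'q vol' and 'q drm' output.
inventory = ["A00001", "A00002", "A00003", "A00004"]
q_vol = ["A00001"]
q_drm = ["a00003 "]

orphans = find_orphan_candidates(inventory, q_vol, q_drm)
print(orphans)
```

Whatever it prints is only a list of candidates. Pass in the lists from both servers (and the per-stgpool listings) as additional arguments before treating any tape as scratch.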

How can this happen? No one (assuming you did not administer this environment before) recalled the tapes when they became 'vault retrieve'. This happened at my old company: 730 tapes were recalled as scratch.
 
Hi moon-buddy,
every week I do the following:
"q drm wherestate=vaultretrieve", then
"move drm * wherestate=vaultretrieve tostate=onsiteretrieve",
then I check those tapes in with "checkin libv lto4lib search=bulk checklabel=barcode status=scratch".
Isn't that enough?
 
Hi there

Orphaned tapes can be tricky to manage without a proper process, and the result can be unwanted deletion of the data on them. I call them orphaned or "black hole" tapes.

Either they were not labelled at the beginning of the process, or they have some issue that keeps TSM from recognizing them (there are too many possible reasons to explain here).

Here is, "simply", what can be done to recover those tapes.

FIRST: Ensure that the tapes don't belong to any stgpool or its copy by comparing the output of "q libv <library>" on both (or all) libraries.
If they are in neither the primary library nor its copy, you can follow the process below.

My advice is to work at the tape level, although you can easily script it, as I do. The idea is to know clearly what happens to EACH and EVERY tape, so that any error can be easily identified and fixed. Working in batches could lead to an unwanted state, i.e. losing data.

OPTION 1: Tapes are free and neither used nor duplicated.
1: checkout libv <library> {tape} remove=no checklabel=no
2: label libvolume <library> search=yes labels=barcode checkin=scratch overwrite=yes vollist={tape}

OPTION 2: Tapes are free but either used or duplicated.
1: Search for duplicated tapes: q actlog begint=now-01:00 search='ANR8808E' (adjust the time accordingly)

This tells you which tape it is duplicated with. For example, for tape A0000 the actlog message will be something like:
"
ANR8808E Could not write label A0000 on the volume in drive rmt1 (/dev/rmt1) of library <library> because that volume is already labeled with A12345 which is still defined in a storage pool or volume history. (SESSION: xxxx, PROCESS: xxxx)
"
Then we need to move the data off A12345:

2: move data A12345 wait=yes

Then remove both tapes from the library:

3: checkout libv <library> remove=no checklabel=no vollist=A0000,A12345

Then delete the entry in the volhistory to ensure the tapes are free from the primary library and its copy:

4: delete volhistory todate=TODAY type=REMOTE volume=A0000 force=yes

Now we can safely label our tapes and make them available to TSM:

5: label libvolume <library> search=yes labels=barcode checkin=scratch overwrite=yes vollist=A0000,A12345

That's it. Now restart from the beginning for the rest of the tapes.
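
To give an idea of how the OPTION 2 loop can be scripted, here is a rough Python sketch. It assumes the ANR8808E wording quoted above (other server levels may word it differently), uses LTO4LIB as a placeholder library name, and only prints the commands for steps 2 to 5 so you can review them before running anything. The function names are made up.

```python
import re

# Pattern based on the ANR8808E wording quoted above (an assumption).
ANR8808E = re.compile(
    r"ANR8808E Could not write label (\S+) on the volume .*? "
    r"already labeled with (\S+)"
)


def duplicated_pairs(actlog_lines):
    """Yield (barcode_label, existing_label) for each ANR8808E line."""
    for line in actlog_lines:
        match = ANR8808E.search(line)
        if match:
            yield match.group(1), match.group(2)


def recovery_commands(library, wanted, existing):
    """Return steps 2 to 5 above, in order, for one pair of tapes."""
    pair = f"{wanted},{existing}"
    return [
        f"move data {existing} wait=yes",
        f"checkout libv {library} remove=no checklabel=no vollist={pair}",
        f"delete volhistory todate=TODAY type=REMOTE volume={wanted} force=yes",
        f"label libvolume {library} search=yes labels=barcode "
        f"checkin=scratch overwrite=yes vollist={pair}",
    ]


# Sample actlog line built from the message quoted above.
sample = [
    "ANR8808E Could not write label A0000 on the volume in drive rmt1 "
    "(/dev/rmt1) of library LTO4LIB because that volume is already labeled "
    "with A12345 which is still defined in a storage pool or volume history."
]

pairs = list(duplicated_pairs(sample))
for wanted, existing in pairs:
    for command in recovery_commands("LTO4LIB", wanted, existing):
        print(command)
```

Printing instead of executing keeps a manual checkpoint per tape, which matches the advice to handle one tape at a time.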

I've written a script for our big environment, in which I left some manual intervention to keep control over it. For smaller environments, however, it can be fully scripted.

Keep in mind that this is probably not the best process, but it ensures that no data is lost, which is the main purpose of our job.

Hope this helps.
 
Technically, if you move tapes offsite daily and run reclamation daily, you should have lots of tapes going into Vault Retrieve state on a daily basis.

One scenario:

Some admins insist on running 'del volhist' even when DRM is in use. Running 'del volhist' sends more tapes into the orphaned pool if they were not processed when they became vault retrieve. I have seen this before.

Another scenario:

Scripts run automatically to list the tapes that need to be retrieved, but no one takes note of the output. After some time, another script runs to move them to 'onsite retrieve'. Since the tapes were never noted, they never get recalled. Again, I have seen this happen.
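
One way to guard against the second scenario is to keep your own record of which tapes were moved out of vault retrieve and when, and to flag any that have not come back to the library after a while. A rough Python sketch, where the log format, the 14-day limit and the volume names are all assumptions for illustration:

```python
from datetime import date


def overdue_recalls(recall_log, library_volumes, today, max_days=14):
    """Return (volume, days_outstanding) for recalled tapes not back in the library."""
    in_library = {v.strip().upper() for v in library_volumes}
    overdue = []
    for volume, recalled_on in recall_log.items():
        days = (today - recalled_on).days
        if volume.upper() not in in_library and days > max_days:
            overdue.append((volume.upper(), days))
    return sorted(overdue)


# Hypothetical record: volume -> date it was moved to onsite retrieve.
recall_log = {
    "A00010": date(2024, 1, 1),
    "A00011": date(2024, 1, 1),
    "A00012": date(2024, 1, 20),
}
library_volumes = ["A00010"]  # e.g. taken from 'q libv' output

late = overdue_recalls(recall_log, library_volumes, today=date(2024, 1, 29))
print(late)
```

Any tape this reports was released on paper but never physically checked in, which is exactly how a tape ends up orphaned.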
 