TSM Tape expiry issue

Status
Not open for further replies.

prashant185

Hi Friends,

I am new to TSM (about 4 weeks in). I am an Oracle DBA, so I understand backups and SQL. I have been given the new responsibility of managing tape backups and tape rotation for a 3583 library.
The library can hold 60 tapes. We back up filesystem files (Windows and Unix) and RMAN backup files to tape. I have seen tapes holding all kinds of backup pieces. Retention for non-database backups is 30 days.

Query 1: Some tapes have been offsite in the Vault state for the last 3 months, but the library still has not asked for them to be brought back onsite. The LAST_WRITE_DATE value on these tapes is very old. Why is the library not requesting these tapes? Is there a way to force the tapes to expire and move through the correct staging states? When I check the contents of these tapes, most of them hold filesystem backup pieces.

Query 2: Tapes fill to 100%, but the library still does not check them out to go offsite for several days. Why? Is there a way to force the library to send a tape offsite as soon as it is 100% full?

Here is some relevant information.

TSM Server Version 5, Release 4, Level 0.0

STGPOOL_NAME          POOLTYPE  DEVCLASS        RECLAIM
--------------------  --------  --------------  -------
CERN_DISK01_ARCH_SP   PRIMARY   DISK
CERN_DISK01_SP        PRIMARY   DISK
CERN_TAPE01_ARCH_CSP  COPY      CERN_TAPE01_DC  100
CERN_TAPE01_ARCH_SP   PRIMARY   CERN_TAPE01_DC  100
CERN_TAPE01_CSP       COPY      CERN_TAPE01_DC  100
CERN_TAPE01_SP        PRIMARY   CERN_TAPE01_DC  100


q drmstatus

Recovery Plan Prefix: /dr/tsm/drm/drmplan
Plan Instructions Prefix: /dr/tsm/drm/
Replacement Volume Postfix: @
Primary Storage Pools:
Copy Storage Pools:
Not Mountable Location Name: NOTMOUNTABLE
Courier Name: COURIER
Vault Site Name: VAULT
DB Backup Series Expiration Days: 4 Day(s)
Recovery Plan File Expiration Days: 30 Day(s)
Check Label?: Yes
Process FILE Device Type?: No
Command File Name:

Any tape can be reused as a scratch tape once its data has expired.

Volumes off-site:
Tape ID  Status  Date      Time      Library
-------  ------  --------  --------  -------
A00033   Vault   06/27/09  08:35:57  -
A00036   Vault   02/07/09  08:32:43  -
A00037   Vault   02/20/09  08:39:36  -
A00038   Vault   02/20/09  08:39:36  -
A00041   Vault   12/31/08  08:44:21  -
A00042   Vault   03/19/09  10:40:39  -
A00043   Vault   02/22/09  00:55:16  -
A00044   Vault   03/18/09  08:34:21  -

Thanks
Prashant
 
For one, it does not seem that your DRM is set up correctly: "Primary Storage Pools" and "Copy Storage Pools" are empty, unless you are performing these steps manually. Below is my drmstatus.
What could be the reason library not asking these few tapes? is there way to force the library to expire the tapes with correct staging states. When i check the contents of these tapes, most of those are having filesystem backup pieces.
The reason, as you stated, is that the tapes still have data on them. The system will not request tapes that still hold data. You could perform a reclamation on that storage pool and send the new tapes to the vault.

Query 2 : The tapes getting filled 100% still library not checking-out to send offsite for few days. why? is there way to force library to send the tape offsite as soon as it gets 100% full.
Look into the "move drmedia" command (usually abbreviated "move drm").

Do you have an automated command set up to copy your primary storage pool to a copy storage pool?

It would be worth reading Chapter 24, "Using Disaster Recovery Manager", in the TSM Admin Guide.
http://publib.boulder.ibm.com/infoc....jsp?topic=/com.ibm.itsmcw.doc/anrwgd5502.htm


>q drmstatus

Recovery Plan Prefix: /tsm/drm/plan
Plan Instructions Prefix: /tsm/drm/instructions
Replacement Volume Postfix: d
Primary Storage Pools: BIGDISK
Copy Storage Pools: OFFSITE
Not Mountable Location Name: NOTMOUNTABLE
Courier Name: COURIER
Vault Site Name: Archives Security, Inc.
DB Backup Series Expiration Days: 7 Day(s)
Recovery Plan File Expiration Days: 15 Day(s)
Check Label?: No
Process FILE Device Type?: Yes
Command File Name:
 
If you don't specify primary and copy pools in DRM, TSM's DRM module will process all primary and copy pools. Specify only if you don't need all of them managed.
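
For reference, a minimal sketch of how the two empty fields in the q drmstatus output could be filled in, assuming you only want DRM to manage the non-archive pools named earlier in this thread. Which pools belong in the lists is an assumption; leave the fields empty to keep the process-everything default. The /* */ comments are macro-file syntax and can be dropped when typing the commands interactively.

```
/* assumption: restrict DRM to the non-archive pools from this thread */
set drmprimstgpool cern_disk01_sp,cern_tape01_sp
set drmcopystgpool cern_tape01_csp

/* confirm what DRM will now process */
query drmstatus
```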
 
Hi onealsteel.

Thanks a lot.

Yes, it seems that a previous colleague wrote some scripts to perform these steps manually and scheduled the job inside TSM. It runs every morning at 8:30.

The command backup stgpool <primary_pool> <copy_pool> is also run by those scripts.

Meanwhile, I will go through the DRM chapter.

Our retention is 30 days, so ideally all data on offsite tapes that have been lying there for more than a month should have expired. I suspect something is wrong with the retention settings. How do I verify that retention is correctly set to 30 days?

See if the following information helps to explain it.

tsm: TSM1_SRV>select * from domains
DOMAIN_NAME: CERN_HNAM_DM
SET_LAST_ACTIVATED: CERN_HNAM_PS
ACTIVATE_DATE: 2009-02-23 22:54:56.000000
DEFMGMTCLASS: CERN_HNAM_DISK_MC
NUM_NODES: 21
BACKRETENTION: 90
ARCHRETENTION: 90
DESCRIPTION: Cerner HNAM Domain.
CHG_TIME: 2009-02-23 22:54:56.000000
CHG_ADMIN: ADMIN
PROFILE:
ACTIVESTGPOOLS:

DOMAIN_NAME: CERN_PACS_DM
SET_LAST_ACTIVATED: CERN_PACS_PS
ACTIVATE_DATE: 2005-11-14 15:43:08.000000
DEFMGMTCLASS: CERN_PACS_TAPE_MC
NUM_NODES: 0
BACKRETENTION: 90
ARCHRETENTION: 90
DESCRIPTION: Cerner PACS Domain.
CHG_TIME: 2005-11-14 15:43:08.000000
CHG_ADMIN: ADMIN
PROFILE:
ACTIVESTGPOOLS:

tsm: TSM1_SRV>q drmsdtatus
ANR2000E Unknown command - QUERY DRMSDTATUS.
ANS8001I Return code 2.
tsm: TSM1_SRV>q drmstatus
Recovery Plan Prefix: /dr/tsm/drm/drmplan
Plan Instructions Prefix: /dr/tsm/drm/
Replacement Volume Postfix: @
Primary Storage Pools:
Copy Storage Pools:
Not Mountable Location Name: NOTMOUNTABLE
Courier Name: COURIER
Vault Site Name: VAULT
DB Backup Series Expiration Days: 4 Day(s)
Recovery Plan File Expiration Days: 30 Day(s)
Check Label?: Yes
Process FILE Device Type?: No
Command File Name:

Thanks
Prashant
 
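BACKRETENTION is not what you think it is. It is only a grace period: how long backup versions of files that have been removed from the client remain in the system (available for restore) when no backup copy group applies to them. Your normal retention comes from the backup copy group of the management class.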

command > h update domain

The above command will explain BACKRETENTION in more detail.
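
As a sketch of where to look instead: the retention that normally applies is held in the backup copy group of each management class, so listing the copy groups of the active policy sets should show whether 30 days is really what is configured. The column names below are from the BU_COPYGROUPS table; which of them carries your 30 days is something to check against your own output, not something this thread confirms. The /* */ comments are macro-file syntax.

```
/* backup copy groups of every active policy set, with all retention fields */
query copygroup * active * type=backup format=detailed

/* the same information as a compact table */
select domain_name, class_name, verexists, verdeleted, retextra, retonly, destination from bu_copygroups where set_name='ACTIVE'
```

Keep in mind that versions past their retention are only removed from tapes' contents when expiration processing (expire inventory) runs, so it is worth checking that one of the scheduled scripts runs it.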

It depends on how your site is set up. I have run it two different ways at my site: one where the off-site tapes did not return until the data expired, and one where we ran a reclamation on the off-site storage pool so the tapes would return when we wanted them.
 
Hi onealsteel,

You suggested looking at the move drm command to move the 100% full volumes to the next stage. But move drm can be used only on copy pool volumes, and at the moment these 100% full tapes are in the primary pool.

Today a total of 10 tapes in the primary pool are 100% full in the library. How do I push them to the next stage?

Thanks
Prashant
 
BACKUP of pri_pool to copy_pool failed. How to fix?

Hi Friends,

Backup from primary to copy pool failed with following error.

tsm: TSM1_SRV>BACKUP STGPOOL cern_tape01_sp cern_tape01_csp MAXPROCESS=1 PREVIEW=NO WAIT=YES
ANR0984I Process 415 for BACKUP STORAGE POOL started in the FOREGROUND at 13:38:22.
ANR2110I BACKUP STGPOOL started as process 415.
ANR1210I Backup of primary storage pool CERN_TAPE01_SP to copy storage pool CERN_TAPE01_CSP started as process 415.
ANR1221E BACKUP STGPOOL: Process 415 terminated - insufficient space in target storage pool CERN_TAPE01_CSP.
ANR0985I Process 415 for BACKUP STORAGE POOL running in the FOREGROUND completed with completion state FAILURE at 13:39:13.
ANR1214I Backup of primary storage pool CERN_TAPE01_SP to copy storage pool CERN_TAPE01_CSP has ended. Files Backed Up: 0, Bytes Backed Up: 0, Unreadable Files: 0, Unreadable Bytes: 0.
ANS8001I Return code 4.

The library holds 60 tapes at a time; 59 slots are now in use and 1 is empty.

According to the TSM database:
Primary pool = 196 tapes
Copy pool = 59 tapes

MAXSCRATCH is not the issue.

There are 10 tapes in the primary pool that are 100% full and available in the library.

It seems that all my copy pool volumes are showing access='OFFSITE'. I don't understand how this command can work if the copy pool volumes are always offsite.

How do I fix this?
 
Hi onealsteel,

You suggested looking at the move drm command to move the 100% full volumes to the next stage. But move drm can be used only on copy pool volumes, and at the moment these 100% full tapes are in the primary pool.

Today a total of 10 tapes in the primary pool are 100% full in the library. How do I push them to the next stage?
You implied these tapes were in your copy storage pool. It is OK if some of the tapes in the primary storage pool are 100% full. I have 125 volumes in my primary pool; 21 are 100% full and another 33 are more than 80% full. What you need to worry about is having enough scratch tapes to perform backups and reclamations. You may be in a position where you have to check out some of your volumes in order to check in scratch tapes so that backups and reclamations can run.


How many scratch tapes do you have in the library? (select count(*) as Scratch from libvolumes where status = 'Scratch').

What is the status of your drm volumes? (q drm)

Also, do a q system. Output this to a text file and post it.
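
The checks above can be run together. This is a sketch using the copy pool name from the failed backup; the /* */ comments are macro-file syntax and can be dropped when typing the commands interactively.

```
/* scratch volumes currently checked in to the library */
select count(*) as scratch from libvolumes where status='Scratch'

/* detailed pool view, including the maximum scratch volumes allowed */
query stgpool cern_tape01_csp format=detailed

/* DRM state of the copy pool and database backup volumes */
query drmedia
```

A scratch count of 0 together with a copy pool whose existing volumes are all off-site would be consistent with the "insufficient space in target storage pool" message, since the backup has nowhere onsite to write.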
 
Hi onealsteel,

Your input helped me. Thanks a lot.

I did not have any scratch tapes, so this morning I checked out a few of the 100% full tapes from the primary pool and inserted scratch tapes. TSM picked up the scratch tapes for the copy pool and the backup is going fine.

Now I can see that the new scratch tapes in the copy pool have filled to 100%. How do I send those offsite so that I can insert more scratch tapes? At present all library slots are full. If I can send the filled tapes offsite, then I can insert new scratch tapes. Please assist further.

I will run the reclaim on the primary pool once the backup process is over.

I am also sending the system.out file.

The status of the filled tape is below:

VOLUME_NAME  STGPOOL_NAME     CAP GB              ACCESS     STATUS  PCT_UTILIZED
-----------  ---------------  ------------------  ---------  ------  ------------
A00123       CERN_TAPE01_CSP  531.51933593750000  READWRITE  FULL    100.0
 

Attachment: system.out.txt (109.9 KB)
Once the backup of your primary storage pool is complete, you need to perform a TSM database backup (backup db dev=CERN_TAPE01_DC type=full) because your database has changed significantly since the last backup. One of your administrative schedules may be set up to perform this task. After the db backup is complete you can remove the copy media.

Perform a "q drm wherestate=mountable" to see which volumes need to come out.

Perform a “move drm * wherestate=mountable tostate=vault remove=bulk wait=yes”. This command will eject the tapes and set the status. Depending on the number of tapes and the speed of your library, this could take a couple of minutes.

Once all the tapes have been removed to be sent offsite you need to run the command “prepare”.
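
Put together, the off-site cycle described above looks roughly like this. It is a sketch, not the contents of the existing scheduled scripts: the pool and device class names are taken from earlier in this thread, and any other primary pools that feed the copy pool would need their own backup stgpool line. The /* */ comments are macro-file syntax.

```
/* 1. copy new data from the primary pool to the copy pool */
backup stgpool cern_tape01_sp cern_tape01_csp wait=yes

/* 2. back up the TSM database once the copy pool is current */
backup db devclass=cern_tape01_dc type=full wait=yes

/* 3. list, then eject, everything that should go off-site */
query drmedia * wherestate=mountable
move drmedia * wherestate=mountable tostate=vault remove=bulk wait=yes

/* 4. write a fresh recovery plan file */
prepare
```

Running the steps in this order means the database backup and the recovery plan both describe the copy pool volumes that actually leave the building.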

After you have everything under control, I suggest installing the TSM Management Console and setting up some reports that will help you manage the system.
 
Hi onealsteel,

Thanks a lot. Everything went fine as per your instructions.

Only the backup command asked for a few tapes along the way that were unavailable.
This library has quite a few issues, which I am listing here. Let me know if you can help me understand them.

1. Tapes with “UNAVAILABLE” status in the primary pool

I suspect this is happening because of the manual check-outs done to free up slots for new scratch tapes. There are 99 tapes like this in the primary pool. The solution I thought of is to check these tapes back into the library as private; I did check in 9 tapes this way and it worked fine, but it is really not possible to do this for 99 tapes. Let me know if you have any ideas.

2. Tapes with errors
There are two tapes in the primary pool with the error flag set to “YES”, so nothing can be read from or written to them. We are losing 1 TB of space. I tried to reset the error flag and audited the tapes, but it didn't help. Let me know if there is a better way to fix this.

3. Tapes in the vault location that are more than 3 months old
There are a few tapes in the vault location whose last read/write date is more than 3 months old (some even older). My question is why the data on these tapes is not expiring when our retention is 30 days. I can see that they hold Windows server files. Any clue?

4. Tapes in the library with a very old last read/write date
Why is TSM not writing to these tapes?


Thanks
Prashant
 
You are correct about the volumes that are unavailable.

You need to perform a reclamation on the primary storage pool and copy storage pool.

Here is a query that lists the full volumes with more than 90% reclaimable space.

select volume_name,stgpool_name,pct_reclaim from volumes where pct_reclaim>90 and status='FULL' order by stgpool_name

I would start with 90% to see what you get (recl stg storage_pool_name th=90). If you want to be aggressive you could set it down to 50. The system may ask for tapes that are not in the library, so you need to watch the activity log or start another session in console mode (dsmadmc -id=admin -pass=password -console).

2.
The tapes with errors could be a couple of things: the volume may have bad blocks, or your tape drive may need to be cleaned. I would move the data off those volumes and take the tapes out of the library (move data volume_name stg=storage_pool_name).

3.
Performing a reclamation on the offsite storage pool should cause those tapes to be recalled.

4.
TSM will write to a tape until it is considered full. It will not write to that tape again until the tape returns to scratch, which happens when its data expires or is moved off (move data or reclamation).
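
The points above as commands, as a sketch: A00999 is a placeholder volume name, the threshold is the starting value suggested above, and the pool names are taken from earlier in the thread. The /* */ comments are macro-file syntax and can be dropped when typing the commands interactively.

```
/* 1. reclaim mostly-empty volumes: primary pool first, then the off-site copy pool */
reclaim stgpool cern_tape01_sp threshold=90 wait=yes
reclaim stgpool cern_tape01_csp threshold=90 wait=yes

/* 2. empty a volume flagged with errors (A00999 is a placeholder) */
move data A00999 stgpool=cern_tape01_sp

/* 3. off-site volumes that are now empty and can be brought back */
query drmedia * wherestate=vaultretrieve
move drmedia * wherestate=vaultretrieve tostate=onsiteretrieve
```

With only a few drives it is safer to let one reclamation finish before starting the next, which is why wait=yes is used here. Volumes moved to the onsite-retrieve state still have to be physically returned and checked back in as scratch before TSM can reuse them.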

I hope this helps.
 
Hi onealsteel,

You are a great man! It seems that my tape rotation issue is resolved, and I hope I understand tape rotation now. Thanks a lot.

One more query. There are certain jobs that run every now and then, but I am unable to find where they are scheduled. For example, see the activity log below. I cannot figure out where this command is being executed from. There is no cron job on the system starting at 8.

2009-07-08 08:00:19.000000 ANR2750I Starting scheduled command CERN_DB_MAINT_SC ( run cern_db_maint_cmd ). (SESSION: 254868)
2009-07-08 ANR2017I Administrator ADMIN issued command: RUN cern_db_maint_cmd


I suspect it is running from the TSM scheduler, but I am unable to find the exact command. Is there any way I can find out?

Thanks
Prashant
 
These are admin schedules rather than client schedules. To see them you need to specify an extra flag that normal client schedule queries do not need.

See: q sched t=a (add f=d for details).
 
Great! Thanks BBB.
I found the schedule, and it has the command "run cern_db_maint_cmd", which must be some script/function/procedure containing TSM commands. It would be great if you could help me drill down further: how do I find the details of such a script/function/procedure?

Schedule Name: CERN_DB_MAINT_SC
Command: run cern_db_maint_cmd
Priority: 2
Start Date/Time: 11/14/05 08:00:00
Duration: 20 Minute(s)
Schedule Style: Classic
Period: 1 Day(s)
Day of Week: Any
Month:
Day of Month:
Week of Month:
Expiration:
Active?: Yes
Last Update by (administrator): ADMIN
Last Update Date/Time: 11/14/05 16:00:21
Managing profile:

Thanks
Prashant
 
The TSM documentation is pretty good at helping you with this stuff. Or use the online help: log into the administrative command line and run "help query". This will show you all the query commands you can run, and you can usually find what you are looking for in there. You can then run "help query script", which will give you help on querying scripts. This is the command you want.
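
For the schedule in question, the lookup would go roughly like this. The schedule and script names are the ones shown in the schedule output above; the /* */ comments are macro-file syntax and can be dropped when typing the commands interactively.

```
/* the administrative schedule and the command it runs */
query schedule cern_db_maint_sc type=administrative format=detailed

/* the server script behind "run cern_db_maint_cmd", one command per line */
query script cern_db_maint_cmd format=lines
```

If the script itself contains further run commands, repeat query script for each script name it calls.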
 