scratch tape problem

DaveB

Hi,

I have a problem with my tape pool. We are running TSM Server 5.5 on Linux, and I often get the message "not enough scratch tapes". We use a Christie T48 library with 21 tapes inside; the other 33 tapes are stored offsite in a safe. Every week we run a script to check tapes out of and into the tape pool, but TSM only reclaims two tapes from the safe. These two tapes are checked in as scratch, but two tapes are not enough: after one or two days I get the "not enough scratch tapes" message again. A year ago TSM reclaimed four or more tapes each time; now it is only two.
It seems that TSM is not using all the tapes in the safe, because some of them were last used months ago, even though expiration is set to 14 days. The maximum number of scratch volumes allowed is set to 15, and 10 are in use.

Does anyone have an idea that could help me?
Thanks in advance, and sorry for my bad English.

tsm: MOND>q stg
Storage      Device      Estimated   Pct    Pct    High  Low  Next
Pool Name    Class Name  Capacity    Util   Migr   Mig   Mig  Storage
                                                   Pct   Pct  Pool
-----------  ----------  ----------  -----  -----  ----  ---  --------
ARCHIVETAPE  LIB01-LTO        0.0 M    0.0    0.0    90   70
COPYNDMP     LIB01-LTO     20,000 G    4.8
COPYPOOL     LIB01-LTO     16,089 G   22.5
DISKPOOL     DISK             600 G   99.7   84.5    60   30  TAPEPOOL
NAS_DAY      LIB01-LTO      5,972 G   12.5   25.0    90   70
NAS_DISK     DISK             500 G   65.0   40.8    60   20  NAS_DAY
NAS_TOC      DISK              30 G   14.5   14.5    90   70
TAPEPOOL     LIB01-LTO      9,059 G   34.4   66.7    90   70
TOC_DISK     DISK              10 G   19.6   19.6    60   20

tsm: MOND>q stg tapepool f=d
Storage Pool Name: TAPEPOOL
Storage Pool Type: Primary
Device Class Name: LIB01-LTO
Estimated Capacity: 9,059 G
Space Trigger Util:
Pct Util: 34.4
Pct Migr: 66.7
Pct Logical: 99.2
High Mig Pct: 90
Low Mig Pct: 70
Migration Delay: 0
Migration Continue: Yes
Migration Processes: 1
Reclamation Processes: 1
Next Storage Pool:
Reclaim Storage Pool:
Maximum Size Threshold: No Limit
Access: Read/Write
Description: Tapepool
Overflow Location:
Cache Migrated Files?:
Collocate?: No
Reclamation Threshold: 80
Offsite Reclamation Limit:
Maximum Scratch Volumes Allowed: 15
Number of Scratch Volumes Used: 10
Delay Period for Volume Reuse: 1 Day(s)
Migration in Progress?: No
Amount Migrated (MB): 0.00
Elapsed Migration Time (seconds): 0
Reclamation in Progress?: No
Last Update by (administrator): ADMIN
Last Update Date/Time: 20.05.2010 08:25:15
Storage Pool Data Format: Native
Copy Storage Pool(s):
Active Data Pool(s):
Continue Copy on Error?: Yes
CRC Data: No
Reclamation Type: Threshold
Overwrite Data when Deleted:
 
One cause could be the high reclamation threshold on the storage pool "tapepool". It is currently set to 80%, which means the TSM server will only attempt to reclaim tape volumes that have status=full and whose (logically) used space is 20% or lower.

If you change this reclaim setting to 50%, TSM will reclaim more tapes.

Do you have an admin schedule/script that runs reclaim periodically (for example, one process per day), and what threshold does it use?

...

Another likely cause is that a lot of old tapes are not available in the library, so they can't be reclaimed until they are physically available.

... ref "Every week we run a script to checkout and checkin ..." Question: do you remove all the new tapes and check in the 21 (of the 33?) old tapes?


Regards,
Nicke
 
Hi Nicke,

Thanks for your answer!

I changed the Reclamation Threshold from 90 to 80 two months ago, but nothing has changed. Do you think I should try less than 80?
Yes, we run an admin schedule/script. Does it have a separate value for the reclamation threshold? If so, how do I change it? Sorry, I'm new to TSM because our TSM admin has left the company.
Every week we check two tapes out of and into the library with the commands "run checkin" and "run checkout". Until about 6 months ago there were more than 4 tapes each time, which was always enough for a week.

br
Dave
 
You can lower the tapepool reclaim threshold to 60:

update stgp tapepool reclaim=60


Then it's good to run a reclaim administrative schedule/script often. Also keep in mind that reclamation depends on expiration: space only becomes reclaimable after expired data has been removed. It's normally good to run "expire inventory" every day (or every weekday).

In TSM 5.5 and later you can run reclaim directly (without changing the threshold on the storage pool).

Ex: reclaim stgp tapepool th=50 duration=240


... While the reclaim process is running, check the actlog to see which volumes TSM wants to reclaim. If a required tape isn't in the library, you have to check it in (or change the tape volume's access setting before you check out the tapes).



//Nicke
 
I think Nicke is on target. I have my reclamation threshold set to 100 so that I can control when reclamation occurs. I have an admin schedule that kicks off reclamation for the tapepool with a threshold of 60 and a duration of 4 hours, and one for the copypool with the same parameters. You may have to adjust these depending on your situation. You may also want to review the help text: type "h reclaim stg" at an admin command line.
Commands:
reclaim stg tapepool thresh=60 dur=240
reclaim stg copypool thresh=60 dur=240
Good luck
 
You can also run the following select statements to see the utilization of your tapes.
select volume_name, pct_reclaim, pct_utilized, status, access from volumes where stgpool_name = 'TAPEPOOL' order by pct_reclaim desc, pct_utilized

select volume_name, pct_reclaim, pct_utilized, status, access from volumes where stgpool_name = 'COPYPOOL' order by pct_reclaim desc, pct_utilized
 
I set the reclaim threshold to 60 and will see what happens. We run the reclaim process every morning.

@fuzzballer
Thanks for your reply.
I ran the select statement and got the following result:

VOLUME_NAME: 000053L3
PCT_RECLAIM: 50.4
PCT_UTILIZED: 50.0
STATUS: FULL
ACCESS: READWRITE

VOLUME_NAME: 000043L3
PCT_RECLAIM: 47.2
PCT_UTILIZED: 53.4
STATUS: FULL
ACCESS: READWRITE

VOLUME_NAME: 000041L3
PCT_RECLAIM: 43.5
PCT_UTILIZED: 57.0
STATUS: FULL
ACCESS: READWRITE

VOLUME_NAME: 000011L3
PCT_RECLAIM: 39.1
PCT_UTILIZED: 61.6
STATUS: FULL
ACCESS: READWRITE

VOLUME_NAME: 000023L3
PCT_RECLAIM: 33.0
PCT_UTILIZED: 19.0
STATUS: FILLING
ACCESS: READWRITE

VOLUME_NAME: 000029L3
PCT_RECLAIM: 28.4
PCT_UTILIZED: 72.1
STATUS: FULL
ACCESS: READWRITE

VOLUME_NAME: 000036L3
PCT_RECLAIM: 24.7
PCT_UTILIZED: 75.5
STATUS: FULL
ACCESS: READWRITE

VOLUME_NAME: 000000L3
PCT_RECLAIM: 21.4
PCT_UTILIZED: 78.9
STATUS: FULL
ACCESS: READWRITE

VOLUME_NAME: 000046L3
PCT_RECLAIM: 20.7
PCT_UTILIZED: 79.6
STATUS: FULL
ACCESS: READWRITE

VOLUME_NAME: 000040L3
PCT_RECLAIM: 13.8
PCT_UTILIZED: 1.2
STATUS: FILLING
ACCESS: READONLY

It's very strange, because q libv returns the following:

tsm: MOND>q libv
Library Name  Volume Name  Status   Owner  Last Use   Home     Device
                                                      Element  Type
------------  -----------  -------  -----  ---------  -------  ------
LIB01         000000L3     Private         Data       1,009
LIB01         000002L3     Private         Data       1,013
LIB01         000006L3     Private         Data       1,001
LIB01         000009L3     Private         BackupSet  1,004
LIB01         000011L3     Private         Data       1,011
LIB01         000023L3     Private         Data       1,008
LIB01         000025L3     Private         DbBackup   1,018
LIB01         000029L3     Private         Data       1,005
LIB01         000034L3     Private         Data       1,010
LIB01         000036L3     Private         Data       1,007
LIB01         000038L3     Private         BackupSet  1,002
LIB01         000040L3     Private         Data       1,017
LIB01         000041L3     Private         Data       1,019
LIB01         000043L3     Private         Data       1,003
LIB01         000046L3     Private         Data       1,006
LIB01         000053L3     Private         Data       1,012
LIB01         000054L3     Private         Data       1,014

I hope I can get TSM running smoothly :)
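
As a rough cross-check of the TAPEPOOL numbers above, here is a minimal Python sketch that applies a reclamation threshold to the posted PCT_RECLAIM values. The eligibility rule (a FULL volume qualifies once PCT_RECLAIM reaches the threshold) is an assumption based on the explanation earlier in this thread, and the function name is made up for illustration; it is not TSM's actual selection logic.

```python
# (volume, pct_reclaim, status) copied from the TAPEPOOL select output above.
TAPEPOOL = [
    ("000053L3", 50.4, "FULL"),
    ("000043L3", 47.2, "FULL"),
    ("000041L3", 43.5, "FULL"),
    ("000011L3", 39.1, "FULL"),
    ("000023L3", 33.0, "FILLING"),
    ("000029L3", 28.4, "FULL"),
    ("000036L3", 24.7, "FULL"),
    ("000000L3", 21.4, "FULL"),
    ("000046L3", 20.7, "FULL"),
    ("000040L3", 13.8, "FILLING"),
]


def reclaim_candidates(volumes, threshold):
    """Assumed rule: a FULL volume qualifies once PCT_RECLAIM >= threshold."""
    return [name for name, pct_reclaim, status in volumes
            if status == "FULL" and pct_reclaim >= threshold]


# Compare the thresholds discussed in the thread.
for threshold in (80, 60, 50):
    print(threshold, reclaim_candidates(TAPEPOOL, threshold))
```

Under that assumption, no TAPEPOOL volume qualifies at 80 or 60, and only 000053L3 qualifies at 50, so lowering the threshold on this pool alone would free very few tapes.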
 
Sorry, I forgot:

tsm: MOND>select volume_name, pct_reclaim, pct_utilized, status, access from volumes where stgpool_name = 'COPYPOOL' order by pct_reclaim desc, pct_utilized

VOLUME_NAME: 000033L3
PCT_RECLAIM: 94.5
PCT_UTILIZED: 6.0
STATUS: FULL
ACCESS: OFFSITE

VOLUME_NAME: 000020L3
PCT_RECLAIM: 91.1
PCT_UTILIZED: 8.9
STATUS: FILLING
ACCESS: OFFSITE

VOLUME_NAME: 000007L3
PCT_RECLAIM: 88.0
PCT_UTILIZED: 11.9
STATUS: FILLING
ACCESS: OFFSITE

VOLUME_NAME: 000005L3
PCT_RECLAIM: 86.7
PCT_UTILIZED: 13.2
STATUS: FILLING
ACCESS: OFFSITE

VOLUME_NAME: 000048L3
PCT_RECLAIM: 81.1
PCT_UTILIZED: 19.3
STATUS: FULL
ACCESS: OFFSITE

VOLUME_NAME: 000047L3
PCT_RECLAIM: 77.8
PCT_UTILIZED: 22.1
STATUS: FILLING
ACCESS: OFFSITE

VOLUME_NAME: 000031L3
PCT_RECLAIM: 76.6
PCT_UTILIZED: 23.6
STATUS: FILLING
ACCESS: OFFSITE

VOLUME_NAME: 000035L3
PCT_RECLAIM: 48.0
PCT_UTILIZED: 52.6
STATUS: FULL
ACCESS: OFFSITE

VOLUME_NAME: 000052L3
PCT_RECLAIM: 47.3
PCT_UTILIZED: 53.3
STATUS: FULL
ACCESS: OFFSITE

VOLUME_NAME: 000051L3
PCT_RECLAIM: 41.7
PCT_UTILIZED: 58.4
STATUS: FILLING
ACCESS: OFFSITE

VOLUME_NAME: 000054L3
PCT_RECLAIM: 35.7
PCT_UTILIZED: 64.2
STATUS: FILLING
ACCESS: OFFSITE

VOLUME_NAME: 000015L3
PCT_RECLAIM: 35.6
PCT_UTILIZED: 64.8
STATUS: FULL
ACCESS: OFFSITE

VOLUME_NAME: 000010L3
PCT_RECLAIM: 24.7
PCT_UTILIZED: 75.7
STATUS: FULL
ACCESS: OFFSITE

VOLUME_NAME: 000002L3
PCT_RECLAIM: 0.6
PCT_UTILIZED: 99.3
STATUS: FULL
ACCESS: OFFSITE
 
If I read your q libv correctly, you only have 4 slots available for scratch tapes: 17 private tapes and 4 scratch. I would guess that you use 1 data tape and 1 DBBackup tape per day. So in 2 days you would run out of scratch tapes and hence receive the warning message about not enough scratch. It appears to me that you have to move tapes offsite and add scratch tapes daily in order to stay ahead. Is this correct, or did I miss something?
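
The arithmetic behind that estimate can be sketched in a few lines of Python. The slot count of 21 is taken from the first post (21 tapes in the library), the 17 private volumes from the q libv listing, and the 2 tapes per day is only the guess made above; the function name is hypothetical.

```python
def days_of_scratch(total_slots, private_tapes, tapes_per_day):
    """Return (free slots, whole days until the scratch tapes are used up)."""
    free_slots = total_slots - private_tapes
    return free_slots, free_slots // tapes_per_day


# 21 tapes fit in the library, 17 are private, and roughly 2 tapes
# (1 data + 1 DB backup) are consumed per day, which is an assumption.
free, days = days_of_scratch(total_slots=21, private_tapes=17, tapes_per_day=2)
print(free, days)
```

With these inputs the library has room for 4 scratch tapes, which last about 2 days.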
 
You also have tape 000040L3 set to read-only. You might want to check why, as this tape is only 1.2% utilised and can hold a lot more data. Alternatively, get the data moved off it and take the tape out of the library so that another scratch tape can be used.

As fuzzballer suggests, you only have enough free slots for 4 scratch tapes, which will only keep you going for a couple of days.
 
Good Morning,

@fuzzballer
Yes, that's correct. But the tapes with the status "offsite" are all in the safe. Should I check in some tapes from the safe with "checkin libvolume" and scratch status? I don't know on which tapes the data has expired.

@cheeky
How can I change the volume status from readonly to readwrite?

Is it sufficient to change the "Reclamation Threshold" with the update stgp command, or do I have to change it in the script too? If so, how can I find/edit the script?

Many thanks for your help!
 
Did you run the reclaim stg command from above on your copypool? reclaim stg copypool thresh=60 dur=240
What were the results? You should have had at least 7 tapes in Pending or Vaultretrieve status. Check the act log after it runs and see what it says about the reclamation process.
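
The figure of 7 tapes can be reproduced from the COPYPOOL select output posted earlier. This is a minimal Python sketch that assumes every listed volume, FULL or FILLING, counts as a candidate once PCT_RECLAIM reaches the threshold; that assumption is what matches the count given here, and the function name is made up for illustration.

```python
# (volume, pct_reclaim) copied from the COPYPOOL select output in this thread.
COPYPOOL = [
    ("000033L3", 94.5), ("000020L3", 91.1), ("000007L3", 88.0),
    ("000005L3", 86.7), ("000048L3", 81.1), ("000047L3", 77.8),
    ("000031L3", 76.6), ("000035L3", 48.0), ("000052L3", 47.3),
    ("000051L3", 41.7), ("000054L3", 35.7), ("000015L3", 35.6),
    ("000010L3", 24.7), ("000002L3", 0.6),
]


def volumes_at_or_above(volumes, threshold):
    """Assumed rule: a volume is a candidate once PCT_RECLAIM >= threshold."""
    return [name for name, pct_reclaim in volumes if pct_reclaim >= threshold]


candidates = volumes_at_or_above(COPYPOOL, 60)
print(len(candidates), candidates)
```

At a threshold of 60 this selects the seven volumes from 000033L3 down to 000031L3.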
 
Hi,

I checked in 5 scratch tapes yesterday. Today I got the message again: "there are not enough scratch tapes ... 1<5)". The reclaim process was started at 6:15 am by the script and is still running. I will check again when the process has finished.
 
@cheeky
How can i change the vol status from readonly to readwrite?

First you need to check why it has gone read-only. If the activity log doesn't go back far enough, check the number of read/write errors recorded against the volume: issue q vol 000040L3 f=d

If the read/write error count is only one or two, then issue update vol 000040L3 access=readw
 