TSM marks tapes full at about 75% of real capacity

RobinR


I have a TSM 6.4 installation backing up to a ProtecTIER VTL containing 3500+ cartridges of 50 GB each (the capacity the VTL reports per cartridge is 51200 MB).
But TSM marks tapes as 100% full at occupancy levels varying between 30 and 50 GB. The estimated tape size in TSM is also set to 51200 MB, and so are the estimated sizes of scratch tapes in TSM.

I don't know if it is actually related, but we came from a configuration where the tape size was set to 300 GB. Because reclamation took so long, we decided to replace all cartridges with 50 GB ones, as recommended by IBM (they recommend 50 GB to 100 GB cartridges in a VTL), removing the 300 GB cartridges as they became free. To automate this easily, the VTL was temporarily overcommitted during the migration, and the VTL temporarily reduced the actual cartridge size to 30 GB per cartridge. Now that all the old 300 GB tapes have been removed, the capacity is correctly back at 51200 MB per cartridge according to the VTL.

I have been moving the data on all tapes marked as full with an estimated capacity below 40 GB to new scratch tapes within the library, hoping these would then fill up to 50 GB. By the end of the day, the number of tapes marked as full with an estimated capacity below 40 GB is almost zero, and I get a lot of extra scratch tapes. However, the next day (after the nightly backups have run) there are always 300+ cartridges marked as full again with an estimated capacity of only 30 to 40 GB. That leaves me with a lot of 'wasted' space that is unavailable to TSM until the data on these tapes is again moved to scratch tapes within the library.

What makes TSM mark those tapes as full before they reach an occupancy of 50 GB, the actual capacity of the cartridges? Is the VTL incorrectly reporting the tapes as full to TSM? Or does TSM 'remember' that it could only place about 30 to 40 GB on those tapes the previous time, and so marks them full again, despite reporting a 50 GB estimated capacity while they are in scratch state?
I currently have no idea how to find answers to these questions, or how to make TSM fill all tapes up to their full capacity.

We are not using collocation, TSM dedup, or compression. Dedup is handled by the VTL, and the library type for this VTL in TSM is set to VTL.

Does anybody know what is happening here?
 

q devclass shows me nothing about compression:
Code:
             Device Class Name: TS7650_LTO
        Device Access Strategy: Sequential
            Storage Pool Count: 1
                   Device Type: LTO
                        Format: ULTRIUM3
         Est/Max Capacity (MB): 51.200,0
                   Mount Limit: DRIVES
              Mount Wait (min): 10
         Mount Retention (min): 10
                  Label Prefix: ADSM
                       Library: TS7650
                     Directory:
                   Server Name:
                  Retry Period:
                Retry Interval:
                        Shared:
            High-level Address:
              Minimum Capacity:
                          WORM: No
              Drive Encryption: Allow
               Scaled Capacity:
       Primary Allocation (MB):
     Secondary Allocation (MB):
                   Compression:
                     Retention:
                    Protection:
               Expiration Date:
                          Unit:
      Logical Block Protection: No
Last Update by (administrator): ADMIN
         Last Update Date/Time: 08.01.2015 12:05:13

q vol of one of the 'full' cartridges:
Code:
  Volume Name: 000953L3
  Storage Pool Name: PRIMVTLPOOL
  Device Class Name: TS7650_LTO
  Estimated Capacity: 26,3 G
  Scaled Capacity Applied:
  Pct Util: 100,0
  Volume Status: Full
  Access: Read/Write
  Pct. Reclaimable Space: 0,0
  Scratch Volume?: Yes
  In Error State?: No
  Number of Writable Sides: 1
  Number of Times Mounted: 1
  Write Pass Number: 1
  Approx. Date Last Written: 29.01.2015 14:45:54
  Approx. Date Last Read: 29.01.2015 14:28:38
  Date Became Pending:
  Number of Write Errors: 0
  Number of Read Errors: 0
  Volume Location:
Volume is MVS Lanfree Capable : No
Last Update by (administrator):
  Last Update Date/Time: 29.01.2015 14:28:38
  Begin Reclaim Period:
  End Reclaim Period:
  Drive Encryption Key Manager: None
  Logical Block Protected: No
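
To put a number on the gap between the device class capacity and the 'full' volume above, here is a minimal Python sketch. It assumes the server prints numbers in a European locale (dot as thousands separator, comma as decimal mark) and that the "G" in the q vol output means 1024 MB; the helper names are hypothetical and not part of TSM.

```python
def parse_eu_number(text: str) -> float:
    """Parse a number formatted like '51.200,0' (dot = thousands, comma = decimal)."""
    return float(text.replace(".", "").replace(",", "."))


def fill_ratio(est_capacity_g: str, devclass_capacity_mb: str) -> float:
    """Fraction of the device class capacity the volume held when it was marked full."""
    volume_mb = parse_eu_number(est_capacity_g) * 1024  # assumption: 1 G = 1024 MB
    return volume_mb / parse_eu_number(devclass_capacity_mb)


# Values copied from the q devclass and q vol output above.
ratio = fill_ratio("26,3", "51.200,0")
print(f"Volume 000953L3 was marked full at {ratio:.1%} of the device class capacity")
```

Running the same calculation over every volume in Full status would show how widely the end-of-tape point varies from night to night.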
 

I had the same issue myself and have been looking into it for several weeks. In short: don't be fooled by the cartridge capacity you set in the ProtecTIER software during repository sizing. As stated in the IBM documentation, 'cartridge capacity changes to reflect the system-wide shift in nominal capacity'. That means that if your overall dedupe ratio drops, the capacity of newly allocated cartridges drops too.

A 'cartridge' is a purely virtual object in ProtecTIER terms. The backup application has no idea what the cartridge capacity is until it receives an "End of Tape" early warning (EW) signal. You can find more details under 'Managing capacity fluctuations' in the ProtecTIER Redbook.
 

I know. I saw this behaviour when I overcommitted the VTL to move all data from the 300 GB tapes to 50 GB tapes. But the dedup ratio has now been stable for weeks (2.9 to 3.1), and the cartridges are not only limited to 50 GB: the VTL also reports their capacity as effectively being 50 GB (whereas during the overcommitment it reported much less because there were too many tapes). And I can't believe that the dedup ratio would drop so much overnight that tape capacity is reduced to 20 GB (as it was last night) and then climb back by morning so that the capacity is 50 GB again.
Or does the VTL also lower the cartridge capacity when a lot of data on the VTL is in 'pending' state? When more than 20-30 TB is in pending state, the VTL reports an estimated total space about 8 to 10 TB higher than when almost no data is pending. But in my eyes this should not decrease cartridge capacity (the GUI still reports the cartridge capacity as 50 GB at that point).
 

One question:

Which data source was being written when the tape was marked FULL? Is the data written to this "tape" compressed?

As already mentioned by marclant, write behavior depends solely on the kind of data being written. The end device, in this case the ProtecTIER, 'looks ahead' to determine whether the 'projected capacity', say 50 GB, is enough to hold the data. If the 'tape' does not have the capacity, the device marks it as full and TSM records a metadata marker for it in the TSM database. TSM then picks up another tape.

I have seen this behavior when pumping compressed data to tape, whether physical or virtual.

In my case, I use Data Domain for the VTL portion, and the same behavior exists. Do I get worried? Heck no. This is not something I should worry about as a systems guy. What I worry about is "Is my data being backed up and recoverable?" I leave the mechanics to the hardware.
 

The data sources are plain files backed up with the TSM BA client, without compression, or virtual machines backed up with TSM for VE, also without compression.

And I agree that I should not worry about the mechanics of the hardware. But the question "is my data being backed up" does worry me, as I regularly run out of scratch tapes, causing backups to fail while the VTL still reports more than 10 TB of free usable (nominal) space. And then I see a thousand tapes marked as full with only 20 GB to 40 GB in use.
I then always have to temporarily move a few tapes out of the way to a physical library and start moving the data on those 20 GB tapes to scratch tapes within the library, and I always end up with about 200 to 300 scratch tapes, just by moving data around on the VTL.

The solution would of course be to extend the capacity of the VTL so that I don't have to worry about wasted space anymore, but we currently don't have the resources to buy extra disks and capacity licenses.
And there is 10 TB of free space! I should be able to use it, and the system should not start wasting space all over the place.
All this while the dedup ratio is stable and the number of defined tapes matches the nominal total space.
 

The next question to ask then: Is expiration and reclamation running properly? Is retention too long?

If the system is sized right, then you should not see spikes in tape usage but see a slow increase as you bring in more data.

Granting that dedup is working properly, sizing and capacity problems should not be a "next day critical" issue.
 

Expiration and reclamation are the reason we moved from 300 GB to 50 GB tapes. The reclamation threshold had to be set quite low so that there would be enough free space by the time the next backup started, but the reclamation process also took so long that we regularly didn't have enough free space. Using 50 GB cartridges should optimize the reclamation processes, but we still have to tune the threshold and find the ideal setting.

But before tuning those parameters, I first wanted to have all data on 50 GB tapes. The first hundred tapes were moved correctly to 50 GB tapes, but as this is a lengthy process I automated it with a script. By the next day the script has finished, but I always find a lot of other tapes with sizes of 20 to 40 GB again.
 

I'm having the exact same issue. Even worse, we set the maximum size to 400 GB in ProtecTIER, yet tapes fill up at varying sizes of 10 GB, 60 GB, 70 GB, ... in TSM. This didn't happen until a while after we started using it. So I don't think compressed data causes this; we backed up compressed data back then too, before we had this issue.

I'm thinking it may have something to do with fragmentation and the reclamation process between TSM and ProtecTIER, but then the fragmented data isn't as big as the gap :(

Any more theories or explanations?
 

As LexRogoff said, TSM has no idea what the size of the volume is. This is an issue within ProtecTIER, probably related to configuration or administration. You should really take a look at the "Managing capacity fluctuations" section in the ProtecTIER Redbook, as Lex suggested.
 

Hello guys, I need help with this:
If a tape is 100% full, or less full but has not been used for a long time, can I move the data off it and check it out to free up slot space? Also, when I run q libvol, some tapes are not labeled 'Data' and the rest are in Private status. What does that mean? Thanks, I can't wait to hear from you all. I need scratch slots without issues.
 

"A tape that is 100% full, or less but has not been used for a long time, can I move data from it and check it out to create slot space?"
Maybe, it depends. You will need a scratch tape or enough space on the rest of the FILLING tapes.
"I have some tape when I q libvol it's not labeled data and the rest are in private? What does that mean?"
Tapes that are not used in a storage pool will not show data. These tapes could hold a database backup, database snapshot, export, or backup set. You can use "query volhist" to see what they were last used for.
"I need scratch slot without issues"
Is your reclamation running? What threshold are you using? Have you tried a more aggressive reclamation threshold? When you reach a point like this where you are running out of slots, your options are limited:
- first, try running reclamation more aggressively
- second, review your retention policies and negotiate shorter retention with the data owners
- third, if both of the above fail, ask management for an upgrade to the library

Usually comes down to the 2nd or 3rd option. Either store less data or get more storage.
 

Thanks for your quick responses. Our threshold is 60%. When you say AGGRESSIVE reclamation, do you mean running reclamation again after Tivoli has run its own?
 

No, I mean running with a lower number. At 60, a tape must have 60% or more reclaimable space before it is reclaimed. In most environments, 50% is generally a good number, because it means that two half-full tapes make 1 full tape and 1 scratch tape. But if you are tighter on space, then you need to reclaim volumes with less than 50% reclaimable space.

Since you are at 60%, try running reclamation again at 50%. If that frees up enough tapes for your needs, good; if not, try again at 40%.

At 33%, you could potentially take 3 tapes with 33% reclaimable and end up with 2 full tapes and one scratch.
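
The arithmetic behind these thresholds can be sketched in a few lines of Python. This is only an idealized model: it assumes every tape has the same capacity and that the remaining data packs perfectly onto the output tapes, which real reclamation will not quite achieve. The function name is hypothetical.

```python
def scratch_tapes_gained(pct_reclaim: list[int]) -> int:
    """Net number of tapes freed when all the listed tapes are reclaimed.

    pct_reclaim holds the reclaimable percentage of each input tape.
    """
    live_pct = sum(100 - p for p in pct_reclaim)  # data that must be rewritten
    tapes_needed = -(-live_pct // 100)            # ceiling division
    return len(pct_reclaim) - tapes_needed


print(scratch_tapes_gained([50, 50]))              # two half-reclaimable tapes
print(scratch_tapes_gained([34, 34, 34]))          # three tapes just over one-third reclaimable
print(scratch_tapes_gained([60, 60, 60, 60, 60]))  # five tapes at the 60% threshold
```

Note that in this model three tapes at exactly 33% leave 201% of a tape's worth of data, which does not quite fit on two tapes, so the tapes need to be slightly more than one-third reclaimable for the "3 into 2" case to work out.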
 

Hi Marclant, the reclamation threshold on my tapepool is 60%, as you know from before (q stgpool tapepool f=d), but I was going through the reclamation script and found this:
Name             Line Number  Command                                                     Last Update by (administrator)  Last Update Date/Time
---------------  -----------  ----------------------------------------------------------  ------------------------------  ---------------------
DAILY_EXP_RECLM  Description  Daily Experation and Reclaimation Sirius 03/12/2013         ADMIN                           03/13/2013 13:54:38
DAILY_EXP_RECLM  7            /*Reclamation Script */                                     ADMIN                           06/25/2015 23:21:01
DAILY_EXP_RECLM  12           /*reclamation*/                                             ADMIN                           06/25/2015 23:21:01
DAILY_EXP_RECLM  17           parallel                                                    ADMIN                           06/25/2015 23:21:01
DAILY_EXP_RECLM  22           reclaim stgpool TAPEPOOL threshold=75 wait=yes              ADMIN                           06/25/2015 23:21:01
DAILY_EXP_RECLM  27           reclaim stgpool PTAPEPOOL threshold=80 wait=yes             ADMIN                           06/25/2015 23:21:01
DAILY_EXP_RECLM  32           serial                                                      ADMIN                           06/25/2015 23:21:01
DAILY_EXP_RECLM  37           parallel                                                    ADMIN                           06/25/2015 23:21:01
DAILY_EXP_RECLM  42           reclaim stgpool COPYPOOL threshold=80 offsitereclaiml=10    ADMIN                           06/25/2015 23:21:01
DAILY_EXP_RECLM  47           reclaim stgpool PCOPYPOOL threshold=70 offsitereclaiml=10   ADMIN                           06/25/2015 23:21:01
DAILY_EXP_RECLM  52           reclaim stgpool PACSCOPOOL threshold=70 offsitereclaiml=10  ADMIN                           06/25/2015 23:21:01
DAILY_EXP_RECLM  57           serial                                                      ADMIN                           06/25/2015 23:21:01
Does that mean my threshold is really 75% on the tapepool? If so, I should recommend that we reduce it, right? Or what is the best thing to do after going through the output of Q SCR DAILY_EXP_RECLM?
 

"does that mean my threshold is really 75% on the tapepool?"
Well, it's both 60% and 75%. Reclamation can start in one of two ways:
- automatically, by a thread that wakes up every 60 minutes and checks whether any tapes have a reclaimable percentage higher than the reclaim threshold of the storage pool, in this case 60%
- on a schedule, by a script that uses the threshold given in the script, in this case 75%

Typically, if you use the latter, you set the reclaim threshold of the storage pool to 100% so that reclamation only starts when the admin schedule runs it. That's the preferred method because you can control when reclamation runs. In your case, you'd update the DAILY_EXP_RECLM script so that the reclamation threshold is 50%. 50% is an ideal number, but if you need to free up even more tapes, go lower. I'd try 50 first. It might not be a bad idea to do the same for the other pools too if you are running low on scratch.

You can check how many volumes are available for reclamation using:
Code:
select count(volume_name) from volumes where pct_reclaim>50 and stgpool_name='TAPEPOOL'
You can play with the pct_reclaim to see how it changes.

Or to get a list of the volumes:
Code:
select volume_name,pct_reclaim from volumes where pct_reclaim>50 and stgpool_name='TAPEPOOL'
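
To see how the count changes across several thresholds without rerunning the query each time, the pct_reclaim values can be tallied offline. This is a small Python sketch, assuming the list query was run without the pct_reclaim filter and its output saved as (volume_name, pct_reclaim) pairs; the volume names and percentages below are made up for illustration, and the function name is hypothetical.

```python
def eligible_counts(volumes, thresholds):
    """Map each threshold to the number of volumes with pct_reclaim above it."""
    return {t: sum(1 for _, pct in volumes if pct > t) for t in thresholds}


# Hypothetical rows, not taken from a real server.
sample = [("A00001L3", 72.5), ("A00002L3", 55.0), ("A00003L3", 48.1), ("A00004L3", 35.9)]

for threshold, count in eligible_counts(sample, [60, 50, 40]).items():
    print(f"pct_reclaim > {threshold}: {count} volume(s)")
```

The comparison is strictly greater-than, mirroring the `pct_reclaim>50` condition in the select statements.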
 

I am still worried about the rate at which I'm using LTO tapes:
Storage Pool Name  Device Class Name  Estimated Capacity  Pct Util  Pct Migr  High Mig Pct  Low Mig Pct  Next Storage Pool
-----------------  -----------------  ------------------  --------  --------  ------------  -----------  -----------------
COPYPOOL           LTOECLASS          1,407,419 G         40.0
EMCDBPOOL          DISK               975 G               42.1      42.1      90            70           PTAPEPOOL
EMCGENPOOL         DISK               3,300 G             0.0       0.0       95            70           TAPEPOOL
EMCPACSPOOL        DISK               560 G               0.0       0.0       90            70           PACSTAPOOL
PACSCOPOOL         LTOECLASS          1,377,343 G         1.9
PACSTAPOOL         LTO                115,056 G           22.6      24.0      90            70
PCOPYPOOL          LTOECLASS          1,614,609 G         13.4
PTAPEPOOL          LTO                1,783,449 G         12.2      12.3      90            70
TAPEPOOL           LTO                714,619 G           78.8      90.2      90            70
Of all the storage pools, only TAPEPOOL uses collocation. What will happen if I remove the collocation, since we don't do recoveries? Will that reduce the number of tapes I am presently using?
 