Tape compression rate

vegnar

We are planning to change our primary storage pool from LTO3 tape to disk (SATA SAN). Our primary storage pool currently has a size of 18 TB. My problem is finding the uncompressed size of the data in the primary storage pool, because that is the space we will need on disk. What is the effective compression rate on an LTO3 drive when the backup client already compresses during backup? Are there any scripts that can tell me the real size of my primary storage pool?
 
I don't believe there is a direct way of getting the uncompressed size. LTO tape drives from different manufacturers achieve different compression ratios, and the ratio is not a constant. Much of it depends on the data being backed up. You will see compression figures like 2.6 to 1, 3.0 to 1 and so on.

If I may ask, why move to a disk pool for your storage? It is a more expensive option and, mind you, your retention options (versions of a file, how long to keep the inactive files, etc.) are very limited and can get very, very expensive as you try to keep data longer or keep more inactive versions.

And by the way, what about disaster recovery? Are you still keeping tapes? I hope so.
 
You can easily check the uncompressed amount of data either crudely, by looking at the capacity actually reached on each volume (q vol, and look for "full" tapes with no reclaimable data), or by issuing something like "select sum(physical_mb) from occupancy where stgpool_name in ('my','tape','pools')".
You might just as well replace physical_mb with logical_mb. There shouldn't be much difference between the two, and if there is, you can assume your real capacity requirement is closer to logical, since you'll reclaim (reconstruct transaction groups) your file-based pools more often than tapes.
You may also consider compression at either the filesystem level or the client level when using SATA. I used to condemn client compression as a thing from the devil because it nearly killed us years ago. But with our current client hardware, the performance impact of compression is barely noticeable.
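
For anyone who wants to script this, here is a minimal Python sketch that builds the select above for a list of pools and converts the resulting MB figure to TB. The function names are made up for illustration. It assumes that pool names are stored in upper case on the server (worth checking with q stgpool) and that the reported MB are binary megabytes. It only builds the statement and does not run it; you would paste the output into an administrative session yourself.

```python
def build_occupancy_query(pools, column="physical_mb"):
    """Build the occupancy select for the given storage pool names.

    Hypothetical helper: it only formats the statement quoted in the
    post above, it does not talk to the server.
    """
    if column not in ("physical_mb", "logical_mb"):
        raise ValueError("column must be physical_mb or logical_mb")
    if not pools:
        raise ValueError("at least one pool name is required")
    for name in pools:
        if "'" in name:
            raise ValueError("pool names must not contain quotes")
    # Assumption: the server stores pool names in upper case.
    quoted = ",".join("'" + name.upper() + "'" for name in pools)
    return ("select sum(" + column + ") from occupancy "
            "where stgpool_name in (" + quoted + ")")


def mb_to_tb(megabytes):
    """Convert MB to TB, assuming binary units (1 TB = 1024 * 1024 MB)."""
    return megabytes / (1024 * 1024)


if __name__ == "__main__":
    print(build_occupancy_query(["my", "tape", "pools"]))
    print(build_occupancy_query(["my", "tape", "pools"], "logical_mb"))
```

Running both variants and comparing the two sums gives a quick feel for how far physical and logical occupancy are apart.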

PJ
 
Sorry, moonbuddy. I didn't see your reply so please don't take offense with my wording. Truth is, q vol will reveal the amount of uncompressed data as seen by the server - same with occupancy.

PJ
 
We are going to use the library as copy storagepool for disaster recovery.
Full tapes that are 100% utilized have an estimated capacity of about 379-490 GB. Does this mean that the compression rate is about 2:1?
 
If they are LTO3 tapes (400 GB uncompressed), you don't appear to have any drive compression at all, or you already compress the data on the client. Run the select I posted earlier. It will directly give you the minimum capacity required for your SATA file pool. Include all the tape pools you want to replace in the ('pool','other_pool') list.
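
The arithmetic behind that, as a small Python sketch: divide the estimated capacity of a full tape by the native LTO3 capacity of 400 GB. The function name is made up for illustration, and the result is only the ratio as it appears from the server side, not a measured drive figure.

```python
LTO3_NATIVE_GB = 400.0  # native (uncompressed) capacity of an LTO3 cartridge


def apparent_compression_ratio(est_capacity_gb, native_gb=LTO3_NATIVE_GB):
    """Ratio of data the server wrote to a full tape vs. native capacity.

    A value near 1.0 means the drive added little or no compression.
    """
    if native_gb <= 0:
        raise ValueError("native capacity must be positive")
    return est_capacity_gb / native_gb


if __name__ == "__main__":
    # The range of estimated capacities reported for full tapes above.
    for capacity in (379, 490):
        ratio = apparent_compression_ratio(capacity)
        print(f"{capacity} GB -> {ratio:.2f}:1")
```

For the reported range this works out to roughly 0.95:1 up to about 1.2:1, which is nowhere near 2:1.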

PJ
 
PJ, Vegnar,

I would be cautious about using figures like 2:1 to calculate disk-to-tape ratios. Disk stores data very differently from tape, so 100 gigabytes of data on tape may not translate to exactly 100 gigabytes on disk.
 
If you are running client compression, you won't get much, if any, compression on the tape drive. I think your first post said you use client compression.
 
Thanks for the quick response.
When I run the select sum(physical_mb)... I get 17.2 TB as the result. If I run select sum(logical_mb)... the result is 17.1 TB. I guess this means that there is almost no compression, most likely because of compression on the client.
I would have expected the physical value to be smaller than the logical one...
 
The difference between logical and physical occupancy comes down to the way TSM has broken up transaction groups and has nothing to do with compression. 17 TB in the occupancy means you will need at least that amount of space in a sequential file pool to store your data, and that only holds if all volumes are full and you keep ridiculously low reclaim thresholds. I'd say you need at least 20 TB to replace your current tape. Think about a combined strategy, where you keep only certain nodes on disk or put a size limit on the pools you keep there. (Or just add more disk.)
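
A rough way to put numbers on that, as a Python sketch. The model is an assumption for illustration, not an official sizing formula: it takes the worst case in which every file volume sits just below the reclaim threshold, so that fraction of each volume is dead space waiting to be reclaimed. The function name and the 15% threshold are made up for the example.

```python
def required_pool_tb(occupancy_tb, reclaim_threshold_pct):
    """Worst-case pool size if every volume is just below the threshold.

    Assumed model: up to reclaim_threshold_pct percent of each volume
    can be reclaimable (dead) space before reclamation kicks in, so
    only the remainder holds live data.
    """
    if not 0 <= reclaim_threshold_pct < 100:
        raise ValueError("threshold must be in the range [0, 100)")
    live_fraction = 1.0 - reclaim_threshold_pct / 100.0
    return occupancy_tb / live_fraction


if __name__ == "__main__":
    # 17 TB of occupancy with a hypothetical 15% reclaim threshold.
    print(f"{required_pool_tb(17.0, 15):.1f} TB")
```

With 17 TB of occupancy, a 15% threshold under this model lands at about 20 TB, and a higher threshold pushes the requirement up quickly.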

PJ
 