NetApp deduplication

stephrf

Hi, is anyone using NetApp FAS to do deduplication for TSM storage pools?

This is more from a management perspective:

1. Are you using "thin provisioning" to fool both TSM and the OS into thinking they have more space than they do?

2. Or do you allocate 2 TB to a storage pool (e.g. a FILE pool) and then increase MAXSCRATCH to use the freed-up space (see the sketch below)?
Or won't this work?

I guess either way it is a change in how storage pool management is done.
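To make option 2 concrete, here's roughly what I have in mind; just a sketch, and the device class, pool, and path names are made up:

  define devclass FASFILE devtype=file mountlimit=32 maxcapacity=50G directory=/tsm/fas_vol
  define stgpool FASPOOL FASFILE maxscratch=40

That gives the pool about 2 TB (40 scratch volumes of 50 GB each). Then, once the filer's dedup has freed space on the underlying volume:

  update stgpool FASPOOL maxscratch=60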

Any thoughts, ideas would be appreciated.

thanks Rob
 
I did some quick research on this out of curiosity. There isn't much out there and it doesn't sound like a recommended approach. See below.

8 DEDUPLICATION AND TIVOLI STORAGE MANAGER (TSM)
If Tivoli Storage Manager (TSM) and NetApp deduplication for FAS will be used together, the following should be taken into consideration:
- Deduplication savings with TSM will not be optimal because TSM does not block-align data when it writes files out to its volumes. The net result is that there are fewer duplicate blocks available to deduplicate.
- TSM compresses files backed up from clients to preserve bandwidth. Compressed data does not usually yield good savings when deduplicated.
- TSM client-based encryption will result in data with no duplicates. Encrypted data does not usually yield good savings when deduplicated.
- TSM's progressive backup methodology backs up only new or changed files, which reduces the number of duplicates, since there are not multiple full backups to consider.

Source: NetApp Deduplication for FAS (TR-3505), http://contourds.com/uploads/file/tr-3505.pdf

As such, I would personally suggest using TSM's native deduplication if possible, e.g. along the lines below.
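For anyone who hasn't set that up: TSM's native server-side dedup is enabled per FILE storage pool. A minimal sketch with made-up names, assuming TSM 6.1 or later:

  define devclass DEDUPFILE devtype=file mountlimit=32 maxcapacity=50G directory=/tsm/dedup
  define stgpool DEDUPPOOL DEDUPFILE maxscratch=200 deduplicate=yes identifyprocess=2

The identifyprocess value just controls how many duplicate-identification processes run for the pool; the space itself is reclaimed later during reclamation.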
 
Jonathan, thanks for your advice. I did see good dedupe/compression savings for certain data types. However, unless the data could be kept on the FAS for both primary and copy pools, migration to tape was too slow: rehydration, I guess. That, coupled with the management headache, meant we decided not to use it for this TSM instance.
However, I am planning to use it for another TSM instance where we will have enough space to keep all the data on the FAS. I'll let you know if the savings make it worthwhile.
regards Rob
 
However, unless the data could be kept on the FAS for both primary and copy pools, migration to tape was too slow: rehydration, I guess.

I kind of doubt that. Using the filer's deduplication means that TSM knows nothing about it, right? Copying or migrating data to a tape pool (via TSM, that is) would indeed require a form of rehydration, but at the storage level, where it is just following different pointers on the disks. There is a performance hit, and it slows down even sequential reads, but not to a degree that causes trouble. You can easily max out several LTO-5 streams even with a small SATA-only filer serving deduped data.

And regarding the use of a NetApp filer as storage for TSM: absolutely fine. I'd suggest using CIFS, however; coupled with a 10GbE connection it is both easy to use and manage and performs pretty decently, with no iSCSI or FC required. Thin provisioning is standard for us anyway (something like the sketch below)...
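On the filer side that setup is only a few commands. A rough sketch in 7-Mode syntax; the volume, aggregate, and share names are made up:

  vol create tsm_vol aggr1 2t
  vol options tsm_vol guarantee none        (thin provisioning)
  cifs shares -add tsm_share /vol/tsm_vol
  sis on /vol/tsm_vol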

Deduplication and compression are features TSM does have, yes, but for TSM I much prefer doing it right down at the storage level.
 
Hello

I have a primary storage pool on a FAS, and I know that you should not enable TSM deduplication and NetApp deduplication at the same time.

But in my case I don't use TSM deduplication, so is it possible to just enable NetApp deduplication? Does it work?

Best
 
I have been using Data Domain as back-end storage for over 9 years now, with TSM set to no compression so that the Data Domain does the deduplication (the relevant settings are below).
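For reference, turning off client compression is a one-line setting; a sketch, where MYNODE is a made-up node name:

  compression no                       (in the client's dsm.sys or dsm.opt)

or, forced from the server for a given node:

  update node MYNODE compression=no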

If you enable deduplication on the TSM side, you will end up using more space (with Data Domain) on your back-end storage. NetApp, which we also have, does not deduplicate as much, but it still has a small edge over native TSM deduplication.

Do not turn on back-end storage deduplication if TSM deduplication is on. It is a waste of processing cycles, and the back end will not deduplicate any further. Needless to say, been there, done that.

Personally, if one is certain to use native TSM deduplication, I would not go for expensive back-end storage like NetApp or Data Domain. I would just buy highly reliable, replicating, cluster-aware JBOD. SAN arrays are possible candidates.
 
I know what you mean, moon, but I do not choose my infrastructure :-(

After running several tests, NetApp's inline deduplication works very badly with a TSM storage pool.

NetApp's deduplication runs on a schedule, and it needs the volume not to be in use (the schedule is set per volume, as below).
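For what it's worth, here is how the schedule is set; a sketch with a made-up volume name, in 7-Mode syntax:

  sis config -s sun-sat@23 /vol/tsm_vol     (run every night at 23:00)
  sis start -s /vol/tsm_vol                 (one-off run, scanning existing data)
  sis status /vol/tsm_vol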

I also have a Data Domain; even with the best tuning it is very slow, because the speed of each stream is limited on Data Domain. Moreover, you cannot do instant restore with VMware, because it is too slow, at least for me.

I think source-side dedup with TSM is the best; the object storage appliances are not bad either, if you use container storage pools.
 