VTL / deduplication / server capabilities

Fattire (ADSM.ORG Member)
Currently running TSM 5.5 on Windows 2003 with 8 GB RAM and roughly 800 GB of disk, plus a 3583 LTO2 library. My boss would like to upgrade to TSM 6.1, replace the tape library with a VTL, and deduplicate.
We currently back up 50 nodes per night, averaging around 300 GB total. He asked whether we would need to set up another TSM instance to handle the dedupe processing or if the one server can handle it all. I'm not sure, and I haven't seen it discussed here. My boss wants to go with a VTL because of the DR capabilities.
I guess I just want to know what my options are.
Thanks for any advice or links.
john.m
 
Well, if you want to use the dedup option in TSM, forget the VTL. Dedup is only done with a FILE device class, and a VTL emulates tapes.

You don't need to add another instance just for dedup on a 300 GB backup night with 50 nodes, as long as you have a good server and it's tuned correctly.

DR can easily be handled with a FILE device class on disk if you have storage at both sites, same as with VTLs, and low-cost SATA may well be cheaper than a VTL.
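
For what it's worth, the FILE route on the admin command line looks roughly like this (the device class name, pool name, directory, and sizes below are just placeholders):

/* placeholder names, directory, and sizes - adjust for your environment */
/* sequential-access FILE device class on local or SATA disk */
define devclass filedev devtype=file directory=d:\tsmfile maxcapacity=4g mountlimit=20

/* primary storage pool on that device class with deduplication enabled */
define stgpool filepool filedev maxscratch=200 deduplicate=yes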
 
In the 6.1 Admin Guide, the data deduplication section talks about deduplicating and reclaiming volumes, so I was under the impression it had to involve virtual tape volumes.

A volume does not have to be full before duplicate identification starts. In the second phase, duplicate data is removed by any of the following processes:
- Reclaiming volumes in the primary-storage pool, copy-storage pool, or active-data pool
- Backing up a primary-storage pool to a copy-storage pool that is also set up for deduplication
- Copying active data in the primary-storage pool to an active-data pool that is also set up for deduplication
- Migrating data from the primary-storage pool to another primary-storage pool that is also set up for deduplication
- Moving data from the primary-storage pool to a different primary-storage pool that is also set up for deduplication, moving data within the same copy-storage pool, or moving data within the same active-data pool
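
If I'm reading that right, phase one is the IDENTIFY DUPLICATES process, and the space only comes back once volumes are reclaimed. Something like this, with a made-up pool name and numbers:

/* placeholder pool name and values */
/* phase 1: background processes that find duplicate chunks in the pool */
identify duplicates filepool numprocess=2 duration=60

/* phase 2: duplicate data is actually removed when volumes are reclaimed */
reclaim stgpool filepool threshold=50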
 
Data deduplication is a method of eliminating redundant data in sequential-access disk (FILE) primary, copy, and active-data storage pools. One unique instance of the data is retained on storage media, and redundant data is replaced with a pointer to the unique data copy. The goal of deduplication is to reduce the overall amount of time that is required to retrieve data by letting you store more data on disk, rather than on tape.

http://publib.boulder.ibm.com/infocenter/tsminfo/v6/index.jsp

Look under the DEFINE STGPOOL section.
>--+-----------------------------+------------------------------>
   '-DEDUPlicate--=--+-No------+-'
                     |     (5) |
                     '-Yes-----'

Notes:
  1. This parameter is not available for storage pools that use the data formats NETAPPDUMP, CELERRADUMP, or NDMPDUMP.
  2. This parameter is not available or is ignored for Centera storage pools.
  3. The RECLAMATIONTYPE=SNAPLOCK setting is valid only for storage pools defined to servers that are enabled for System Storage™ Archive Manager. The storage pool must be assigned to a FILE device class, and the directories specified in the device class must be NetApp SnapLock volumes.
  4. The values NETAPPDUMP, CELERRADUMP, and NDMPDUMP are not valid for storage pools defined with a FILE-type device class.
  5. This parameter is valid only for storage pools that are defined with a FILE-type device class.
  6. This parameter is available only when the value of the DEDUPLICATE parameter is YES.
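
So on an existing FILE pool you could just flip it on and check it; the pool name here is made up:

/* placeholder pool name */
/* enable deduplication on an existing FILE-devclass storage pool */
update stgpool filepool deduplicate=yes

/* verify the setting */
query stgpool filepool format=detailed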

That was the biggest downer for me: no dedup on an LTO device class, so no dedup on the VTL.

THE_WIPET
N.B.: Today is Guinness's 250th birthday. Let's all raise our Guinness pints and say "To Arthur".
 
So one would need to keep all the backed-up, deduped data on disk and eliminate tapes and virtual tapes? If I wanted to move the data to tape or virtual tape, it would un-dedupe? Am I reading that correctly?
 
Yes, you read that correctly.

And to add: at the moment, I think the only software that keeps data deduplicated from disk to tape is CommVault Simpana 8.

All the rest have to "un-dedup" (rehydrate) onto the tapes.
 
Why not just get a VTL with dedupe capabilities, then? Maybe the cost is an issue, which I could understand.
 
Word on the street is that dedupe is only good for about a 10% reduction.

Instead of focusing on dedup, we've started using client-side compression for our SATA storage pools. We've had very good results.
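
If it helps anyone, compression can also be forced from the server per node instead of editing each client's option file (node name here is made up):

/* placeholder node name - forces client-side compression regardless of the client option */
update node somenode compression=yes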
 
Not sure what street that was, but we're getting great compression with hardware and are testing dedupe on top of that for an expected 10x gain!
 
Deduplication Rates

We are currently using an EMC DL3000; here are our deduplication rates. I have also tested the HP VLS 9000, Data Domain, and Avamar, and there are many differences. Don't worry so much about deduplication rates; you should get about 7-10x on TSM, but it depends on retention. We run 90-day retention. We have only had it about 1-2 months so far, and it works just fine. Also, the DL3000 can run as a VTL, a NAS box, or both at the same time.

We have two VTLs:

VTL00 handles Oracle and DB2
VTL01 handles AvePoint (SharePoint), SQL, and OS backups

VTL00: deduplication rate is 5.55x
VTL01: deduplication rate is 8.00x

We replicate to the opposite data center and back up between 7 and 15 TB per night in total.

As far as rehydration goes, it depends on the software. Data Domain rehydration is faster because its dedup is CPU- and memory-based, so it only writes and reads deduplicated data; that is how it can achieve the same write and read rates.
 