ADSM-L

Re: [ADSM-L] tsm and data domain

From: Nick Laflamme <dplaflamme AT GMAIL DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Fri, 17 Jun 2011 06:41:31 -0500
On Jun 16, 2011, at 8:34 PM, Paul Zarnowski wrote:

> At 05:59 PM 6/16/2011, Nick Laflamme wrote:
>> We need to do a bake-off -- or study someone else's -- between using 
>> deduplication in a DataDomain box and using both client-side deduplication 
>> and server-side deduplication in TSM V6 and then writing to relatively 
>> inexpensive, relatively simple (but replicating) storage arrays. However, we 
>> keep pushing the limits of stability with our TSM V6 servers, so we haven't 
>> dared to try such a bake-off yet. 
> 
> Nick,
> 
> We are heading down this path.  My analysis is that in a TSM environment, the 
> fairly low dedup ratio does not justify the higher price of deduping VTLs.  
> Commodity disk arrays have gotten very inexpensive.  We're using DS3500s 
> which are nice building blocks.  We put some behind IBM SVCs for servers, 
> some attached to TSM or Exchange servers (without SVC).  Common technology - 
> different uses.  We use them for both TSM DB, LOG and FILE (different RPM 
> disks, obviously).  Using cheap disk vs VTLs has different pros and cons.  
> Using disk allows for source-mode (client-side) dedup, which a VTL will not 
> do.  VTLs, on the other hand, allow for global dedup pools and LAN-free 
> virtual tape targets.  Deduping VTLs will be more effective in TSM 
> environments where you have known duplicate data, such as lots of Oracle or 
> MSSQL full backups, or other cases where you have multiple full backups.  For 
> normal progressive incremental file backups, however, TSM already does a good 
> job of reducing data so VTL dedup doesn't get you as much, and in this case 
> IMHO cheap disk is, well, cheaper and gets you source-mode dedup as well.
> 
> We are in the process of implementing this, but I know a few others are a bit 
> further along.

It's too bad there isn't a USA-based users group for TSM that meets annually or 
even semi-annually for things like user presentations and panels on topics like 
this. :-) I'd love to go to a "TSM Workshop" at some university campus to geek 
out on topics like this. 

Dedupe ratios are all over the place for us. We've got some in the high teens, 
and I already mentioned the low end: low single digits. Part of me wishes we'd 
broken our library volumes out into smaller replication pools (and 
corresponding TSM library pools) so we could get a little more granularity on 
dedupe ratios, but what I really want is volume-by-volume (or file-by-file for 
NFS mounts? Directory-by-directory?) dedupe ratios. 
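The granularity I'm after is just the logical-to-physical reduction per volume. A minimal sketch of that arithmetic, with entirely made-up volume names and byte counts (the DDR only reports this at the pool level, which is the whole complaint):

```python
# Illustrative only: per-volume dedupe ratios computed from hypothetical
# logical (pre-dedupe) and physical (post-dedupe) byte counts.
# Volume names and figures below are invented for the example.

# Hypothetical (logical_bytes, physical_bytes) per library volume
volumes = {
    "VOL001": (1_200_000_000_000, 80_000_000_000),   # e.g. repeated full backups
    "VOL002": (900_000_000_000, 700_000_000_000),    # e.g. progressive incrementals
}

def dedupe_ratio(logical: int, physical: int) -> float:
    """Return the logical:physical reduction ratio (15.0 means 15:1)."""
    return logical / physical if physical else float("inf")

for name, (logical, physical) in sorted(volumes.items()):
    print(f"{name}: {dedupe_ratio(logical, physical):.1f}:1")
```

With numbers like these, the spread between a 15:1 volume and a 1.3:1 volume is exactly the kind of thing a single pool-level ratio hides.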

Adding a copy storage pool on the same DDR is cheap from a DDR point of view; 
it seems to dedupe great when I do that. The TSM DB size becomes the limiting 
factor in that case. 

> We will continue to use TSM Backup Stgpool to replicate offsite.

We're replicating at the HW level; TSM doesn't know it. That makes me a little 
nervous.

I forgot one thing I don't like: having specific "cleaning" cycles on the DDR 
is annoying. We run it two or three times a week on a couple of our DDRs to 
keep utilization from climbing above 80%, but I wish it just did its own 
garbage collection continuously. 

> ..Paul

Nick