ADSM-L

Re: [ADSM-L] Deduplication/replication options

2013-07-23 18:32:59
Subject: Re: [ADSM-L] Deduplication/replication options
From: Nick Laflamme <nick AT LAFLAMME DOT US>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Tue, 23 Jul 2013 17:30:57 -0500
I'm surprised by Allen's comments, given the context of the list.

TSM doesn't support BOOST. It doesn't support it at the server level, and
it doesn't support it for a client writing directly to a Data Domain DDR.
This may be obvious to everyone, but I fear for the people who are
TSM-centric and haven't gotten to the point of bypassing TSM in some
instances.
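
For anyone unfamiliar with what source-side dedupe of the BOOST kind buys
you, here is a toy sketch in Python. It is not EMC's protocol or anything
TSM ships; the DedupeTarget class, the backup and restore functions, and
the fixed 4-byte chunk size are all hypothetical, chosen only to show the
idea: the client fingerprints its data, asks the target which chunks it
lacks, and sends only those.

```python
import hashlib

CHUNK_SIZE = 4  # toy fixed-size chunks (assumption); real products chunk differently


class DedupeTarget:
    """Hypothetical stand-in for the dedupe device: keeps each unique chunk once."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes

    def missing(self, fingerprints):
        """Return the set of fingerprints the target does not hold yet."""
        return {fp for fp in fingerprints if fp not in self.chunks}

    def store(self, fingerprint, chunk):
        self.chunks[fingerprint] = chunk


def split(data, size=CHUNK_SIZE):
    """Cut the data into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def backup(data, target):
    """Send only the chunks the target lacks.

    Returns (payload_bytes_sent, recipe), where recipe is the ordered list
    of fingerprints needed to rebuild the data. Fingerprint traffic itself
    is not counted in payload_bytes_sent.
    """
    chunks = split(data)
    recipe = [hashlib.sha256(c).hexdigest() for c in chunks]
    needed = target.missing(recipe)
    sent = 0
    for fp, chunk in zip(recipe, chunks):
        if fp in needed:
            target.store(fp, chunk)
            needed.discard(fp)  # a chunk repeated within this backup is sent once
            sent += len(chunk)
    return sent, recipe


def restore(recipe, target):
    """Rebuild the original data from its recipe of fingerprints."""
    return b"".join(target.chunks[fp] for fp in recipe)
```

With an empty target, backing up b"AAAABBBBCCCCAAAA" sends 12 of its 16
bytes, because the repeated AAAA chunk goes over the wire only once. A
later backup of b"AAAABBBBDDDD" to the same target sends just the 4 bytes
of DDDD. The fingerprinting runs on the client, which is the CPU cost that
gets traded for the network savings.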

Also, BOOST is a feature with a price, just as VTL support and replication
are.  I'm not saying that's bad, only that you have to factor that in.

As for us, we don't use copy storage pools with our Data Domains; we make
sure our TSM servers use replicated disk and write to replicated storage,
and if we ever lose our primary data center, we'll restore at our DR site
and pick up with the replicated DDR storage. As others have noted, this leaves
us vulnerable to any corruption due to DDR crashes or server crashes that
confuse the DDR, but management signed off on that. We briefly experimented
with running copy pools on the same DDR to have diversity in how data was
arranged, but the growth in the size of our TSM databases and a
surprisingly poor dedupe rate for a second copy on the same DDR doomed that
initiative.

Nick



On Tue, Jul 23, 2013 at 3:12 PM, Allen S. Rout <asr AT ufl DOT edu> wrote:

> On 07/23/2013 01:19 PM, Sergio O. Fuentes wrote:
> >
> > We're currently faced with a decision: go with a dedupe storage array
> > or with TSM dedupe for our backup storage targets.  There are some
> > very critical pros and cons to going with one or the other.  For
> > example, TSM dedupe will reduce overall network traffic both for
> > backups and replication (source-side dedupe would be used).  A dedupe
> > storage array won't do that for backup,
>
>
> Not so.  There's a driver-ish package from EMC, associated with the
> Data Domain product line, called "boost".  Boost shoves dedupe work
> from the central device out to the client box, distributing CPU work
> and saving network traffic.   There may be other similar offerings,
> but Data Domain is what we've got, so it's what I know.
>
> We're not using boost;  our primary use for the DD is for Oracle
> backups, and our DBAs are far more interested in the conventional
> filesystem user interface than they are in the network savings.   But
> if you find the bandwidth between client and device to be a serious
> bottleneck, there's an option.
>
>
> > Replication is key. We have two datacenters where I would love it if
> > TSM replication could be used in order to quickly (still manually,
> > though) activate the replication server for production if necessary.
> > Having a dedupe storage array kind of removes that option, unless we
> > want to replicate the whole rehydrated backup data via TSM.
>
> I intend to go the same direction you are intending to go.   But I'm
> not there yet.  I hope to have some results on this before September.
>
>
> > Would it make sense to do a hybrid deployment (a combination of TSM
> > dedupe and array dedupe)?  Any thoughts, tales of woe, or
> > forewarnings are appreciated.
>
> Only thoughts, not tales yet.  But I'm planning to experiment with
> dedupe both at the TSM level and at the storage array level.   I've
> heard several rumors that the Data Domain can dedupe even
> already-deduped backups (e.g. Veeam) with very good ratios.   I'm
> going to test a similar theory with the DD and TSM-deduped stgpools.
>
>
> - Allen S. Rout
>