Re: [ADSM-L] Data Deduplication

2007-08-30 03:11:16
Subject: Re: [ADSM-L] Data Deduplication
From: Curtis Preston <cpreston AT GLASSHOUSE DOT COM>
Date: Thu, 30 Aug 2007 03:09:09 -0400
When you say "losing TSM opps to de-dupe vendors," you must be talking
about de-dupe SOFTWARE vendors (Avamar, Puredisk, Asigra).  I don't see
how someone buying a de-dupe VTL to go with TSM would be considered a
lost TSM opportunity.

Unlike a de-dupe VTL that can be used with TSM, de-dupe backup software
would replace TSM (or NBU, NW, etc) where it's used.  De-dupe backup
software takes TSM's progressive incremental much further, only backing
up new blocks/fragments/pieces of data that have never been seen by the
backup server.  This makes de-dupe backup software really great at
backing up remote offices.
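
To make the "only new blocks cross the wire" idea concrete, here is a
minimal sketch in Python.  It is not how Avamar, Puredisk, or Asigra
actually work: the fixed 4 KB chunk size, the SHA-256 fingerprint, and
the BackupServer, backup, and restore names are all assumptions made up
for illustration.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, chosen only for illustration


class BackupServer:
    """Hypothetical central store that keeps one copy of each chunk."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes

    def has(self, digest):
        return digest in self.chunks

    def put(self, digest, chunk):
        self.chunks[digest] = chunk


def backup(data, server):
    """Send only chunks the server has never seen.

    Returns (recipe, bytes_sent), where recipe is the ordered list of
    fingerprints needed to rebuild the data.
    """
    recipe = []
    bytes_sent = 0
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        if not server.has(digest):
            # Only previously unseen chunks are transferred.
            server.put(digest, chunk)
            bytes_sent += len(chunk)
        recipe.append(digest)
    return recipe, bytes_sent


def restore(recipe, server):
    """Rebuild the original data from its recipe."""
    return b"".join(server.chunks[digest] for digest in recipe)
```

In this sketch, a second backup of mostly unchanged data sends only the
chunks that differ, which is why so little has to travel over the WAN
from a remote office.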

The alternative is to put a complete backup infrastructure (server,
tape, disk, etc) at the remote site and have someone swap tapes out
there.  That's been the only answer for years.  Now, de-dupe backup
software allows you to back up relatively large remote offices with NO
backup infrastructure at the remote site.  That's nothing short of huge.

I know of a major trading firm, for example, that is now backing up
almost 300 remote sites to their central datacenter without putting any
backup infrastructure in any of them.  Since a 48 hour RTO was fine for
their remote offices, they do restores locally in the central datacenter
and Fed Ex the restored systems/drives to the remote office.

W. Curtis Preston
Backup Blog @
VP Data Protection, GlassHouse Technologies 

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Kelly Lipp
Sent: Wednesday, August 29, 2007 12:41 PM
Subject: Re: [ADSM-L] Data Deduplication

I'd like to steer this around a bit.  Our sales folks are saying they
are losing TSM opportunities to de-dup vendors.  What specific business
problem are customers trying to solve with de-dup?

I'm thinking the following:

1. Reduce the amount of disk/tape required to store backups.
Especially important for an all-disk backup solution.
2. Reduce backup times (for source de-dup I would think.  No benefit in
target de-dup for this).
3. Replication of backup data across a wide area network.  Obviously if
you have less stored you have less to replicate.

Others?  Relative importance of these?

Does TSM in and of itself provide similar benefits in its natural state?
From this discussion adding de-dup at the backend does not necessarily
provide much though it does for the other traditional backup products.
Since we don't dup, we don't need to de-dup.

Help me get it because aside from the typical "I gotta have it because
the trade rags tell me I gotta have it", I don't get it!

Thanks, (Once again not afraid to expose my vast pool of ignorance...) 

Kelly J. Lipp
VP Manufacturing & CTO
STORServer, Inc.
485-B Elkton Drive
Colorado Springs, CO 80907
lipp AT storserver DOT com

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Curtis Preston
Sent: Wednesday, August 29, 2007 1:08 PM
Subject: Re: [ADSM-L] Data Deduplication

>As de-dup, from what I have read, compares across all files on a 
>"system" (server, disk storage or whatever), it seems to me that this 
>will be an enormous resource hog

Exactly.  To make sure everyone understands, the "system" is the
intelligent disk target, not a host you're backing up.  A de-dupe
IDT/VTL is able to de-dupe anything against anything else that's been
sent to it.  This can include, for example, a file in a filesystem and
the same file inside an Exchange Sent Items folder.
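
A rough sketch of how a target could match the same file inside two
different backup streams follows.  It assumes the file's bytes appear
verbatim in both streams, and it cuts chunks at content-defined
boundaries so that the chunks line up even when the file sits at a
different offset in each stream.  The window sum below is a toy
stand-in for the rolling hashes real products use, and WINDOW, MASK,
split_chunks, and DedupeTarget are hypothetical names.

```python
import hashlib

WINDOW = 16  # bytes examined when deciding on a boundary (assumed value)
MASK = 0x3F  # cut when the low 6 bits of the window sum are zero;
             # gives tiny ~64-byte chunks on random data, for demo only


def split_chunks(data):
    """Cut data wherever the last WINDOW bytes satisfy the boundary test.

    The decision depends only on nearby content, not on the offset in
    the stream, so identical content produces identical chunks.
    """
    chunks = []
    start = 0
    for end in range(WINDOW, len(data) + 1):
        if (sum(data[end - WINDOW:end]) & MASK) == 0:
            chunks.append(data[start:end])
            start = end
    if start < len(data):
        chunks.append(data[start:])  # trailing remainder
    return chunks


class DedupeTarget:
    """Hypothetical disk target that de-dupes everything sent to it."""

    def __init__(self):
        self.store = {}    # fingerprint -> chunk bytes
        self.received = 0  # total bytes sent by backup servers

    def ingest(self, stream):
        self.received += len(stream)
        for chunk in split_chunks(stream):
            # Keep a chunk only the first time its fingerprint is seen.
            self.store.setdefault(hashlib.sha256(chunk).digest(), chunk)

    def stored(self):
        return sum(len(chunk) for chunk in self.store.values())
```

Note that the target still receives every byte; the savings show up in
what it stores, not in what crosses the network.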

>The de-dup technology only compares / looks at the files within its 
>specific repository.  Example: We have 8 ProtecTier nodes in one data
>center, which equates to 8 Virtual Tape Libraries and 8 repositories.

There are VTL/IDT vendors that offer a multi-head approach to
de-duplication.  As you need more throughput, you buy more heads, and
all heads are part of one large appliance that uses a single global
de-dupe database.  That way you don't have to worry about which
backups go to which heads.  Diligent's VTL Open is a multi-headed VTL,
but ProtecTier is not -- yet.  I would ask them their plans for that.

While this feature is not required for many shops, I think it's a very
important feature for large shops.
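
As a rough illustration of why a single global de-dupe database
matters, the sketch below compares two heads with separate
repositories against two heads sharing one.  The Repository and Head
classes, the total_stored helper, and the fixed 4 KB chunking are
assumptions for illustration, not any vendor's design.

```python
import hashlib


class Repository:
    """Hypothetical de-dupe repository: one copy per unique chunk."""

    def __init__(self):
        self.chunks = {}  # fingerprint -> chunk bytes

    def store(self, chunk):
        self.chunks.setdefault(hashlib.sha256(chunk).hexdigest(), chunk)

    def stored_bytes(self):
        return sum(len(chunk) for chunk in self.chunks.values())


class Head:
    """Hypothetical VTL head that writes into a repository."""

    def __init__(self, repository):
        self.repository = repository

    def ingest(self, data, chunk_size=4096):
        for offset in range(0, len(data), chunk_size):
            self.repository.store(data[offset:offset + chunk_size])


def total_stored(heads):
    """Sum stored bytes, counting a shared repository only once."""
    repositories = {id(h.repository): h.repository for h in heads}
    return sum(r.stored_bytes() for r in repositories.values())
```

If the same 8 KB of data goes to both heads, the separate-repository
setup stores it twice and the shared-repository setup stores it once,
so with a global database it no longer matters which head a backup
lands on.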
