Re: [ADSM-L] Data Deduplication

2007-08-29 16:33:03
Subject: Re: [ADSM-L] Data Deduplication
From: Charles A Hart <charles_hart AT UHC DOT COM>
Date: Wed, 29 Aug 2007 15:31:13 -0500
You're correct that there are products that can provide a more global
repository.  We used Diligent VTF Open in a 2-node cluster and achieved
a 1200 MB/s write speed! Impressive, so if you don't need the de-dup, the
VTF Open product really screams.

In one of a few large data centers we see 25 TB per night (FS incremental
and full DB backups), so the clustering feature that Diligent is working on
is huge for us. We will not be the first ones to bleed from it, though, as
we've already shed some blood, but you have to expect that when you are
working with new technologies.  I think the cluster feature for ProtecTier
is due first quarter '08, but it's been a moving target for a year now.

Many of you have heard me speak about Diligent's product; we do also have
some older EMC CDLs and one Data Domain at a remote site.  We did an RFP for
virtual tape solutions 18 months ago and landed on the Diligent ProtecTier
because it was the only de-dupe VTL head that was accessible via Fibre
Channel.  Correct me if I'm wrong, but Data Domain uses an IP NFS mount to
access the repository, which just wouldn't scale in our environment. Nothing
against any of the other VTL / de-dupers... It will be interesting to see how
the "out-of-band" de-duping VTL (FalconStor) pans out.  It's either pay the
performance hit of de-duping up front or touch the data twice.

Charles Hart

Curtis Preston <cpreston AT GLASSHOUSE DOT COM>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
08/29/2007 02:07 PM

Re: [ADSM-L] Data Deduplication

>As de-dup, from what I have read, compares across all files
>on a "system" (server, disk storage or whatever), it seems
>to me that this will be an enormous resource hog

Exactly.  To make sure everyone understands, the "system" is the
intelligent disk target, not a host you're backing up.  A de-dupe
IDT/VTL is able to de-dupe anything against anything else that's been
sent to it.  This can include, for example, a file in a filesystem and
the same file inside an Exchange Sent Items folder.
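
As a minimal sketch of that idea: the toy store below keeps each unique
chunk once, keyed by its hash, no matter which backup stream delivered it.
The class name, the fixed 4 KB chunking, and the object names are all
assumptions for illustration; real appliances use their own chunking and
matching methods.

```python
import hashlib


class DedupeStore:
    """Toy content-addressed store: each unique chunk is kept once."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size   # fixed-size chunking is an assumption
        self.chunks = {}               # SHA-256 digest -> chunk bytes
        self.catalog = {}              # object name -> ordered list of digests

    def write(self, name, data):
        digests = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            self.chunks.setdefault(digest, chunk)  # store only if unseen
            digests.append(digest)
        self.catalog[name] = digests

    def read(self, name):
        # Rebuild the object from its chunk references.
        return b"".join(self.chunks[d] for d in self.catalog[name])

    def stored_bytes(self):
        return sum(len(c) for c in self.chunks.values())


store = DedupeStore()
attachment = b"quarterly report " * 1000

# The same bytes arrive twice: once from a filesystem backup and once
# from inside a mail store backup (hypothetical names).
store.write("fileserver:/home/bob/report.doc", attachment)
store.write("exchange:bob/Sent Items/report.doc", attachment)

print("logical bytes:", 2 * len(attachment))
print("stored bytes: ", store.stored_bytes())
```

The second write adds only catalog references, so the stored size stays at
one copy of the data while both objects remain readable.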

>The de-dup technology only compares / looks at the files within its
>specific repository.  Example: We have 8 ProtecTier nodes in one data
>center, which equates to 8 Virtual Tape Libraries and 8 repositories.

There are VTL/IDT vendors that offer a multi-head approach to
de-duplication.  As you need more throughput, you buy more heads, and
all heads are part of one large appliance that uses a single global
de-dupe database.  That way you don't have to worry about which
backups go to which heads.  Diligent's VTL Open is a multi-headed VTL,
but ProtecTier is not -- yet.  I would ask them their plans for that.
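
To illustrate the difference, here is a small sketch comparing per-head
repositories with a single global de-dupe database. The workload and
function names are hypothetical, and each byte string stands in for one
chunk of backup data.

```python
import hashlib


def digest(chunk):
    return hashlib.sha256(chunk).hexdigest()


def chunks_stored_per_head(heads):
    """Each head keeps its own index, so duplicates across heads are kept."""
    return sum(len({digest(c) for c in backups}) for backups in heads)


def chunks_stored_global(heads):
    """All heads share one index, so a chunk is kept once appliance-wide."""
    return len({digest(c) for backups in heads for c in backups})


# Hypothetical workload: two heads receive backups from different clients,
# but the chunks b"os-image" and b"mail-db" show up on both.
head_a = [b"os-image", b"mail-db", b"client1-data"]
head_b = [b"os-image", b"mail-db", b"client2-data"]

per_head = chunks_stored_per_head([head_a, head_b])
shared = chunks_stored_global([head_a, head_b])

print("separate repositories:", per_head, "chunks stored")
print("global repository:    ", shared, "chunks stored")
```

With separate repositories the common chunks are stored once per head, so
which backups land on which head affects the de-dupe ratio; with a global
database it does not.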

While this feature is not required for many shops, I think it's a very
important feature for large shops.
