Re: [ADSM-L] Data Deduplication

2007-08-29 15:09:52
Subject: Re: [ADSM-L] Data Deduplication
From: Curtis Preston <cpreston AT GLASSHOUSE DOT COM>
Date: Wed, 29 Aug 2007 15:07:46 -0400
>As de-dup, from what I have read, compares across all files
>on a "system" (server, disk storage or whatever), it seems
>to me that this will be an enormous resource hog

Exactly.  To make sure everyone understands, the "system" is the
intelligent disk target, not a host you're backing up.  A de-dupe
IDT/VTL is able to de-dupe anything against anything else that's been
sent to it.  This can include, for example, a file in a filesystem and
the same file inside an Exchange Sent Items folder.
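
As a rough illustration of de-duping "anything against anything," here
is a minimal sketch, not how any particular IDT/VTL actually works.  It
fingerprints fixed-size chunks with SHA-256 and stores each unique chunk
only once.  The DedupeTarget class, the 4096-byte chunk size, and the
stream names are all hypothetical.  Real products generally use smarter
(often variable-size) chunking so that a file embedded inside an
Exchange store still lines up with the filesystem copy.

```python
import hashlib


class DedupeTarget:
    """Toy intelligent disk target: stores each unique chunk once."""

    def __init__(self, chunk_size=4096):
        self.chunk_size = chunk_size  # hypothetical fixed chunk size
        self.chunks = {}              # fingerprint -> chunk bytes
        self.streams = {}             # stream name -> ordered fingerprints

    def write(self, name, data):
        """Store a backup stream; return how many new bytes were stored."""
        new_bytes = 0
        recipe = []
        for i in range(0, len(data), self.chunk_size):
            chunk = data[i:i + self.chunk_size]
            fp = hashlib.sha256(chunk).hexdigest()
            if fp not in self.chunks:
                # First time this content has been seen from ANY source.
                self.chunks[fp] = chunk
                new_bytes += len(chunk)
            recipe.append(fp)
        self.streams[name] = recipe
        return new_bytes

    def read(self, name):
        """Rebuild a stream from its list of chunk fingerprints."""
        return b"".join(self.chunks[fp] for fp in self.streams[name])


target = DedupeTarget()
attachment = b"quarterly report " * 1000

# Same bytes arrive twice, from two unrelated backup sources.
first = target.write("filesystem:/home/report.doc", attachment)
second = target.write("exchange:sent-items/report.doc", attachment)
print(first, second)
```

The second write stores no new data, because every chunk is already in
the target's index, yet both streams can still be read back in full.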

>The de-dup technology only compares / looks at the files within its
>specific repository.  Example: We have 8 ProtecTier nodes in one data
>center, which equates to 8 Virtual Tape Libraries and 8 repositories.

There are VTL/IDT vendors that offer a multi-head approach to
de-duplication.  As you need more throughput, you buy more heads, and
all heads are part of one large appliance that uses a single global
de-dupe database.  That way you don't have to worry about which
backups go to which heads.  Diligent's VTL Open is a multi-headed VTL,
but ProtecTier is not -- yet.  I would ask them their plans for that.
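
To make the difference concrete, here is a small sketch comparing
separate per-head repositories with heads that share one global
de-dupe index.  It is only an illustration under simple assumptions:
the Head class and the three-chunk "backup" are made up, and a Python
set stands in for the de-dupe database.

```python
import hashlib


def fingerprint(chunk):
    """Content fingerprint used as the de-dupe key."""
    return hashlib.sha256(chunk).hexdigest()


class Head:
    """Hypothetical VTL head that ingests backups against an index."""

    def __init__(self, index):
        self.index = index  # set of fingerprints already stored

    def ingest(self, chunks):
        """Return the number of chunks that had to be stored as new."""
        stored = 0
        for chunk in chunks:
            fp = fingerprint(chunk)
            if fp not in self.index:
                self.index.add(fp)
                stored += 1
        return stored


backup = [b"chunk-a", b"chunk-b", b"chunk-c"]

# Separate repositories: each head keeps its own private index.
isolated = [Head(set()), Head(set())]
isolated_stored = sum(head.ingest(backup) for head in isolated)

# Global de-dupe database: every head consults the same index.
shared_index = set()
clustered = [Head(shared_index), Head(shared_index)]
clustered_stored = sum(head.ingest(backup) for head in clustered)

print(isolated_stored, clustered_stored)  # prints "6 3"
```

With separate repositories the same backup sent to two heads is stored
twice; with a shared index it is stored once no matter which head
receives it.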

While this feature is not required for many shops, I think it's a very
important feature for large shops.

