Re: [ADSM-L] Dedupe

Subject: Re: [ADSM-L] Dedupe
From: "Strand, Neil B." <NBStrand AT LMUS.LEGGMASON DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Thu, 25 Jun 2009 09:09:09 -0400
Ditto on Lindsay's "it depends"

For my NetApp devices, observed NAS filesystem dedupe ranges from 10% to
70%, depending on the data.
VMware NFS shares typically show a good ratio. For our VM environment,
we split the OS apart from the data and paging space, as depicted below:
Filesystem              used          saved       %saved
/vol/PROD_VM_OS/        98314436      227793716      70%
/vol/PROD_VM_PAGING/    3107084       1090756        26%
/vol/PROD_VM_DATA1/     11253900      17343096       61%
/vol/DR_VM_OS1/         105852808     236518940      69%
/vol/DR_VM_DATA1/       431134632     216285060      33%
/vol/DR_VM_PAGING1/     35520         4272           11%

The paging space is very dynamic and I don't expect much savings.
The OS space (where VM operating systems are installed) is relatively
static and redundant and reflects that with high dedup ratios.
The data space (where applications and everything else is) has a wide
variance - as expected.
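
(For reference, the %saved column above works out to saved / (used + saved).
A quick Python sketch to recompute it - the numbers are copied verbatim from
the table, and the formula is my inference from how the percentages line up:)

# Recompute the %saved column from the used/saved figures above.
# Assumes %saved = saved / (used + saved); this matches every row reported.
volumes = {
    "/vol/PROD_VM_OS/":     (98314436,  227793716),
    "/vol/PROD_VM_PAGING/": (3107084,   1090756),
    "/vol/PROD_VM_DATA1/":  (11253900,  17343096),
    "/vol/DR_VM_OS1/":      (105852808, 236518940),
    "/vol/DR_VM_DATA1/":    (431134632, 216285060),
    "/vol/DR_VM_PAGING1/":  (35520,     4272),
}
for vol, (used, saved) in volumes.items():
    pct = 100.0 * saved / (used + saved)
    print(f"{vol:22s} {pct:3.0f}% saved")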

But the end result is that I am saving disk space and actually improving
overall performance: redundant data has a higher probability of residing
in cache, and the reference to a particular bit of redundant data has a
higher probability of residing in the cached lookup table.

If you are looking for dedupe on tape media, I don't think it is
feasible or desirable. Simple compression now allows me to put nearly
3TB on a single 3592 tape (again, depending on the data). At a nominal
cost of $150/tape, this works out to about 5 cents/GB. Not too shabby. I
make a second offsite copy of the same data, for an overall cost of
10 cents/GB, to provide a "five nines" probability that my company's data
is recoverable for the next 6 years. This is less than the cost of
electricity for disk-based storage over the same time period.
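
(The arithmetic behind those figures, for anyone checking - a minimal
Python sketch using the numbers stated above, ~3 TB per cartridge and
$150 per cartridge:)

# Back-of-the-envelope tape cost from the figures above.
tape_capacity_gb = 3000      # ~3 TB per 3592 cartridge with compression
tape_cost_usd = 150.0        # nominal cost per cartridge

cents_per_gb = 100.0 * tape_cost_usd / tape_capacity_gb
print(f"single copy:  {cents_per_gb:.0f} cents/GB")       # ~5 cents/GB
print(f"offsite copy: {2 * cents_per_gb:.0f} cents/GB")   # ~10 cents/GB total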

Dedupe has its place, as do most technologies. It is not a golden egg
unless you force it to be ... and then, when it hatches, it may be a
fine goose or it may be a platypus - it depends on your environment.


Cheers,
Neil Strand
Storage Engineer - Legg Mason
Baltimore, MD.
(410) 580-7491
Whatever you can do or believe you can, begin it.
Boldness has genius, power and magic.


-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Ochs, Duane
Sent: Thursday, June 25, 2009 7:35 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] Dedupe

In common practice, de-dup is not a tape-oriented process; it is usually
used to reduce data on disk.
One concern would be the number of tape mounts required to restore data
in the event of a DR scenario.
As the article stated, there are not many "global" de-dup products yet.
We have been able to implement some dedup for specific applications, for
instance e-mail attachments, and it has worked out fairly well. However,
the primary goal was to reduce the size of the Storage Groups of our
Exchange cluster, which sit on tier 1 storage, in the event of a DR
scenario; the de-duped attachments now reside on tier 2. It reduced our
SGs by a third. The Exchange SG backups are retained based on legal
requirements and replicated; the attachments are not.

I also tested Data Domain and was very unimpressed by the numbers I saw.
It had very little impact on our largest data sets: imaging, Exchange,
and DB dumps. But those are also the hardest types of data to de-dup.
My two cents.


-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
madunix
Sent: Wednesday, June 24, 2009 11:37 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: Dedupe

My thoughts on dedupe: it could be interesting for those who need to
decrease the number of tape cartridges, but they could pay a significant
CPU and I/O cost for dedupe processing. One issue I was thinking about
is failure: if one part is corrupted, many files would be affected by
the loss of a common chunk. And what about encryption - is dedupe
compatible with encryption?
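
(To illustrate the common-chunk concern: in chunk-level dedupe each unique
chunk is stored once and shared by every file that contains it, so losing
that one copy damages all of them. A minimal sketch - the chunk store and
file layout here are hypothetical, not any particular product:)

import hashlib
from collections import defaultdict

store = {}                    # chunk hash -> the single stored copy
files = defaultdict(list)     # file name -> ordered list of chunk hashes

def write(name, chunks):
    for chunk in chunks:
        h = hashlib.sha256(chunk).hexdigest()
        store.setdefault(h, chunk)    # dedupe: keep one copy per unique chunk
        files[name].append(h)

common = b"common OS block"
write("vm1.img", [common, b"vm1-specific data"])
write("vm2.img", [common, b"vm2-specific data"])

del store[hashlib.sha256(common).hexdigest()]   # the shared chunk is lost
damaged = [f for f, hs in files.items() if any(h not in store for h in hs)]
print(damaged)    # ['vm1.img', 'vm2.img'] - one lost chunk, two broken files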

Thanks
madunix

>> -----Original Message-----
>> From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf
>> Of lindsay morris
>> Sent: Wednesday, June 24, 2009 1:07 PM
>> To: ADSM-L AT VM.MARIST DOT EDU
>> Subject: Re: [ADSM-L] Dedupe
>>
>> Short and clear answer about de-dupe:
>>
>> It depends.
>>
>> Hope this helps.
>>
>> ------
>> Mr. Lindsay Morris
>> Principal
>> www.tsmworks.com
>> 919-403-8260
>> lindsay AT tsmworks DOT com
>>

