Veritas-bu

Re: [Veritas-bu] Dedupe ratios and checkpoint backups

2011-09-06 15:07:20
Subject: Re: [Veritas-bu] Dedupe ratios and checkpoint backups
From: scott.george AT parker DOT com
To: "veritas-bu AT mailman.eng.auburn DOT edu" <veritas-bu AT mailman.eng.auburn DOT edu>
Date: Tue, 6 Sep 2011 15:05:58 -0400
To answer your question, we tested both the NBU5000 (Symantec) and the DD860 (Data Domain), and the differences couldn't be more stark.  I was using a 30-minute checkpoint interval.  I never achieved anything better than 15:1 from the NBU5000 (which isn't bad).  The DD860 had reached 39:1 by the time I disabled all of the policies.  This was with daily full backups of an array of different servers (DB2, Windows file servers, UNIX file servers, and Siebel app servers) over the course of 2 months.
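For a sense of scale, the two ratios can be converted into the fraction of logical backup data that actually lands on disk. This is only arithmetic on the 15:1 and 39:1 figures quoted above; the helper name `stored_fraction` is made up for illustration.

```python
def stored_fraction(ratio):
    """Fraction of the logical backup data physically stored at ratio:1."""
    return 1.0 / ratio


# Ratios as reported in this message; nothing else is measured here.
reported = (("NBU5000", 15), ("DD860", 39))

for name, ratio in reported:
    frac = stored_fraction(ratio)
    print(f"{name}: {ratio}:1 -> stores {frac:.1%} of logical data, saves {1 - frac:.1%}")

# Relative disk footprint for the same logical data.
relative = stored_fraction(15) / stored_fraction(39)
print(f"15:1 needs {relative:.1f}x the physical capacity of 39:1")
```

In other words, 15:1 keeps about 6.7% of the logical data and 39:1 about 2.6%, so the lower ratio needs roughly 2.6 times the disk for the same backups.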

The NBU5000 is a fixed-block device; the DD860 is a variable-block device.  There is no way of knowing whether checkpoints were the culprit for the NBU5000's lower ratio, but it is a plausible theory.


From: scott.george AT parker DOT com
To: "veritas-bu AT mailman.eng.auburn DOT edu" <veritas-bu AT mailman.eng.auburn DOT edu>
Date: 09/06/2011 02:54 PM
Subject: Re: [Veritas-bu] Dedupe ratios and checkpoint backups
Sent by: veritas-bu-bounces AT mailman.eng.auburn DOT edu





If it has any effect at all, this would probably affect fixed-block solutions more than variable-block ones.  The variable-block solutions would continue to find identical blocks even when they show up at different positions in the data stream.
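A toy sketch of that reasoning, not any vendor's actual algorithm: it inserts a made-up marker into a byte stream and measures how much of the second stream looks new to a fixed-block chunker versus a content-defined (variable-block) chunker. The block size, window size, mask, marker bytes, and the use of CRC32 as a stand-in for a rolling hash are all assumptions chosen for illustration. Whether NetBackup checkpoints actually insert data into the stream this way is itself the open question in this thread.

```python
import hashlib
import random
import zlib

BLOCK = 64    # fixed block size in bytes (illustrative, far smaller than real products)
WINDOW = 16   # bytes examined when deciding on a variable-block boundary
MASK = 0x3F   # cut when the low 6 bits of the window hash are zero (~64-byte average)


def fixed_chunks(data):
    """Cut at fixed byte offsets, regardless of content."""
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def variable_chunks(data):
    """Cut wherever the hash of the preceding WINDOW bytes matches a pattern,
    so boundaries follow the content rather than the byte offset."""
    chunks, start = [], 0
    for i in range(WINDOW, len(data) + 1):
        if zlib.crc32(data[i - WINDOW:i]) & MASK == 0:
            chunks.append(data[start:i])
            start = i
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def new_fraction(chunker, first, second):
    """Fraction of `second` (by bytes) made of chunks not already seen in `first`."""
    seen = {hashlib.sha256(c).digest() for c in chunker(first)}
    new_bytes = sum(len(c) for c in chunker(second)
                    if hashlib.sha256(c).digest() not in seen)
    return new_bytes / len(second)


rng = random.Random(0)
original = bytes(rng.getrandbits(8) for _ in range(16384))
marker = b"<hypothetical-checkpoint>"  # made-up marker, 25 bytes
with_marker = original[:5000] + marker + original[5000:]

fixed_new = new_fraction(fixed_chunks, original, with_marker)
variable_new = new_fraction(variable_chunks, original, with_marker)
print(f"fixed-block:    {fixed_new:.1%} of the second stream looks new")
print(f"variable-block: {variable_new:.1%} of the second stream looks new")
```

With these settings the fixed-block pass should report roughly 70% of the second stream as new, because every block after the insertion point is shifted by 25 bytes. The variable-block pass should report only a small fraction, since its boundaries line up with the content again shortly after the marker.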


From: "Stafford, Geoff" <GStafford AT barclaycardus DOT com>
To: "veritas-bu AT mailman.eng.auburn DOT edu" <veritas-bu AT mailman.eng.auburn DOT edu>
Date: 09/06/2011 02:45 PM
Subject: [Veritas-bu] Dedupe ratios and checkpoint backups
Sent by: veritas-bu-bounces AT mailman.eng.auburn DOT edu






Just wondering if anyone has done testing or seen documentation (from any dedupe vendor) on enabling checkpoints for backups that are being deduplicated.  I would think that introducing checkpoints every X minutes into the data stream would interrupt the continuity of the data and make it seem more unique, negatively affecting dedupe ratios, but I’m wondering by how much.  Most, if not all, of the variable-length guys have the ability to ‘re-align’ themselves to the start of files, so I would think the effect might be more pronounced on large files than on your average server, but I’m just thinking out loud.

 
Has anyone seen a recommendation or actually tested this themselves?




_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu

http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu