I have always understood that checkpoints are saved in a
log on the client and, therefore, wouldn’t affect dedupe
ratios at all. I have never verified that, though, and a
quick search didn’t turn up anything.
Haven’t done any comparisons, but we use checkpoints for our
big ERP DB backup (nearly 6 TB) and still get good
compression ratios. Nothing has made me think I need to look
at it or tweak it to do better. I’d say the benefit of not
having to restart a huge backup from scratch outweighs any
deduplication ratio issues, unless you have infinite time to
run backups.
To answer your question, we tested both the NBU 5000
(Symantec) and the DD860 (Data Domain), and the differences
couldn't be more stark. I was using a 30-minute checkpoint
interval. I never achieved anything better than 15:1 from
the NBU5000 (which isn't bad). The DD860 had hit 39:1 by the
time I disabled all of the policies. This was running daily
full backups on an array of different servers (DB2, Windows
file servers, UNIX file servers, and Siebel app servers)
over the course of 2 months.
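To put those two ratios in perspective, here is a quick
back-of-the-envelope sketch. It only converts an N:1 dedupe
ratio into space saved and physical storage used; the 100 TB
logical figure and the function names are made up for
illustration and are not from the test above.

```python
def space_saved(ratio: float) -> float:
    """Fraction of logical data that never hits disk at ratio:1."""
    return 1.0 - 1.0 / ratio


def stored_tb(logical_tb: float, ratio: float) -> float:
    """Physical TB consumed for logical_tb of backups at ratio:1."""
    return logical_tb / ratio


# Ratios reported in this thread; 100 TB is a hypothetical workload.
for name, ratio in (("NBU5000", 15.0), ("DD860", 39.0)):
    saved = space_saved(ratio)
    used = stored_tb(100.0, ratio)
    print(f"{name}: {ratio:.0f}:1 -> {saved:.1%} saved, {used:.2f} TB stored per 100 TB")
```

In percentage terms the gap looks small (roughly 93% vs 97%
saved), but the physical footprint at 15:1 is about 2.6 times
the footprint at 39:1.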
The NBU5000 is a fixed-block device, the DD860 a
variable-block device. There is no way of knowing whether
checkpoints were the culprit for the NBU5000's lower ratio,
but it is a plausible theory.
If it has any effect at all, it would probably hit
fixed-block solutions harder than variable-block ones.
Variable-block solutions would keep finding identical
blocks even when they sit at different positions in the
data stream.
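A toy sketch of that reasoning, for anyone who wants to play
with it. Big assumptions: it pretends a checkpoint shows up
as a few extra bytes inside the data stream (nobody in this
thread has confirmed that), the marker string, chunk sizes,
and function names are all invented, and the chunkers are
simple stand-ins, not what the NBU5000 or DD860 actually do.

```python
import hashlib
import random
import zlib


def fixed_chunks(data: bytes, size: int = 64) -> list:
    """Cut the stream every `size` bytes, regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def variable_chunks(data: bytes, window: int = 16, mask: int = 0x3F) -> list:
    """Content-defined chunking: cut wherever a checksum of the
    previous `window` bytes has its low bits all zero."""
    chunks, start = [], 0
    for i in range(window, len(data) + 1):
        if zlib.crc32(data[i - window:i]) & mask == 0:
            chunks.append(data[start:i])
            start = i
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def hit_rate(base_chunks: list, new_chunks: list) -> float:
    """Fraction of new chunks whose fingerprint was already stored."""
    seen = {hashlib.sha256(c).digest() for c in base_chunks}
    hits = sum(hashlib.sha256(c).digest() in seen for c in new_chunks)
    return hits / len(new_chunks)


rng = random.Random(0)
base = bytes(rng.getrandbits(8) for _ in range(20000))

# Hypothetical: the same backup again, with a 12-byte marker
# dropped into the stream every 5000 bytes.
marker = b"<CHECKPOINT>"
marked = marker.join(base[i:i + 5000] for i in range(0, len(base), 5000))

fixed_hit = hit_rate(fixed_chunks(base), fixed_chunks(marked))
variable_hit = hit_rate(variable_chunks(base), variable_chunks(marked))
print(f"fixed-block hit rate:    {fixed_hit:.0%}")
print(f"variable-block hit rate: {variable_hit:.0%}")
```

In this toy setup the fixed-size chunker stops matching after
the first marker because every later block boundary is
shifted, while the content-defined chunker only loses the
chunks right around each marker. Whether real checkpoints
shift the stream like this is exactly the open question.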
Just wondering if anyone has done testing or seen
documentation (from any dedupe vendor) on enabling
checkpoints for backups that are being deduplicated. I
would think that introducing checkpoints every X minutes
into the data stream would interrupt the continuity of the
data and make it look more unique, thus hurting dedupe
ratios, but I’m wondering by how much. Most, if not all, of
the variable-length guys can ‘re-align’ themselves to the
start of files, so I would think the effect might be more
pronounced on large files than on your average server, but
I’m just thinking out loud.
Anyone seen a recommendation or actually tested it themselves?
_______________________________________________________
Barclays
www.barclaycardus.com
_______________________________________________________
Veritas-bu maillist - Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu