Some vendors de-duplicate using a sliding window over the data stream, and
that approach can also be adversely affected by multiplexing. For fixed-block
de-duplication, what the vendors have told me is that mixing streams of data
with multiplexing can decrease the de-dupe ratio. Say, for instance, you are
backing up 2 large databases and you run a full every day. If you back up the
databases as separate streams, you will get a great de-dupe ratio.
See the simple "diagram" below; each |--------| is a block.
|--------||--------||--------||--------||--------||--------|
111111111111111111111111111111111 (Database 1)
|--------||--------||--------||--------||--------||--------|
222222222222222222222222222222222 (Database 2)
Now, if you take those 2 databases and multiplex them together, like so:
|--------||--------||--------||--------||--------||--------|
122122221111212122112212212121121
My blocks can be different now, and I might not get the same de-dupe ratio.
Granted, this is a very simple representation, and I'm sure many people can
poke many holes in it, but the information I've gotten from the de-dupe
vendors is that multiplexing can change the way the de-dupe engine sees the
blocks and cause this kind of inability to de-dupe.
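
The effect described above can be tried out with a toy model. This is only a sketch under stated assumptions, not how any vendor's engine works: the 8-byte block size, the 1 to 5 byte interleave pieces, the random "database" contents, and the names block_hashes, multiplex and fake_database are all made up for illustration. It backs up the same two databases on two days, once as separate streams and once multiplexed with a different interleaving each day, and counts how many fixed-size blocks on day 2 were not already seen on day 1.

```python
import hashlib
import random

BLOCK = 8  # toy de-dupe block size in bytes (assumption for illustration)


def block_hashes(stream, size=BLOCK):
    """Return the set of hashes of the stream's fixed-size blocks."""
    return {hashlib.sha256(stream[i:i + size]).hexdigest()
            for i in range(0, len(stream), size)}


def multiplex(a, b, seed):
    """Interleave a and b in randomly sized pieces, keeping each stream's own order."""
    rng = random.Random(seed)
    out = bytearray()
    i = j = 0
    while i < len(a) or j < len(b):
        n = rng.randint(1, 5)  # piece size varies from run to run
        take_a = j >= len(b) or (i < len(a) and rng.random() < 0.5)
        if take_a:
            out += a[i:i + n]
            i += n
        else:
            out += b[j:j + n]
            j += n
    return bytes(out)


def fake_database(seed, length=4096):
    """Reproducible pseudo-random bytes standing in for a database dump."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(length))


db1 = fake_database(1)
db2 = fake_database(2)

# Separate streams: the second day's full backup produces no new blocks.
day1_separate = block_hashes(db1) | block_hashes(db2)
day2_separate = block_hashes(db1) | block_hashes(db2)
new_separate = day2_separate - day1_separate

# Multiplexed: same data, but the interleaving differs between the two days.
day1_mux = block_hashes(multiplex(db1, db2, seed=10))
day2_mux = block_hashes(multiplex(db1, db2, seed=20))
new_mux = day2_mux - day1_mux

print("new blocks, separate streams:", len(new_separate))
print("new blocks, multiplexed:", len(new_mux), "of", len(day2_mux))
```

The interleave pieces here are deliberately smaller than the block size, which is the worst case for a fixed-block engine. How close a real multiplexed stream comes to that depends on the backup software and the de-dupe product.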
-Trey
On Wed, Apr 30, 2008 at 6:36 PM, Mike Sparkes <Mike.Sparkes AT quantum DOT com> wrote:
Multiplexing mixes streams of data from multiple sources into one stream to
the storage device. A de-duplication product on that storage device will be
breaking up the stream into blocks and looking for duplicate blocks. Let us
assume that three backups are being multiplexed and that no data changed since
the previous backup. It is unlikely that the data will mix together again at
the same rate in the same ratios to create the same blocks and so they will be
treated as unique and stored in full.
I disagree. Multiplexing doesn't mix up the blocks coming
from each server. You may see things like block 1 and 2 from server 1
followed by blocks 1, 2, and 3 from server 2, but if block 2 from server 1 is
the same as block 3 from server 2, it will de-dupe. It doesn't matter what
the speed is today versus yesterday - you're not de-duping the tape but you're
de-duping the blocks.
De-duplication typically is not done by files - it's
done by blocks, and that isn't changing with multiplexing.
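
The argument above holds in a toy model when the multiplexer hands over whole blocks that line up with the de-dupe engine's block boundaries. The sketch below assumes exactly that, and ignores any headers the backup software may add to the stream; the 8-byte block size, the random data, and the names block_hashes, multiplex_aligned and fake_server_data are hypothetical. Under those assumptions the order in which the servers' blocks arrive changes from day to day, but the set of blocks does not.

```python
import hashlib
import random

BLOCK = 8  # toy block size, shared by multiplexer and de-dupe engine (assumption)


def block_hashes(stream, size=BLOCK):
    """Return the set of hashes of the stream's fixed-size blocks."""
    return {hashlib.sha256(stream[i:i + size]).hexdigest()
            for i in range(0, len(stream), size)}


def multiplex_aligned(a, b, seed):
    """Interleave whole BLOCK-sized pieces of a and b in a random order."""
    rng = random.Random(seed)
    out = bytearray()
    i = j = 0
    while i < len(a) or j < len(b):
        take_a = j >= len(b) or (i < len(a) and rng.random() < 0.5)
        if take_a:
            out += a[i:i + BLOCK]
            i += BLOCK
        else:
            out += b[j:j + BLOCK]
            j += BLOCK
    return bytes(out)


def fake_server_data(seed, length=4096):
    """Reproducible pseudo-random bytes; length is a multiple of BLOCK."""
    rng = random.Random(seed)
    return bytes(rng.getrandbits(8) for _ in range(length))


server1 = fake_server_data(1)
server2 = fake_server_data(2)

# Two days, same data, different arrival order of the blocks.
day1 = block_hashes(multiplex_aligned(server1, server2, seed=10))
day2 = block_hashes(multiplex_aligned(server1, server2, seed=20))
new_on_day2 = day2 - day1

print("new blocks on day 2:", len(new_on_day2))
```

Whether a real multiplexed stream keeps that alignment is the open question in this thread; if the pieces do not line up with the engine's blocks, the result can look very different.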
.../Ed
From: Ed Wilts [mailto:ewilts AT ewilts DOT org]
Sent: Wednesday, April 30, 2008 4:12 PM
To: Mike Sparkes
Subject: Re: [Veritas-bu] Multiplexing on VTLs
On Wed, Apr 30, 2008 at 5:58 PM, Mike Sparkes <Mike.Sparkes AT quantum DOT com> wrote:
if you ever move to de-duplication, the act of multiplexing your backups ruins
the ability to detect duplicate blocks. Your de-dupe ratio will be terrible.
I don't follow your logic here. Why would multiplexing affect the de-dupe
ratio?
--
Ed Wilts, Mounds View, MN, USA RHCE, BCFP, BCSD, SCSP mailto:ewilts AT ewilts DOT org
If I've helped you,
please make a donation to my favorite charity at http://firstgiving.com/edwilts
_______________________________________________
Veritas-bu maillist - Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu