“Unnecessary copies” reminded me to say that it’s important to plan for restore time as well as backup time. While it may be “redundant” to have all your files in every backup, it sure saves a lot of time when you only have to restore the LAST backup.
That is to say, theoretically you could eliminate all sorts of redundancy by simply doing a full backup when you first install and then doing Cumulative Incrementals from that day forward. Two years later, when you finally crashed, you could restore the original full and all the intervening Cumulative Incrementals to get back to where you were, but you’d likely have gotten there just as fast by reinstalling and rekeying all the missing data.
Most people strike the balance by doing Fulls once a week and either Incrementals or Cumulative Incrementals the rest of the week. Sometimes the size and type of data being backed up makes a difference as well. For example, we do daily backups (using BCV copies) of our 2+ TB Production Database.
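Purely as a back-of-the-envelope illustration of the restore-time point (nothing NetBackup-specific; the schedule assumptions, function name, and numbers below are mine), here's a throwaway Python sketch of how many images a restore has to read:

# Back-of-the-envelope only, nothing NetBackup-specific.

def restore_chain_length(days_since_last_full, incremental_type):
    """Images a restore must read: the full plus the incrementals on top of it."""
    if incremental_type == "differential":   # each covers changes since the previous backup
        return 1 + days_since_last_full
    if incremental_type == "cumulative":     # each covers all changes since the last full
        return 1 + (1 if days_since_last_full else 0)
    raise ValueError(incremental_type)

# Weekly fulls keep the differential chain short...
print(restore_chain_length(6, "differential"))    # 7 images at worst
# ...a single full two years ago does not.
print(restore_chain_length(730, "differential"))  # 731 images
print(restore_chain_length(730, "cumulative"))    # 2 images, but that one incremental is nearly a full

Either way you pay: lots of small images to mount, or one incremental that has grown to nearly the size of a full.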
From: veritas-bu-bounces AT mailman.eng.auburn DOT edu [mailto:veritas-bu-bounces AT mailman.eng.auburn DOT edu] On Behalf Of Michaels, Keith R
Sent: Tuesday, April 08, 2008 11:06 PM
To: veritas-bu AT mailman.eng.auburn DOT edu
Subject: Re: [Veritas-bu] Measuring redundant backup data
It should be possible to go through the
catalog and determine how much redundancy is present based on the schedules and
retentions. For example if the schedule calls for monthly fulls and the same
file appears in 12 consecutive full backups (without appearing in any
intervening incremental) then that's 10 unnecessary copies, assuming 2 are
needed for adequate protection. I know there's additional duplication if
the same file exists on two clients but that's harder to measure without
comparing the data. I'm just interested in measuring the unnecessary
copies that were created as a result of multiple backups of the same data.
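As a rough sketch of that counting rule (not a finished tool), something like the Python below would do it. It assumes each image's file list has already been dumped out of the catalog into plain text, one path per line, and is fed in chronological order tagged as full or incremental; how you extract those lists from the catalog is left out, and all the names here are my own, not any existing utility:

from collections import defaultdict

NEEDED_COPIES = 2   # copies considered adequate protection

def unnecessary_copies(images):
    """images: iterable of (kind, paths) pairs, oldest first, where kind is
    "full" or "incr" and paths is that image's file list. Returns the number
    of copies beyond NEEDED_COPIES created by fulls re-saving unchanged files."""
    run = defaultdict(int)   # path -> consecutive fulls holding an unchanged copy
    extra = 0
    for kind, paths in images:
        if kind == "full":
            for p in paths:
                run[p] += 1
        else:
            # The file changed, so the next full's copy is genuinely new:
            # close out this path's run and start counting again.
            for p in paths:
                extra += max(0, run.pop(p, 0) - NEEDED_COPIES)
    for count in run.values():   # close out runs still open at the end
        extra += max(0, count - NEEDED_COPIES)
    return extra

# Made-up example: one file unchanged across 12 consecutive monthly fulls.
print(unnecessary_copies([("full", ["/etc/hosts"])] * 12))   # -> 10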
From: Ed Wilts [mailto:ewilts AT ewilts DOT org]
Sent: Tuesday, April 08, 2008 7:05 PM
To: Jeff Lightner
Cc: Michaels, Keith R; veritas-bu AT mailman.eng.auburn DOT edu
Subject: Re: [Veritas-bu] Measuring redundant backup data
On Tue, Apr 8, 2008 at 5:57 PM, Jeff Lightner <jlightner AT water DOT com> wrote:
I don't know a way to measure how much is "redundant" easily. Maybe the much-vaunted Aptare would have that - I'll wait for their fan club to comment on that. :-)
Not a chance - Aptare just gets job status and doesn't ever see the
backup data.
So far it appears to us the deduplication devices are living up to or
exceeding expectations.
That's purely site-specific. With PureDisk backing up our remote sites, I think we're under 5:1, but we're still building up the generation count. When we pointed some of our larger main campus data at it, it wasn't even that high - nowhere near high enough to justify the cost.
Some vendors will let you eval a unit - that's the only way to know how well you're going to dedupe because it is so client-specific. If you have a ton of application servers with mostly OS and little application, you're going to de-dupe extremely well. If you have one file server full of TIFF data that never stays around very long, you won't de-dupe well at all.
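If you just want a rough feel for your own data before borrowing an eval unit, even something as crude as hashing fixed-size chunks gives you a floor. This is only a sketch of mine - real appliances use smarter variable-length chunking and should do better, and the chunk size and function name are arbitrary:

import hashlib, os, sys

CHUNK = 128 * 1024   # fixed 128 KB chunks, an arbitrary choice

def rough_dedupe_ratio(root):
    """Walk a tree, hash fixed-size chunks, and compare total vs unique chunks."""
    total = 0
    seen = set()
    for dirpath, _, names in os.walk(root):
        for name in names:
            try:
                with open(os.path.join(dirpath, name), "rb") as f:
                    while True:
                        block = f.read(CHUNK)
                        if not block:
                            break
                        total += 1
                        seen.add(hashlib.sha1(block).digest())
            except OSError:
                pass   # skip files we can't read
    return (total / float(len(seen))) if seen else 0.0

if __name__ == "__main__":
    print("approx dedupe ratio: %.1f:1" % rough_dedupe_ratio(sys.argv[1]))

Point it at the data you'd actually send to the appliance; the number is only meaningful for your own mix of clients.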
--
Ed Wilts, Mounds View, MN, USA
mailto:ewilts AT ewilts DOT org