Re: [ADSM-L] Data Deduplication
I thought we DID address that in one of the posts. (Maybe I'm getting
things confused with another thread I'm having on the same topic.)
A properly designed de-duplication backup system should restore the data
at the same speed as, if not faster than, the backup, and the tests that
I've done with a few of them have all worked this way. I believe it's
something you should test, but it appears that the designers thought of
this natural objection and designed around it.
I believe the reason is that restoring 100 random pieces to create a
single file means you get to read from a bunch of spindles at once.
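
To make the "100 random pieces" point concrete, here is a minimal Python
sketch of a restore that rebuilds a file from de-duplicated chunks. It
assumes a content-addressed chunk store and a per-file "recipe" of chunk
hashes. The names (backup, restore, CHUNK_SIZE) and the in-memory dict
standing in for the disks are hypothetical and not taken from any product.
Because the store is just a dict, this shows only the logic of issuing the
chunk reads concurrently, not real spindle parallelism.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4  # tiny on purpose, for illustration only (assumption)


def backup(data, store):
    """Split data into fixed-size chunks, keep each unique chunk once,
    and return the ordered list of chunk hashes (the file's recipe)."""
    recipe = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # duplicates are not stored again
        recipe.append(digest)
    return recipe


def restore(recipe, store, workers=8):
    """Fetch every chunk named in the recipe concurrently and join them."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in recipe order, even if the
        # individual reads complete out of order.
        chunks = list(pool.map(store.__getitem__, recipe))
    return b"".join(chunks)


if __name__ == "__main__":
    store = {}
    original = b"abcdabcdabcdxyz"
    recipe = backup(original, store)
    print(len(recipe), "chunks in recipe,", len(store), "stored")
    print(restore(recipe, store) == original)
```

In a real system each lookup would be a disk read, and the more spindles
the chunks are spread across, the more of those reads can proceed at the
same time.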
I will say that there are speed differences between the de-dupe
appliances (VTLs) and de-dupe backup software. De-dupe backup software
still restores fast enough for what it was designed for. (You should be
able to fill a GbE pipe with such a restore.) But they're not going to
restore at the 100s of MB/s that you can get out of one of the
W. Curtis Preston
Backup Blog @ www.backupcentral.com
VP Data Protection, GlassHouse Technologies
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Sent: Friday, August 31, 2007 3:13 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] Data Deduplication
On Aug 31, 2007, at 4:33 PM, Dave Mussulman wrote:
> ... Avamar said their software got
> 10-20% reduction on a backup of a stock Windows XP installation. A
> single system, say it's the first one you added to your backup group.
> That's not two users with the same email attachments saved, or
> files across two systems - that's hashing files in the OS (I presume
> from headers in DLLs and such.) ...
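
As a rough illustration of how a single system can de-duplicate against
itself, here is a small Python sketch that hashes fixed-size chunks across
a set of files and reports the fraction of bytes that would not need to be
stored. Fixed-size chunking, the 4096-byte chunk size, and the function
name dedup_reduction are assumptions made for the example. This is not a
description of Avamar's actual algorithm.

```python
import hashlib


def dedup_reduction(blobs, chunk_size=4096):
    """Return the fraction of bytes saved by storing each distinct
    chunk only once across all the given byte strings."""
    total = 0
    unique = {}  # chunk hash -> chunk length
    for blob in blobs:
        for i in range(0, len(blob), chunk_size):
            chunk = blob[i:i + chunk_size]
            total += len(chunk)
            unique.setdefault(hashlib.sha256(chunk).digest(), len(chunk))
    if total == 0:
        return 0.0
    return 1.0 - sum(unique.values()) / total


if __name__ == "__main__":
    # Two made-up "files" that share an identical 4096-byte header.
    header = b"H" * 4096
    files = [header + b"A" * 4096, header + b"B" * 4096]
    print(f"{dedup_reduction(files):.0%} reduction")
```

The shared header is stored once instead of twice, so one of the four
chunks is saved, with no second user or second system involved.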
I'm mildly amused that in all these postings on the subject, none has
addressed the corollary of the backups: restoral. There are likely
some implications in the restoral of files backed up this way,
perhaps most particularly in system files; and restoral performance
is also something one would wonder about. And there may be
situations where such a backup/restore regimen is to be avoided,
because of issues. Perhaps those with experience in this area would
post what they've found.
Richard Sims, at Boston University