Re: [ADSM-L] Data Deduplication

2007-09-04 10:51:30
Subject: Re: [ADSM-L] Data Deduplication
From: Paul Zarnowski <psz1 AT CORNELL DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Tue, 4 Sep 2007 10:48:21 -0400
Hi Wanda,

I'm thinking that deduplication might be especially useful for all
those copies of Windows System Objects that are backed up
periodically, for sites that have large numbers of Windows client
nodes.  TSM/Windows is unable to back them up incrementally, which
means each backup of a System Object is another copy.  If you keep
the default 3 copies, and have 500 systems, that's 1500
copies.  Granted, not backing up the files that haven't changed in
the first place would be best, but that doesn't seem to be an option
with Windows and TSM.  I don't know about front-end dedup such as Avamar.

With Vista, we see the System Object climbing to 7-8GB per copy.  In
the above scenario (1500 copies), that would be roughly 10.5-12TB
without deduplication.  Of course, if you've chosen not to back up
System Objects, then this won't be a factor for you.
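
For anyone who wants to plug in their own numbers, here is a rough
sketch of that arithmetic.  The function name is made up for
illustration, and it assumes decimal units (1 TB = 1000 GB) and that
every node keeps the same number of full copies; it is not anything
TSM provides.

```python
def system_object_footprint_tb(nodes, copies_per_node, gb_per_copy):
    """Estimate total stored System Object data in TB.

    Assumes each backup is a full copy (no incremental) and
    decimal units, i.e. 1 TB = 1000 GB.
    """
    total_copies = nodes * copies_per_node
    return total_copies * gb_per_copy / 1000


# 500 Windows nodes, default 3 copies each, 7-8 GB per copy
low = system_object_footprint_tb(500, 3, 7)
high = system_object_footprint_tb(500, 3, 8)
print(f"{low:.1f} TB to {high:.1f} TB")
```

Swap in your own node count, retention and per-copy size to see
whether the total is large enough to justify a separate dedup pool.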

It would be easy to target just these System Object files to a
different storage pool on a dedup VTL.  The reduction for these files
should be substantial, I would think.

I would agree with your statement that you have to "know your data",
and think about this some.  I'm not convinced that throwing a dedup
VTL behind TSM for *all* of your data makes financial sense with
TSM.  I'm on the fence about this, until I see some hard
numbers.  But I do think there are some good opportunities for
putting a smaller dedup VTL behind TSM for *some* of your data, if
you know which data will dedup well, and if you have enough of it to
make financial sense in your shop.

..Paul


At 08:55 AM 9/1/2007, Wanda Prather wrote:
"It depends".

Just another thing to think about:

Yes, it sounds cool to reduce the footprint of all those XP files if you
have hundreds of XP systems.

But, at a site where we were backing up about 200 desktops along with
Windoze servers, I sat down and actually spent a bunch of time looking at
what was really getting backed up (there's no quick and easy way to get
this info out of TSM.)

Those OS files, while annoying, are read-only (translation, only 1 copy
per client) and are actually a very small part of today's very large hard
drives.  At that particular site where I did the study, I calculated that
the OS files from 200 Windows systems made up less than 10% of the total
data stored in TSM.

Result:  Not the place to spend $ or effort in reducing backup footprint.

That's not to say that de-dup won't save you bunches of space somewhere
else; just that you gotta KNOW YOUR DATA to figure out what is worth
doing.

YMWV..


--
Paul Zarnowski                            Ph: 607-255-4757
Manager, Storage Services                 Fx: 607-255-8521
719 Rhodes Hall, Ithaca, NY 14853-3801    Em: psz1 AT cornell DOT edu
