ADSM-L

Re: [ADSM-L] TSM dream setup

2008-02-28 16:21:43
Subject: Re: [ADSM-L] TSM dream setup
From: "Hart, Charles A" <charles_hart AT UHC DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Thu, 28 Feb 2008 15:21:00 -0600
Interesting, thank you for sharing.  Our challenge was that the
bottleneck was the Data Domain device using Ethernet (not sure if the
device is capable of trunking multiple GigE ports).  I thought I heard
that Data Domain now offers an FC-based device ... (i.e. FC to disk)

As far as compression ratios go, yes... You will always have data types
that are unique every time and won't de-dupe, and virtual tape of any
flavor is no place for long-term data. (Long term means different things
to different people; for us it's 21 days due to the volume of daily
inbound data.)

If you have the opportunity to get to know the data that won't compress
well, sometimes you'll find that it's already being compressed or
encrypted. If it's encrypted, you probably have to live with it.
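
One rough way to spot data that is already compressed or encrypted is to run a general-purpose compressor over a sample and see whether it shrinks at all. The sketch below is only an illustration: the function names and the 1.1 threshold are assumptions, and zlib stands in for whatever compression a real appliance applies.

```python
import os
import zlib


def compression_ratio(sample: bytes, level: int = 6) -> float:
    """Return original size divided by zlib-compressed size for a sample."""
    if not sample:
        raise ValueError("sample must not be empty")
    return len(sample) / len(zlib.compress(sample, level))


def looks_precompressed(sample: bytes, threshold: float = 1.1) -> bool:
    """Heuristic: a ratio close to 1 suggests compressed or encrypted input."""
    return compression_ratio(sample) < threshold


# Repetitive text stands in for an uncompressed dump; random bytes stand in
# for encrypted or already-compressed data.
text_like = b"INSERT INTO orders VALUES (42, 'widget');\n" * 4096
random_like = os.urandom(len(text_like))
print("text-like looks precompressed:  ", looks_precompressed(text_like))
print("random-like looks precompressed:", looks_precompressed(random_like))
```

A sample that barely shrinks says nothing about whether it will de-dupe against earlier backups, so this is a first-pass check only.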

We learned a lot about data types and how they come at us. For example,
if Oracle RMAN uses a Files Per Set value greater than 1, then the RMAN
streams are intermixed and are different every time, reducing the
dedupability...
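
The effect of intermixed streams can be illustrated with a toy model. Everything in the sketch below is an assumption made for illustration: fixed-size chunk hashing, the block and chunk sizes, and the random interleaving are not taken from RMAN or from any appliance's actual algorithm.

```python
import hashlib
import random

BLOCK = 1024       # bytes per datafile block (assumed for the toy model)
CHUNK = 8 * BLOCK  # fixed dedupe chunk size (assumed; real appliances differ)


def make_datafile(file_id, blocks=256):
    """Build a synthetic datafile as a list of unique, repeatable blocks."""
    return [
        hashlib.sha256(f"{file_id}:{i}".encode()).digest() * (BLOCK // 32)
        for i in range(blocks)
    ]


def one_file_per_set(files):
    """Stream each file on its own, so the byte order is identical every run."""
    return b"".join(b"".join(blocks) for blocks in files)


def multiplexed(files, rng):
    """Interleave blocks from all files in an order that varies per run."""
    cursors = [0] * len(files)
    pending = [i for i, blocks in enumerate(files) if blocks]
    out = []
    while pending:
        i = rng.choice(pending)
        out.append(files[i][cursors[i]])
        cursors[i] += 1
        if cursors[i] == len(files[i]):
            pending.remove(i)
    return b"".join(out)


def new_chunk_fraction(first, second):
    """Fraction of chunks in `second` whose hash was not seen in `first`."""
    def hashes(stream):
        return [hashlib.sha256(stream[o:o + CHUNK]).digest()
                for o in range(0, len(stream), CHUNK)]
    seen = set(hashes(first))
    second_hashes = hashes(second)
    return sum(h not in seen for h in second_hashes) / len(second_hashes)


files = [make_datafile(n) for n in range(4)]
stable = new_chunk_fraction(one_file_per_set(files), one_file_per_set(files))
mixed = new_chunk_fraction(multiplexed(files, random.Random(1)),
                           multiplexed(files, random.Random(2)))
print(f"new chunks on second backup, one file per set: {stable:.0%}")
print(f"new chunks on second backup, multiplexed:      {mixed:.0%}")
```

The datafiles are identical on both runs; only the order in which their blocks reach the backup stream changes, and in this model that alone is enough to make most chunks look new.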

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Ben Bullock
Sent: Tuesday, February 26, 2008 4:40 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] TSM dream setup

Ok, I thought I would reply back here about our experience in
implementing a DataDomain 580 appliance into our TSM environment.

Setup - Easy. Put it on the network and NFS mounted it to our AIX/TSM
server.

TSM config - Easy. Created some "FILE" device classes and pointed them
to the NFS mount points.

Migration of data from tape to DataDomain appliance - Easy. "Move data",
"move nodedata", etc. work great.

Performance - We are getting a consistent 90MB/sec writing to the
device and a little better on reads. This is pretty much the limit of
the 1Gb adapter we are running the data through. That equates to about
8TB of data movement a day, acceptable for our environment. NICs could
be combined for better throughput.
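
The daily figure follows directly from the sustained rate. A minimal sketch of the arithmetic, assuming decimal units (1 TB = 1,000,000 MB) and hypothetical function names:

```python
SECONDS_PER_DAY = 24 * 60 * 60


def tb_per_day(mb_per_sec: float) -> float:
    """Convert a sustained MB/s rate into decimal TB moved in 24 hours."""
    return mb_per_sec * SECONDS_PER_DAY / 1_000_000


def gige_line_rate_mb_per_sec(links: int = 1) -> float:
    """Raw line rate of `links` gigabit links in MB/s, before any overhead."""
    return links * 1_000_000_000 / 8 / 1_000_000


print(f"{tb_per_day(90):.2f} TB/day at a sustained 90 MB/s")
print(f"{gige_line_rate_mb_per_sec():.0f} MB/s raw line rate on one gigabit link")
```

The raw line rate ignores Ethernet, IP, and NFS overhead, which is why a sustained 90MB/sec is close to what a single gigabit link delivers in practice.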

Dedupe/Compression - Here is where the answer from the vendor is always,
"It depends on the data". And indeed it does, but here is what we are
getting:

DB dumps - Full SQL, Sybase, and Exchange server DB dumps. 
       Original Bytes:   49,945,140,504,962  (TSM says there is this much data)
  Globally Compressed:    5,956,953,849,746  (this is how large it is after deduplication)
   Locally Compressed:    2,792,002,425,204  (this is how big it is after lz compression)
        about an 18 to 1 compression ratio.

Filesystems - OS files, document repositories, image scans, windows
fileservers, etc.
       Original Bytes:   27,051,578,287,711  (TSM says there is this much data)
  Globally Compressed:    7,907,366,156,093  (this is how large it is after deduplication)
   Locally Compressed:    4,499,161,648,844  (this is how big it is after lz compression)
        about a 6 to 1 compression ratio.
  
Overall deduplication/compression on our TSM backups: ~ 10 to 1
compression.
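
The quoted ratios can be rechecked from the byte counts above. The sketch below splits each total into its deduplication and lz stages; the function and variable names are illustrative only.

```python
def stage_ratios(original: int, after_dedupe: int, after_lz: int):
    """Return (dedupe ratio, lz ratio, overall ratio) for one data class."""
    return (original / after_dedupe,
            after_dedupe / after_lz,
            original / after_lz)


# Byte counts as reported in the message above.
db_dumps = (49_945_140_504_962, 5_956_953_849_746, 2_792_002_425_204)
filesystems = (27_051_578_287_711, 7_907_366_156_093, 4_499_161_648_844)

for name, figures in (("DB dumps", db_dumps), ("Filesystems", filesystems)):
    dedupe, lz, overall = stage_ratios(*figures)
    print(f"{name:12s} dedupe {dedupe:.1f}:1  lz {lz:.1f}:1  overall {overall:.1f}:1")

# Overall ratio across both classes: total original over total stored.
combined = (db_dumps[0] + filesystems[0]) / (db_dumps[2] + filesystems[2])
print(f"{'Combined':12s} overall {combined:.1f}:1")
```

Note that the combined figure is the ratio of the summed byte counts, not the average of the two per-class ratios.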

It's kinda like night and day between the fileserver and database
compression rates. We have found that some servers' data is very
un-deduplicatable (is that a made-up word, or what?). Here are some
examples:

- A 6TB document repository with TIFF and PDF documents is only getting
about 5 to 1 compression.
- The VMWARE ESXRANGER backups are compressed so we get virtually NO
dedupe when the data goes to the appliance. We are in the process of
re-working this.
- A large application in our environment puts out data files that are
also non-deduplicatable. Who knew. No way to tell until you shovel it to
the appliance and see that it sucked and then shovel it back out to tape
for the time being.

We were well aware that some data isn't really fit for this expensive
appliance, so we are looking into other ways to put that TSM data on
disk and replicate it for DR (perhaps a NAS appliance). 

Overall, we are pleased with the appliance. The ability to replace a
whole tape library with a 6U appliance frees up a lot of computer room
space. And using 1/10th of the power to keep disks spinning (we are
fitting about 100TB of data onto a 10TB DataDomain), feels very "green"
and saves money in HVAC and power. 

Oh ya, and restores are almost instantaneous for individual files, and I
can restore whole filesystems now in a reasonable amount of time. YMMV
of course, it still depends on the number and size of the files. But it
is even faster than before, when we were using collocated tape pools on
LTO2.

Neat new technology 

Ben


-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Paul Zarnowski
Sent: Friday, February 15, 2008 6:54 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] TSM dream setup

>About deduplication, Mark Stapleton said:
>
> > It's highly overrated with TSM, since TSM doesn't do absolute (full)
> > backups unless such are forced.
>
>At 12:04 AM 2/15/2008, Curtis Preston wrote:
>Depending on your mix of databases and other application backup data, 
>you can actually get quite a bit of commonality in a TSM datastore.

I've been thinking a lot about dedup in a TSM environment.  While it's
true that TSM has progressive-incremental and no full backups, in our
environment anyway, we have hundreds or thousands of systems with lots
of common files across them.  We have hundreds of desktop systems that
have a lot of common OS and application files.  We have local e-mail
stores that have a lot of common attachments.

While it may be true that overall, you will see less duplication in a
TSM environment than with other backup applications, with TSM you also
have the ability to associate different management classes with
different files, and thereby target different files to different storage
pools.  Wouldn't it be great if we could target only the
files/directories that we *know* have a high likelihood of duplication
to a storage pool that has deduplication capability?  You can actually
do this with TSM.  I'd like to see an option in TSM that can target
files/directories to different back-end storage pools that is
independent of the "management class" concept, which also affects
versions & retentions and other management attributes.


..Paul



--
Paul Zarnowski                            Ph: 607-255-4757
Manager, Storage Services                 Fx: 607-255-8521
719 Rhodes Hall, Ithaca, NY 14853-3801    Em: psz1 AT cornell DOT edu
