ADSM-L

Re: [ADSM-L] How to Incorporate a CDL into TSM environment?

2007-06-08 17:35:27
Subject: Re: [ADSM-L] How to Incorporate a CDL into TSM environment?
From: Curtis Preston <cpreston AT GLASSHOUSE DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Fri, 8 Jun 2007 17:34:03 -0400
Wanda,

We appear to have very different sets of data, because what I've seen in
the VTL and backup world is very different than what you're describing.
With respect to what you've seen, let me describe what I've seen.

VTLs are using the same type of disks that any other ATA-based storage
arrays are using.  If you're saying they're not using Fibre Channel - of
course they're not.  But do you need Fibre Channel drives to receive
10-20 streams of data?  I don't think so.  Fibre Channel drives are
perfect for random I/O, which means they're probably perfect for a TSM
disk storage pool, as it may be receiving hundreds of simultaneous
backup jobs, which is very different than receiving 10-20-50 streams
that were designed to go to tape.  ATA drives do very well in streaming
operations -- exactly what backups going to tape generate.  In fact,
I've seen tests where ATA drives actually outperform Fibre Channel
drives in streaming ops.  

Yes, tape is still cheaper, but if you compare the price of a large VTL
with de-dupe to an equivalently sized tape library, they'll be a lot
closer than you think -- and the power and cooling will surprise you
too.  I've seen many scenarios where the VTL's power and cooling needs
were less than the tape library.

The people that I'm working with are not buying VTLs because they're
cheaper; in many cases they're more expensive than an ATA-based array of
equivalent capacity.  They're buying a VTL for the things VTLs bring to
the table, and those things come into play much more when we are talking
about large environments.

The first is ease of management.  It's one thing to buy a 20 TB array
and put that behind a single TSM server. It's another to buy that 20 TB
array and split it up into properly sized partitions for several TSM
servers.  Provisioning is just as big of a pain in the TSM world as it
is in the online world, and VTLs remove that problem.  They use thin
and/or over-provisioning where each server only consumes the amount of
storage it sends to the VTL, not the amount you said it could have.

The second thing VTLs bring to the table is hardware compression.
Notice I said hardware compression.  I'm not a fan of VTL software
compression. So I get to buy a 20 TB VTL that's a little more expensive
than a 20 TB ATA disk array, but what I get is 30-40 TB VTL, depending
on my compression ratio -- and I don't lose performance.

Then, of course, there's de-dupe, which most surveys are showing to be
the got-to-have technology of this year.  It's here.  It's real.  And it
really does shrink the amount of disk you need to use by a factor of
10-20:1, and even more depending on how you do your backups.

VTLs do indeed increase the speed of almost anyone's backups.  It's not
that disk/VTL is technically faster than tape.  IMO, tapes are now much
faster than disk.  The reason that VTL/disk can outperform tape is that
disk can go whatever speed your backup is going and tape cannot.  If I
send an 80 MB/s tape a 10 MB/s backup, it will shoe-shine and actually
write 5 MB/s.  A VTL would write 10 MB/s.
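
The arithmetic in this example can be sketched as a toy model in Python.
This is a rough illustration, not a description of how any particular
drive behaves.  The function names, the min_stream_mb_s threshold
(40 MB/s here), the disk ceiling (200 MB/s), and the 0.5 penalty are all
assumptions; the penalty is chosen only to reproduce the "10 MB/s in,
5 MB/s written" figure above.

```python
def tape_write_rate(feed_mb_s, rated_mb_s, min_stream_mb_s, shoeshine_factor=0.5):
    """Rough effective write rate (MB/s) of a tape drive fed at feed_mb_s.

    min_stream_mb_s and shoeshine_factor are illustrative assumptions,
    not vendor figures.
    """
    if feed_mb_s >= min_stream_mb_s:
        # The drive can keep streaming: limited by the slower of feed and rated speed.
        return min(feed_mb_s, rated_mb_s)
    # Below the streaming threshold the drive stops and repositions
    # (shoe-shining), so part of the incoming rate is lost.
    return feed_mb_s * shoeshine_factor


def disk_write_rate(feed_mb_s, max_mb_s):
    """Disk (or a VTL) simply writes at whatever rate the backup arrives."""
    return min(feed_mb_s, max_mb_s)


print(tape_write_rate(10, 80, min_stream_mb_s=40))  # 5.0
print(disk_write_rate(10, max_mb_s=200))            # 10
```

The point of the model is only that the tape result depends on the feed
rate relative to the drive, while the disk result does not.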

Most environments never get anywhere near their tape's capabilities and
about half or so are getting a small fraction of their tape drive's
capabilities.  VTL and disk bring THIS to the table.  And yes, what ends
up happening in almost every environment I've put a VTL in is that
backups go faster.  (There are some bad VTLs that actually slow down
backups.)

Finally, to one of the original questions of "why would anyone use a VTL
in a TSM environment," I say the following.  Most VTL companies are
telling me that a significant proportion of their customers are TSM
customers.  Why is that?  First, TSM customers have the same reasons
that other products' customers have for going to VTL.  Second, TSM
customers can benefit from a significantly higher number of tape drives
being available -- without having to pay for those tape drives.
Reclamation goes much faster with two virtual drives than two physical
drives, and you can throw as many drives at reclamation as you wish.

Just my $.02.

---
W. Curtis Preston


-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Prather, Wanda
Sent: Friday, June 08, 2007 11:25 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] How to Incorporate a CDL into TSM environment?

You go John!
(And a BIG ditto on the compression rate issue - I've NEVER had a
customer that got 3:1 over the whole TSM environment.)
 
And let's step back a minute for a sanity check and ask, what IS a VTL
anyway?
It's disk with some cache and software in front.
 
So if you need to back up 20 TB of disk, why not do as Kelly says and
just buy another 20 TB of disk?
 
Answer:  
In most cases, people buy a 20TB VTL because it's cheaper than adding
another 20 TB in their disk array of choice.
 
Why do you think that is?  Is it because the vendors are really nice
guys?
 
Well, they may be really nice guys, but it's not because they want to
give disk away.
It's because the VTLs are built of A LESS EXPENSIVE KIND OF DISK.
The cheaper disk is slower.
 
Using cheaper disk, the VTL vendors have made it practical and cost
effective to eliminate tape backups, FOR SOME CUSTOMERS.
 
When people say they can back up or restore with a VTL faster than tape,
it may mean 
   1) they are replacing slow tape drives
   2) they are eliminating tape mount times
   3) they no longer have to wait for a tape drive
 
It doesn't mean there aren't cases where tape is faster.
 
There are cases where a VTL really rocks.  My favorite is using a VTL
for OFFSITE storage and backing up to it directly over fibre.  In case
of a major problem, you aren't limited in the number of tape drives you
have available for restore (you ARE still limited by the size of your
fibre pipe).  You don't have to physically move tapes around, and the
media never leaves your control  (If I never spend another minute doing
a manual audit looking for misplaced tapes...etc.).  And you don't have
to collocate in a VTL, since there is zero effective tape mount time.
And it is a good solution for people who want to do more LAN-free
backups, and are short of tape drives.  
 
But you should be buying a VTL for one of THOSE reasons, not for raw
speed.
You can always create a scenario where you get down to the actual device
speed of the underlying technology and hit that bottleneck.  Many people
never run into that scenario.  But some do.
 
Also, FWIW, tape is still cheaper per MB of storage than a VTL.  There
are price points where they are comparable, or where the benefits of a
VTL outweigh the cost differential.  But in general, the larger your
site in terms of TB to store, the more difference you will see in cost
if you go with a VTL vs. tape, with tape still being lower.  
 
You gotta first know what you are trying to do, THEN figure out where
your bottlenecks are, THEN figure out what technology matches your need
and fits your budget.
 
Wanda (I think I'm done for the day now and I'm sure glad it's Friday)
Prather
 
 

________________________________

From: ADSM: Dist Stor Manager on behalf of Schneider, John
Sent: Fri 6/8/2007 1:00 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: How to Incorporate a CDL into TSM environment?



Greetings,
        A lot of the chatter about VTL's being good or bad seems to stem
from which vendors you listen to, and what they are trying to sell you.
There are a lot of dogmatic statements made by people on both sides of
this issue, usually by people with no personal experience about what
they are talking about.  Somebody has fed them a sales line and they
dutifully parrot it back.
        EMC sold their CDL product for about two years before IBM
entered the market.  During that time you would not believe how many
times I heard IBM pooh-pooh the CDL saying it wasn't a good fit for TSM,
didn't perform well, whatever they had to say to compete against it.  I
even heard someone recently say it was against the law to use a CDL if
you used the IBM drivers to talk to it. Against what law exactly?
        Then after two years IBM came out with their VTL, the TS7510, and
almost immediately came out with a Redbook about it with a TSM chapter
explaining why the TS7510 was such a good fit for TSM!  Huh? And not
because it was a better product than the EMC one; it was actually
slightly slower and only scaled to about a fourth the size of the
largest EMC VTL.  The only difference is that now IBM had something in
the marketplace, and that changed everything.

        As Wanda has said, a lot of the distinctions fall down to how
you use the VTL, and if your expectations are set correctly.  It is easy
for a vendor presentation to promise the moon without qualifying its
claims.  A single-engine DL4100 from EMC can sustain a 1100MB/sec (3.7
TB per hour) write speed like they claim IF:

1) You are writing multiple simultaneous virtual tape streams (like 16
or more),
2) You balance the I/O across at least 4 FC streams coming into the VTL
engine,
3) You have at least 5 disk drawers to spread out the I/O load.
4) You are not compressing at the VTL engine.  If you compress at the
VTL engine, your performance will drop off, perhaps to as little as a
third of that speed.  This is because the compression is done in software.  If you want
hardware compression, go with one of the DL6000 series that has an
optional hardware compression engine.

But the presentations only say 1100MB/sec performance, and so customers
install one, set up a single backup to a single virtual tape drive, and
when it pegs at ~100MB/sec they think they have been lied to.
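
The gap between the headline number and a single-stream result is mostly
arithmetic.  A small Python sketch, using only the figures quoted above;
the function names are made up for illustration, and the even split
across streams is a simplifying assumption.

```python
def mb_s_to_tb_per_hour(mb_per_sec, binary=False):
    """Convert MB/s to TB/hour, in decimal (10^6 MB per TB) or binary (2^20) units."""
    mb_per_hour = mb_per_sec * 3600
    mb_per_tb = 1024 * 1024 if binary else 1000 * 1000
    return mb_per_hour / mb_per_tb


def per_stream_share(aggregate_mb_s, streams):
    """Average rate each stream must sustain to reach the aggregate (even split assumed)."""
    return aggregate_mb_s / streams


aggregate = 1100  # MB/s, the vendor's claimed aggregate write speed
print(round(mb_s_to_tb_per_hour(aggregate), 2))               # 3.96
print(round(mb_s_to_tb_per_hour(aggregate, binary=True), 2))  # 3.78
print(per_stream_share(aggregate, 16))                        # 68.75

single_stream = 100  # MB/s, roughly what one virtual drive reached
print(round(single_stream / aggregate, 2))                    # 0.09
```

The binary-unit figure is close to the 3.7 TB per hour quoted, and a
single ~100MB/sec stream is about 9% of the aggregate, which is why the
claim only holds with many streams running at once.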

The other complaint I hear a lot is the claim of 3:1 compression.
Almost every vendor puts that in their literature as if it is a solid
fact, and not a typical value.  I had a customer once get so mad they
almost yanked the whole box out and made the vendor take it back because
they bought a 10TB VTL, which they sized on the assumption of 3:1
compression.  Never mind that the compression they were getting on their
existing LTO tape library was only 1.2:1, they were told the VTL would do
3:1, so it should. 
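
The sizing mistake in that story is easy to put in numbers.  A minimal
Python sketch, assuming the VTL compresses the same data about as well
as the existing LTO library did (1.2:1); the function names are
illustrative only.

```python
def effective_capacity_tb(physical_tb, compression_ratio):
    """Data (TB) that fits on physical_tb of disk at a given compression ratio."""
    return physical_tb * compression_ratio


def physical_needed_tb(data_tb, compression_ratio):
    """Physical disk (TB) needed to hold data_tb at a given compression ratio."""
    return data_tb / compression_ratio


physical_tb = 10
assumed = effective_capacity_tb(physical_tb, 3.0)   # what the brochure implied
observed = effective_capacity_tb(physical_tb, 1.2)  # what the existing LTO ratio suggests

print(assumed)                                # 30.0
print(round(observed, 1))                     # 12.0
print(round(assumed - observed, 1))           # 18.0 TB shortfall
print(round(physical_needed_tb(30, 1.2), 1))  # 25.0 TB of disk to really hold 30 TB
```

Measuring the compression ratio you already get on tape and sizing from
that number avoids the surprise.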

I had another customer almost throw out IBM because they bought 12 new
3592 tape drives, and they wouldn't perform anywhere near their rated
performance.  Never mind the fact that data was coming in through a
single GigE connection, and the 12 tape drives had an aggregate
throughput rating at about times that. 

Customers looking to purchase any tape or disk technology would be wise
to ask questions about how performance numbers were achieved, and look
at their own situation to see what results they should expect. 
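
One way to "look at your own situation" is to list each stage of the
data path and find the slowest one.  A hedged Python sketch: the GigE
line rate of 125 MB/s is simple arithmetic (1 Gbit/s divided by 8), but
the per-drive rating below is a placeholder chosen for illustration and
is NOT taken from the story above.  Substitute your own numbers.

```python
def bottleneck(stages):
    """Return (name, rate_mb_s) of the slowest stage in a data path."""
    name = min(stages, key=stages.get)
    return name, stages[name]


GIGE_MB_S = 1000 / 8   # 1 Gbit/s line rate = 125 MB/s, before protocol overhead
DRIVES = 12
PER_DRIVE_MB_S = 100   # hypothetical drive rating, for illustration only

stages = {
    "network (1x GigE)": GIGE_MB_S,
    "tape drives (aggregate)": DRIVES * PER_DRIVE_MB_S,
}

name, rate = bottleneck(stages)
print(name, rate)               # network (1x GigE) 125.0
print(round(rate / DRIVES, 1))  # 10.4 MB/s per drive if all 12 run at once
```

Whatever the real drive rating, the slowest stage sets the ceiling, and
splitting that ceiling across many drives leaves each one far below its
rated speed.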

Best Regards,

John D. Schneider
Sr. System Administrator - Storage
Sisters of Mercy Health System
3637 South Geyer Road
St. Louis, MO.  63127
Email:  schnjd AT stlo.mercy DOT net
Office: 314-364-3150, Cell:  314-486-2359


-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Prather, Wanda
Sent: Friday, June 08, 2007 10:56 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] How to Incorporate a CDL into TSM environment?


...I understood restore performance suffered with a VTL - the way it has
been described to me is that, should a restore need to come from a
volume that has been destaged from disk to tape in the VTL, then a
restore of a single file from the volume  would first have to wait for
the vtl to rebuild the tape on disk? Or have I got the wrong end of the
stick?

Um.  Both.

Most VTL's are disk-only devices that emulate tape, and do not have the
staging issue you describe.

Many VTL's will make restores FASTER  because the tape mount time goes
from potentially minutes to a second or less.  (You also don't have to
worry about collocating data in a VTL, so your migration times are
generally faster as well.)

Now that goes with a caveat - you have to PIN YOUR VENDOR TO THE WALL
and get documentation about throughput rates.  ALL VTL's work about the
same way, but they all have different hardware inside the box, so you
can get drastically different results.  You can easily create a case
where restoring 1 VERY LARGE file will take longer on a slow VTL than
with fast tape (say, a TS1120, which can get more than 100MB/sec).

It depends on
       WHICH VTL you are talking about,
       the speed of the disk in it,
       the size of the cache in it,
       the speed of your SAN connection and/or HBAs
       compared to which tape drive, and
       whether you are talking about restoring lots of little files or a
       few huge ones.


A VTS (don't they make this confusing?) is an IBM-only mixture of
disk/tape that emulates tape.  It has to pull data off tape and stage it
back to disk before you can restore.  Normally the VTS is used in a
mainframe environment.

IBM also makes VTLs, the TS7510 and TS7520, for use in open
environments.  They are all disk.
