ADSM-L

Re: [ADSM-L] Frustrated by slowness in TSM 6.2

2010-10-09 21:00:59
Subject: Re: [ADSM-L] Frustrated by slowness in TSM 6.2
From: "John D. Schneider" <john.schneider AT COMPUTERCOACHINGCOMMUNITY DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Sat, 9 Oct 2010 17:59:55 -0700
Andrew,
   The crowd may be right, and the XIV may be your bottleneck for the
DB, but I wouldn't focus on that.  In your test environment, with only a
small number of backups running at once, there probably isn't all that
much database traffic generated, is there?  And not many database reads,
if much of your database should fit in memory.  Database writes should
be going to cache in the XIV, if it is as lightly loaded as you say, so
I don't see that as much of a bottleneck when only a few clients are
getting backed up.
   What kind of client backups are you testing?  Are they large file
database backups? Those can generate very good I/O throughput, because
the client is sending the data as fast as possible.  Or incremental
filesystem backups on Windows servers?  Those can generate very poor I/O
throughput, if they have to examine thousands of files for each file
that needs to be sent to the server.  Can you say with assurance that
the clients themselves are able to send more than 20-30MB/sec?
   Do you know what performance those same clients get when they back up
to your production environment? Try backing them up to their production
environment, at some time of night when the TSM server is not maxed out.
 Use that as a known starting point.  If you just want to test
throughput, and don't care about anything else:

1) Turn off client compression, if it is on.
2) Do "selective" backups of the whole filesystem, so the clients send
everything without having to make any time-consuming decisions about
what gets sent. 
3) Pick a time for the test when the client is very lightly loaded.
4) Try to pick a client with a small number of very large (multi-GB)
files, not zillions of small files.
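
To put a number on a test like the one in steps 1-4, the arithmetic is
just bytes sent divided by elapsed time.  What follows is a minimal
Python sketch, not anything TSM provides.  The function names, the
1 MB = 1024*1024 bytes convention, and the sample figures are all
assumptions for illustration; the 20 MB/sec floor is simply the low end
of the 20-30MB/sec figure mentioned above.

```python
MB = 1024 * 1024  # assumption: 1 MB = 1,048,576 bytes


def throughput_mb_per_sec(bytes_sent, elapsed_seconds):
    """Average throughput of one backup run, in MB/sec."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return bytes_sent / MB / elapsed_seconds


def exceeds_floor(rate_mb_per_sec, floor_mb_per_sec=20.0):
    """True if the measured rate is above the chosen floor (hypothetical default)."""
    return rate_mb_per_sec > floor_mb_per_sec


if __name__ == "__main__":
    # Hypothetical run: 64 GB of large files sent in 30 minutes.
    rate = throughput_mb_per_sec(64 * 1024 * MB, 30 * 60)
    print(f"{rate:.1f} MB/sec, above floor: {exceeds_floor(rate)}")
```

Take the byte count and elapsed time from whatever the client reports at
the end of the backup, and run the same calculation against the
production server to get the known starting point.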

Andrew, I know you already know these things, but I include them for the
benefit of the rest of the list.  The point I am making is to allow the
TSM client to shove data across as fast as it can, and if it performs
really well, then the device that is absorbing all that incoming data
(the DataDomain, or other disk storage pool) is performing well.  If
another client is sending zillions of files, but performing very slowly,
maybe that client is creating a lot more traffic to the database, and
that is where your bottleneck is.  In other words, different clients can
be used to show what part of the TSM server is the slowest performer.
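
The reasoning above can be written down as a rough decision rule.  This
is only a sketch of the logic in this message, with hypothetical names
and a caller-supplied target rate; it cannot tell database slowness
apart from a client that is simply slow at scanning its own files, so
treat the result as a hint about where to look next.

```python
def likely_bottleneck(large_file_rate, small_file_rate, target_rate):
    """Suggest where to look, given two test results in MB/sec.

    large_file_rate: client sending a few very large files
    small_file_rate: client sending very many small files
    target_rate:     rate you consider acceptable (an assumption you choose)
    """
    if large_file_rate < target_rate:
        # Even the easy streaming case is slow, so suspect the data path.
        return "storage pool, network, or client"
    if small_file_rate < target_rate:
        # Streaming is fine, so the per-file work is the suspect.
        return "database or client-side file scanning"
    return "no obvious bottleneck"


if __name__ == "__main__":
    # Hypothetical measurements, in MB/sec.
    print(likely_bottleneck(large_file_rate=80.0, small_file_rate=5.0, target_rate=30.0))
```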


Best Regards,

John D. Schneider
The Computer Coaching Community, LLC
Office: (314) 635-5424 / Toll Free: (866) 796-9226
Cell: (314) 750-8721



-------- Original Message --------
Subject: Re: [ADSM-L] Frustrated by slowness in TSM 6.2
From: Paul Zarnowski <psz1 AT CORNELL DOT EDU>
Date: Fri, October 08, 2010 11:37 pm
To: ADSM-L AT VM.MARIST DOT EDU

Rick,

I think their response would be something along these lines...
The XIV can perform better than other traditional arrays because the
[cache miss] I/Os are spread across so many more spindles. I get that.
But it seems to me that that can break down when the overall I/O load
gets sufficiently high, across all of the spindles. In an I/O
intensive environment such as TSM, I think this could be more likely
to happen - particularly if you are using XIV for storage pools as
well as for database volumes.

I'm still skeptical about how far it can go. I can buy that it has
good performance --- for a SATA-based product. But not compared to a
pure 15K spindle-based product. Oh, and the SATA drives are larger
than the SAS or FC drives, which doesn't help.

..Paul

At 01:57 PM 10/8/2010, Richard Rhodes wrote:
>> I would be suspicious of having the db on XIV. Do you have any FC
>> or SAS Disk you could try putting the DB on? I know XIV has lots
>> of CPU & cache, but underneath it all is still SATA. I've heard
>> Marketing types rave about how fast XIV is, even with SATA,
>> because I/O can be spread across many spindles, but I'm not
>> entirely convinced it's as good as 15k FC or SAS.
>
>This is _exactly_ what IBM has not explained, and seems unwilling to explain.
>
>Soon after IBM finalized the purchase of XIV, they had a series
>of seminars around the country (usa) about the box. This wasn't some
>little out of the way seminar . . . Moshe (inventor of the box)
>was there and gave much of the presentation. I attended one - Let's
>just say it was strange!!! They hammered on "high performance", over
>and over. They threw up one graph where they claimed 25k iops at
>3ms response time for a "cache miss" workload. Let's see, cache miss
>means having to go to the spindle to do the I/O. SATA drives come
>nowhere close to this response time. The workload was either
>not cache miss, or, they effectively short-stroked the drive such
>that the heads never moved. When I questioned this claim I
>got nowhere - just run-around.
>
>Rick


--
Paul Zarnowski Ph: 607-255-4757
Manager, Storage Services Fx: 607-255-8521
719 Rhodes Hall, Ithaca, NY 14853-3801 Em: psz1 AT cornell DOT edu
