Re: Performance

Nathan,

I did some calculations on your numbers.  16GB of 2K files
is 8,000,000 files.  Is this right?  Do you have that many on this server?
(ouch!)

Restoring them in 27 hours works out to 82 files per second.
Put another way, you have to create a new file in the client filesystem
every 12 milliseconds.  That's pretty fast.  If you wanted to do the
restore in a third of the time that would be a new file every 4 milliseconds.
Can your filesystem do this for sustained periods?
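
For anyone who wants to redo this arithmetic with their own numbers,
here is a small Python sketch.  The constants are the figures quoted in
this thread; treating 16GB as 16,000,000 KB (decimal units) is my
assumption, and the function name restore_rate is made up for
illustration.

```python
# Rough restore-rate arithmetic using the figures quoted in this thread.
TOTAL_KB = 16_000_000   # 16GB taken as 16,000,000 KB (decimal units assumed)
FILE_KB = 2             # typical file size in KB
RESTORE_HOURS = 27      # observed restore time


def restore_rate(total_kb, file_kb, hours):
    """Return (file count, files per second, milliseconds per file)."""
    files = total_kb // file_kb
    per_second = files / (hours * 3600)
    return files, per_second, 1000 / per_second


files, per_second, ms_per_file = restore_rate(TOTAL_KB, FILE_KB, RESTORE_HOURS)
print(f"{files:,} files, {per_second:.1f} files/sec, {ms_per_file:.1f} ms per file")

# Same restore done in a third of the time.
_, _, ms_fast = restore_rate(TOTAL_KB, FILE_KB, RESTORE_HOURS / 3)
print(f"in a third of the time: {ms_fast:.1f} ms per file")
```

Change the three constants to match your own server and restore window.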

If you want to pursue this, I suggest zipping up 100,000 of these files
and then unzipping them to a new spot.  This should be a clean test
of filesystem performance.  Your results would be very useful to us and
IBM.
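
A minimal sketch of such a test in Python, assuming the standard
zipfile module is an acceptable stand-in for a zip utility.  The
function name, the file names, and the default count of 1,000 are all
hypothetical; the archive is stored uncompressed so that the timing
mostly reflects file creation rather than CPU work.

```python
import os
import tempfile
import time
import zipfile


def time_small_file_extract(count=1000, size=2048):
    """Zip `count` files of `size` bytes, unzip them to a new directory,
    and return (files restored, seconds spent unzipping)."""
    payload = b"x" * size
    with tempfile.TemporaryDirectory() as work:
        archive = os.path.join(work, "small_files.zip")
        # ZIP_STORED means no compression, so unzip time is mostly file creation.
        with zipfile.ZipFile(archive, "w", zipfile.ZIP_STORED) as zf:
            for i in range(count):
                zf.writestr(f"logs/file{i:06d}.log", payload)

        target = os.path.join(work, "restored")
        start = time.perf_counter()
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(target)
        elapsed = time.perf_counter() - start

        # Count what actually landed on disk before the temp dir is removed.
        restored = sum(len(names) for _, _, names in os.walk(target))
    return restored, elapsed


restored, elapsed = time_small_file_extract()
print(f"{restored} files in {elapsed:.2f} s, "
      f"{1000 * elapsed / restored:.2f} ms per file")
```

Raise count toward 100,000 and run it on the filesystem you actually
restore to (tempfile.TemporaryDirectory accepts a dir= argument).
Short runs may be flattered by operating system write caching, so treat
small-count results as optimistic.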

--
--------------------------
Bill Colwell
C. S. Draper Lab
Cambridge, Ma.
bcolwell AT draper DOT com
--------------------------
In <777DE9DB19FAD111897E00204840162801849682 AT exchange20.usaa001 DOT com>, on 
07/23/99
   at 12:16 PM, Nathan King <nathan.king AT USAA DOT COM> said:

>Richard,

>I agree wholeheartedly with your comments. However, I think ADSM
>Development is well aware of the performance degradation caused by small
>files, as noted in the ADSM NT Server readme:

>    Next to system hardware, the number and size of client files has a
>    big impact on ADSM performance.  The performance when transferring
>    small files is lower than when transferring large files.

>    For example, using a 133 MHz Pentium, FAT file system, local ADSM
>    client with no compression, and named pipes to a disk storage pool,
>    throughput has been measured at:

>                                     File Size
>    Throughput          1KB      10KB    100KB     10MB
>    KB/Sec                57       445     1280     1625
>    GB/hr               0.20      1.53     4.39     5.58

>OK, so this is a Pentium 133, but since it is a local client you can rule
>out the network as an issue.
>Named pipes are the fastest way to back up locally on NT.

>So even if I took the hardware and made it five times faster, and supposing
>that by some magic 5 times faster hardware = 5 times faster ADSM
>performance, my 16GB restore of small 1-2KB files would still have taken
>16 hours!!

>Unacceptable!

>Nathan


>> To:   ADSM-L AT VM.MARIST DOT EDU
>> Subject:      Re: Performance
>>
>> ...
>> >However when it comes to restoring servers which are 90% small files then
>> >we may as well be on a 4Mb Token Ring. We once
>> >had the grand opportunity to restore the drive of an SMS Server which was
>> >made up of about 16Gb worth of 2Kb log files.
>> >This took 27 hrs.
>> ...
>> > when it comes to small files Adsm just plain sucks!
>> ...
>> >Any thoughts?
>>
>> Nathan - You clearly had a very frustrating, unsatisfactory restoral
>> experience.  But realize that it's not helping the many customers on the
>> mailing list to just receive a splat without any details as to how the
>> restoral was performed...like network configuration, network load at the
>> time, server system load, what else the ADSM server was doing, server
>> configuration, server options, database cache hit ratio, client load,
>> disk configuration, file system topology, ADSM client options, restoral
>> command, results from testing various combinations in your
>> configuration, etc.
>>
>> The great value in our mailing list is in discovering optimal techniques,
>> and then sharing them, to the betterment of all.  Realize that it's plain
>> frustrating for the people on the list to see complaints rather than the
>> results from measured analysis.  And it does not point to solid evidence
>> that IBM can react to in bettering the product - which will help everyone.
>>
>> Sure - there's a lot of frustration out in the customer base.  But I as a
>> customer feel that I have a responsibility both as a customer and a
>> professional in the data processing field to do some research, to prod the
>> system, change variables, and find out where the problems are.  We can
>> shower IBM with raw complaints and they can tell us about KLOC numbers,
>> but working together with specifics will really solve problems.  Yes,
>> sometimes it's necessary to shake up vendors to get them to be responsive
>> to the realities we face, but the rest of the time customers and vendor
>> need to dig in and get to the bottom of what's really wrong.  Hey, we're
>> not here as a personal pursuit: we're in company and institutional
>> positions as employees expected to effectively deal with problems for the
>> better operation of the organization so that we can all make more money!
>>
>> Naturally, not all customer sites have the resources to pursue things to
>> the depth that better-equipped sites can, and hence depend upon the
>> results that others find.  The mailing list is a dissemination and
>> discussion point for what ails us and what we can do for each other as
>> well as solicit the vendor, IBM, to improve conditions beyond the pointed
>> problems we need to bring to the Support Center.  IBM, in turn, needs to
>> be responsive to the discontent evidenced in the customer base, which is
>> the List's great value to them; and they need to feed back on optimal
>> techniques, which they often do very well in the famous Redbooks [please
>> encourage them].  From management feedback to the List, it is very
>> evident that IBM is listening; but what they particularly want to hear is
>> substantive feedback to allow them to respond as substantively.  This is
>> what we need to do.
>>
>> So channel that energy in ways that will help the general case.
>>
>>     Richard Sims, BU