ADSM-L

Re: Daily Processing Performance (very slow - any ideas?)

2002-11-08 09:55:24
Subject: Re: Daily Processing Performance (very slow - any ideas?)
From: "Thach, Kevin" <KThach AT COVHLTH DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Fri, 8 Nov 2002 09:47:39 -0500
We only use mirroring for rootvg; we don't use mirroring for anything on
the ESS.

We are using JFS.

We have our TSM allocation on the ESS as follows:

144 LUNs allocated: 18 LUNs on each of 8 LSSs.  We purchased and installed ESS
drive space for TSM and took most (if not all) of that drive space up front.
Therefore, the device/adapter/pair/array combination on each LSS for the
TSM disk space allocation is the same.  Does that make sense?

The LUN assignment looks like this:
016-19732 -- 027-19732  Pair 1, Cluster 1, Loop B, Array 2
116-19732 -- 127-19732  Pair 1, Cluster 2, Loop B, Array 1
216-19732 -- 227-19732  Pair 2, Cluster 1, Loop B, Array 2
316-19732 -- 327-19732  Pair 2, Cluster 2, Loop B, Array 1
416-19732 -- 427-19732  Pair 3, Cluster 1, Loop B, Array 2
516-19732 -- 527-19732  Pair 3, Cluster 2, Loop B, Array 1
616-19732 -- 627-19732  Pair 4, Cluster 1, Loop B, Array 2
716-19732 -- 727-19732  Pair 4, Cluster 2, Loop B, Array 1

So that breaks down to 18 LUNs, on each of 8 LSSs, on one loop per LSS. At
the time of TSM disk space allocation, there were not any other loops
available for assignment.
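As a quick sanity check on that layout (the 8GB-per-vpath figure is taken from later in this thread):

```python
# LUN layout sanity check; 8 GB per vpath is an assumption from this thread
lss_count = 8
luns_per_lss = 18
gb_per_vpath = 8

total_luns = lss_count * luns_per_lss
total_gb = total_luns * gb_per_vpath

print(total_luns)  # 144, matching the allocation above
print(total_gb)    # 1152 GB of raw vpath capacity
```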

We have some clients on Gigabit ethernet, but the majority are 100-Full. The
TSM server is on Gigabit.

By buffer pool are you referring to BUFPOOLSIZE?  If so, ours is currently
set to 262144.  Is that okay?
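If I understand the units right (BUFPOOLSIZE is specified in kilobytes), that works out to the 256MB you suggested:

```python
# TSM's BUFPOOLSIZE server option is specified in KB (assumption noted above)
bufpoolsize_kb = 262144
bufpoolsize_mb = bufpoolsize_kb // 1024
print(bufpoolsize_mb)  # 256
```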

What are you referring to by maxperm?  How can I check it/change it if it
isn't set correctly?
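For anyone following along, a sketch of how this is usually checked and changed on AIX 4.3.3 (assumes the vmtune sample binary from the bos.adt.samples fileset; the change does not survive a reboot unless re-applied, e.g. from /etc/inittab):

```shell
# Display current VMM tuning values, including minperm/maxperm
/usr/samples/kernel/vmtune

# Set maxperm to 40% of real memory (-p sets minperm the same way)
/usr/samples/kernel/vmtune -P 40
```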

Thanks very much,
Kevin

-----Original Message-----
From: Seay, Paul [mailto:seay_pd AT NAPTHEON DOT COM]
Sent: Thursday, November 07, 2002 5:12 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: Daily Processing Performance (very slow - any ideas?)


Are you using mirroring?  If so, do not; it is a waste of resources and adds
lots of overhead on BACKUP DB.  Are you using raw or JFS?  There is absolutely
no value in defining such small LUNs on the ESS.  In fact, that could be the
source of all your problems, especially on the database.  If you are using
JFS, you should use a striped set of disks throughout the ESS, 8 at a time,
one on each cluster/loop combination.  This maximizes the throughput of the
ESS.

You have a very similar environment to mine.  Ours flies.

Do you have Gigabit on the clients or 100baset?  Do you have Gigabit on the
TSM server?

First thing I would do is change your bufferpool for your DB to 256MB and
change maxperm to 40.  That will eliminate the 85% hit ratio spikes and the
paging that is probably occurring on your machine.

We back up about 2TB a day and process about 2TB per day to copies.  We start
at 7PM and finish by 7:30AM, including two copies of everything plus the
primary.  We have 16 Magstar drives.

Paul D. Seay, Jr.
Technical Specialist
Naptheon Inc.
757-688-8180


-----Original Message-----
From: Thach, Kevin [mailto:KThach AT COVHLTH DOT COM]
Sent: Thursday, November 07, 2002 12:39 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: Daily Processing Performance (very slow - any ideas?)


Okay, some of you have requested more info before trying to make a
diagnosis, so let me give you some more details.

The 500GB disk pool and the 250GB disk pool are on ESS--8GB vpaths.  I have
144 of these vpaths allocated to our TSM server.  ESS uses RAID5.  Within
TSM, the volumes for the disk pools are 10GB volumes.  So, I have 50 volumes
for the noncollocated disk pool and 25 volumes for the collocated pool.

My DB and LOG use 1GB volumes.  Cache hit % on the DB is 98.5 today, but I
have seen it as low as 85%.

We have 4 fiber adapters in the TSM server.  2 are for disk and 2 are for
tape--so tape and disk are not operating on the same adapter.

Here is the output of vmtune from my TSM server:

vmtune:  current values:

  -p       -P        -r         -R        -f       -F        -N         -W
minperm  maxperm  minpgahead maxpgahead minfree  maxfree  pd_npages maxrandwrt
 209505   838020      2          8        120      128     524288        0

  -M       -w      -k      -c       -b        -B          -u        -l     -d
maxpin  npswarn npskill numclust numfsbufs hd_pbuf_cnt lvm_bufcnt lrubucket defps
838841   16256   4064      1        93       2128         9       131072    1

        -s             -n        -S         -L         -g           -h
sync_release_ilock  nokilluid  v_pinshm lgpg_regions lgpg_size strict_maxperm
        0               0         0          0          0           0

number of valid memory pages = 1048551   maxperm=79.9% of real memory
maximum pinable=80.0% of real memory     minperm=20.0% of real memory
number of file memory pages = 838012     numperm=79.9% of real memory
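To relate the raw page counts in that output to the percentage summary at the bottom, divide each by the valid-page count:

```python
# Page counts copied from the vmtune output above
valid_pages = 1048551    # number of valid memory pages
minperm_pages = 209505
maxperm_pages = 838020
numperm_pages = 838012   # pages currently holding file data

for name, pages in [("minperm", minperm_pages),
                    ("maxperm", maxperm_pages),
                    ("numperm", numperm_pages)]:
    print(f"{name} = {100 * pages / valid_pages:.1f}% of real memory")
```

This reproduces the 20.0%/79.9%/79.9% figures vmtune reports, and shows that file pages (numperm) are already pressing against the maxperm ceiling.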

Thanks again.  If you need more info still, just let me know. Kevin

-----Original Message-----
From: Mark D. Rodriguez [mailto:mark AT MDRCONSULT DOT COM]
Sent: Thursday, November 07, 2002 11:19 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: Daily Processing Performance (very slow - any ideas?)


Thach, Kevin wrote:

>I'm very disappointed with the performance of our TSM environment, and I
>was curious what kinds of numbers some of you with similar environments
>have experienced.  I've worked extensively with IBM to try and tune
>things, but apparently we've got everything adjusted correctly.  We are
>in the process of cleaning up System Object stuff, so I'm wondering if
>I should expect things to improve dramatically once we trim all that
>fat.
>
>I apologize for the length of the post, but I want to include as much
>info as possible.  I'm in dire need of a solution.
>
>Our basic setup:
>
>IBM 6H1 - 6 Processors / 4GB RAM
>TSM 4.2.3 on AIX 4.3.3-09
>
>LTO 3584 Tape Library with 10 drives
>Fiber-arbitrated loop going through McData ES-1000 switches and McData
>6064 Directors
>
>500GB Non-Collocated Disk Pool on ESS
>250GB Collocated Disk Pool on ESS
>37GB TSM database
>
>Approximately 200 Clients (mixture of AIX and WinNT/2K) running at
>various client versions
>
>200GB total / night backed up on average.
>
>Daily Processing is slow, slow, slow.
>
>Here are the steps for our daily processing (it's all scheduled, but I'm
>just showing you what runs when):
>
>1) 7:00:00 - Daily processing starts
>backup stg nocodisk copypool maxproc=4 wait=yes
>backup stg colodisk copypool maxproc=4 wait=yes
>
>2) Once that is finished, the migrations start (I have the maxproc on
>both pools set to 5):
>update stg nocodisk hi=0 lo=0
>update stg colodisk hi=0 lo=0
>
>3) Once Migration is finished
>update stg nocodisk hi=90 lo=70
>update stg colodisk hi=90 lo=70
>backup stg nocotape copypool maxproc=3 wait=yes
>backup stg colotape copypool maxproc=3 wait=yes
>
>4) Once that is finished
>expire inventory
>
>5) Once that is finished
>backup db devclass=ltotape type=full
>
>6) Then
>backup volhist
>backup devconfig
>prepare
>
>So, the big disappointment is on steps 1 and 2.  Our disk to tape
>performance averages about 20GB/hour per tape drive.  If I reduce the
>number of mount points, that number goes down even more.  Are LTOs really
>this slow?  IBM says these suckers will do 50-100GB/hour.  With 10
>drives, we were told we could handle about 1-2 TB/day, and we're only
>dealing with 200GB, and the entire daily processing takes more than 6
>hours!!!
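A back-of-envelope check on the figures quoted above (a hypothetical estimate, not TSM output; it assumes the nightly 200GB is written to tape roughly three times -- copy pool backup, migration, and tape copy pool backup -- and that about four drives stream concurrently, per the maxproc settings in the schedule):

```python
# Hypothetical throughput estimate from the figures in this post
gb_per_hour_per_drive = 20   # observed disk-to-tape rate
nightly_gb = 200             # average nightly backup volume
passes = 3                   # copypool backup + migration + tape copypool
drives_in_use = 4            # assumption: roughly maxproc drives streaming

data_moved_gb = nightly_gb * passes                          # ~600 GB to tape
hours = data_moved_gb / (gb_per_hour_per_drive * drives_in_use)
print(hours)  # 7.5 hours at the observed per-drive rate
```

So at 20GB/hour per drive the 6+ hour window is roughly what the arithmetic predicts; the real question is why each drive streams so far below LTO's rated speed.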
>
>At one time we were using disk caching, and I was told that slowed down
>disk to tape performance, so we turned it off.  I saw a slight improvement,
>but nothing major.  The noncollocated disk pool still has data in it
>that has not expired yet since I turned off caching.  Could that still
>be slowing things down if the pool isn't completely flushed?
>
>What can I really expect to get as far as performance?  How long should
>daily processing really take for only 200GB worth of data?
>
>Any help is greatly, greatly appreciated!
>
>Thanks!
>-Kevin
>
>This E-mail contains PRIVILEGED AND CONFIDENTIAL INFORMATION intended
>only for the use of the Individual(s) named above.  If you are not the
>intended recipient of this E-mail, or the employee or agent responsible
>for delivering it to the intended recipient, you are hereby notified
>that any dissemination or copying of this E-mail is strictly
>prohibited.  If you have received this E-mail in error, please
>immediately notify us at (865) 374-4900 or notify us by E-mail at
>hdesk AT covhlth DOT com.
>
>
Kevin,

Although you have given us a lot of good info, I think we might need some
more.  I suspect that the performance problems may be related to file I/O
performance.  I would like to know how your disk storage pools, DB, and log
files are laid out.  Remember, file I/O tuning is like real estate: there are
only three rules, location, location, location!  Well, that might not be
completely true, but anyway.  Give us layout info like RAID levels, array
layouts, and connection info, and whether you are using vpath, etc.  Also,
send the output of the vmtune command with no parms.  There has been a lot
of discussion on the list about tuning, and your environment is fairly
common.  Paul Seay has done a lot of stuff with the ESS as well, so he will
probably have some suggestions.  Once we have the info we will probably be
able to help.

--
Regards,
Mark D. Rodriguez
President MDR Consulting, Inc.

===============================================================================
MDR Consulting
The very best in Technical Training and Consulting.
IBM Advanced Business Partner
SAIR Linux and GNU Authorized Center for Education
IBM Certified Advanced Technical Expert, CATE
AIX Support and Performance Tuning, RS6000 SP, TSM/ADSM and Linux Red Hat
Certified Engineer, RHCE
===============================================================================
