Subject: Re: Optimal VMTUNE Guidelines for a TSM Server
From: "Seay, Paul" <seay_pd AT NAPTHEON DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Mon, 16 Sep 2002 21:40:50 -0400
Mark, I am sure your recommendations will bring rain.  But, it is the most
definitive response to this question yet.  This thread is going to be a gold
mine when done.  I bet there are hundreds of TSM servers that could benefit
from a little tuning in this area.

We need to develop a simple calculator for the dedicated TSM server that
generates a default vmtune recommendation to start with.  This could be done
with a simple shell script.
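
Something along these lines, perhaps (the dsmserv footprint and slack
numbers are placeholders, and the arithmetic is only a rough heuristic
built backwards from the 2GB example quoted below):

    #!/bin/ksh
    # Sketch of a starting-point calculator; a heuristic, not a vetted
    # formula.  Reserve the dsmserv footprint plus some slack out of
    # real memory, and give file pages (maxperm) the percentage left.
    DSMSERV_MB=${1:-400}   # overall dsmserv requirement in MB (placeholder)
    SLACK_MB=${2:-300}     # headroom for AIX and everything else (placeholder)
    REALMEM_KB=`lsattr -El sys0 -a realmem -F value`
    TOTAL_MB=$(( REALMEM_KB / 1024 ))
    MAXPERM=$(( (TOTAL_MB - DSMSERV_MB - SLACK_MB) * 100 / TOTAL_MB ))
    echo "Real memory ${TOTAL_MB}MB: suggested starting maxperm ${MAXPERM}%"
    echo "e.g. /usr/samples/kernel/vmtune -P ${MAXPERM}"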

Putting on my hat from 25 years of I/O and memory management experience with
MVS, and now having had it clearly explained how all of these knobs interact,
along with some other stuff that I have read, I now have a starting place.
And, as always, you change one knob at a time (unless they are dependent on
each other), then measure, then make the next adjustment.

At the end of the day, the hum from this machine should be heard around the
world.

Thanks, everyone, for the input.  Keep the thread going.  I will provide some
feedback on what seems to work in my very high-end environment: 6H1, Shark
disk, Magstar tape, all fibre channel.

Paul D. Seay, Jr.
Technical Specialist
Naptheon Inc.
757-688-8180


-----Original Message-----
From: Mark D. Rodriguez [mailto:mark AT MDRCONSULT DOT COM]
Sent: Monday, September 16, 2002 8:46 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: Optimal VMTUNE Guidelines for a TSM Server


Seay, Paul wrote:

>I am trying to figure out what these are.  The defaults are not good on
>a large server.
>
>The suggestion is to figure out how much memory dsmserv needs and then
>work from there.
>
>So let's take the example of a 2GB server with a buffer pool of 256MB and
>an overall memory requirement of 400MB.  That would make you think
>there is about 1.6GB left around.  Problem is the default filesystem
>maxperm is 80% of the 2GB or about 1.6GB.  This would mean nothing left
>for the rest of the processes, thus lots of paging.  I am thinking a
>buffer of about 128MB should be in there.  So, in this case, maybe set
>maxperm to 65%.
>
>The real question is what other vmtune knobs should be considered in a
>TSM server.  The IO prefetch, large or small?  Is there a book on how
>to do this?
>
>ETC
>ETC.
>
>Paul D. Seay, Jr.
>Technical Specialist
>Naptheon Inc.
>757-688-8180
>
>
Paul,

You are asking very interesting questions.  I teach the AIX performance
tuning class, and we spend the better part of a day discussing VMM.  Needless
to say, I can't review all of that in just one note.  However, I do want to
take some time to explain the theory behind this in order to justify my
settings.  Also, you may want to choose different values based on your
environment.  The important thing is to see the big picture here and to
realize that AIX VMM does not work like any other OS's virtual memory
management.  In addition, before adjusting anything with vmtune, document
your current configuration.  Also, remember that vmtune settings do not
survive a reboot, so add the command to inittab.  However, I would run it by
hand and test for quite some time before adding it to inittab, since you can
easily cripple a machine with this command.  Now, having said that, let me
see if I can't help you with your questions.
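
For example (the output file name and the inittab entry values are
illustrative; use whatever settings you end up with after testing):

    # Record the current settings first; vmtune run with no arguments
    # just prints the current values.
    /usr/samples/kernel/vmtune > /tmp/vmtune.before

    # Only once you are happy with the values, make them survive a
    # reboot with an inittab entry.
    mkitab "vmtune:2:once:/usr/samples/kernel/vmtune -p 10 -P 40"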

First, it is important to understand what MaxPerm (-P) and MinPerm (-p)
really are.  Within the vmtune command, MaxPerm and MinPerm are used to set
the ratio of "computational pages" vs. "permanent pages" that are kept in
real memory at any given time.  Computational pages comprise your code,
work, and library segments, i.e. all the things that are needed for the
process to run.  Permanent pages are considered to be those pages from
persistent segments that are essentially open files (but not including open
files for the code itself).  And before you internals gurus jump on me: yes,
there are some pages that are neither computational nor permanent, but for
this discussion we can leave them out.  You mentioned that you were setting
a DB Buffer Pool of 256MB.  This area will be part of the process's working
segment (sometimes called the private or data segment).  You can see a
process's "computational pages" with the "ps avx" command by looking at the
"RSS" column, or look under "SIZE" to see the virtual size of the working
segment.

Now let's look at what happens when we read a DB page.  AIX's LVM and file
I/O algorithms will get the file page into a persistent segment referenced
by the ITSM server process.  The server process then reads it into its DB
Buffer pool, which is part of the server process's working segment.  That
means the data is in memory in two different places at that point: the
server process will manage the copy in the DB Buffer pool, and VMM will
manage the copy in the persistent segment.  VMM will decide when to steal
that page (note: it will not get paged out; only working pages go to page
space) based on its page stealer algorithm, which is partially tunable by
the values set with vmtune.  Your goal is to keep the DB Buffer pool in
memory and not have it page out.  Therefore, I set my MaxPerm and MinPerm
values very low, i.e. 40% and 10% respectively.  You could probably be even
more aggressive, but this is a safe starting place.  This will place a much
higher priority on keeping "computational pages" in memory.

If your box is a dedicated ITSM server, then this type of tuning also suits
the other activities on your box.  When file I/O is such that you access a
file page one time and then don't access it again, it is best to keep as few
file I/O pages (i.e. permanent pages) in memory as possible, which leaves
more room for "computational pages".  This is exactly what ITSM is doing,
with the exception of the ITSM DB, but that is being buffered by the DB
Buffer pool.

You asked about other tunables and some starting points there.  You can look
at your MaxPgAhead (-R) and MinPgAhead (-r) values.  I have found little
value in setting these arbitrarily large, but increasing them above the
defaults is useful.  Set your MaxPgAhead value so that you can fill all of
the disk queues in a single read-ahead, e.g. for an 8-disk RAID array with a
queue depth of 4 for each drive, MaxPgAhead=32.  MinPgAhead should be set so
you get to the max value in 2 or 3 steps; in this case I would set it to 4.
I have seen people set these numbers really high, but I have not measured
any true improvement when that was done.  When you adjust your MaxPgAhead
value, you should adjust the MaxFree (-F) and MinFree (-f) values so that
the difference between them is at least MaxPgAhead; typically I raise the
MaxFree value.
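
For that example (the defaults are minfree=120 and maxfree=128, both
page-ahead values must be powers of 2, and the drive name is illustrative):

    # Confirm the per-drive queue depth first.
    lsattr -El hdisk2 -a queue_depth

    # minpgahead 4, maxpgahead 32, and maxfree raised so that
    # maxfree - minfree >= maxpgahead.
    /usr/samples/kernel/vmtune -r 4 -R 32 -f 120 -F 152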

MaxRandWrt (-W) can be adjusted if DB writes seem sporadic, i.e. if
periodically (like once per minute) DB writes seem to take a long time.  You
can start with this value between 32 and 128.  If you set this value, it
will be better to also enable SyncReleaseInodeLock, since your DB writes
will be much more frequent.
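
For example (64 is just a middle-of-the-road starting value from the range
above):

    # Write-behind kicks in after 64 random pages, and the inode lock
    # is released while syncing a file (-s 1 = sync_release_ilock on).
    /usr/samples/kernel/vmtune -W 64 -s 1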

If your server is doing constant page-outs to page space (the vmstat "po"
column), then you can set "-d 1", which will change the page space
allocation algorithm to one more favorable for your environment.
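
For instance (5-second samples; sustained nonzero "po" is the symptom to
look for):

    # Twelve 5-second samples; watch the po column.
    vmstat 5 12

    # If page-outs are constant, switch the allocation policy as
    # described above.
    /usr/samples/kernel/vmtune -d 1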

Other values to look at may provide slight to moderate improvements.  There
is a max_coalesce value for some RAID adapters; if you have one, you can set
the value to 65K to 128K for starters.  This will group several small I/O
reads together into larger single reads.  Also, for non-IBM SCSI disk
arrays, look at the "num_cmd_elems" attribute on the SCSI adapter that
drives the array.  Make sure that this is large enough to handle the total
of all of the drives' queue-depth values.
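
For example (the adapter and disk names are illustrative, and -P defers the
change to the next reboot):

    # Check the adapter's current command elements and an example
    # drive's queue depth.
    lsattr -El scsi1 -a num_cmd_elems
    lsattr -El hdisk4 -a queue_depth

    # Raise it to at least (number of drives x queue_depth); 256 here
    # is illustrative.
    chdev -l scsi1 -a num_cmd_elems=256 -P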

As you can see, this is a topic that can go on and on and on!  I hope this
gives you some insight into it.  I would be glad to talk to you in person
about this if you like; just send me an e-mail directly and we can hook up.
I would enjoy talking with you about several ITSM topics.  I must thank you
and tell you I appreciate all that you contribute to this list.

--
Regards,
Mark D. Rodriguez
President MDR Consulting, Inc.

===========================================================================
MDR Consulting
The very best in Technical Training and Consulting.
IBM Advanced Business Partner
SAIR Linux and GNU Authorized Center for Education
IBM Certified Advanced Technical Expert, CATE
AIX Support and Performance Tuning, RS6000 SP, TSM/ADSM and Linux Red Hat
Certified Engineer, RHCE
===========================================================================