ADSM-L

Re: TSM Server Paging: Need an AIX Paging Expert to Help

2002-09-15 19:24:49
Subject: Re: TSM Server Paging: Need an AIX Paging Expert to Help
From: Dan Foster <dsf AT GBLX DOT NET>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Sun, 15 Sep 2002 23:16:07 +0000
Hot Diggety! Seay, Paul was rumored to have written:
> Paging is a bad thing, but why would an AIX system page (like crazy) under
> this configuration?

Well, it depends on what kind of paging it is. Some kinds of paging are
harmless, and others are downright bad.

> P660-6H1 2GB Memory
> dsmserv using 384MB
> Swap space is 3GB an only 12% used.
> Nothing else seems to use much memory based on a ps -efl

12.x% used is about right -- that's 368-ish MB (12% of 3GB), so that's
probably the VMM memory-backing of the dsmserv memory allocation. That also
suggests that nothing else has been allocated paging space pages, which is
about what I would expect for your setup. It's most unlikely that anything
is actually swapping.
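
A quick sanity check of that arithmetic, as a rough sketch. The function
name is made up for illustration, and it assumes 3GB means 3072MB:

```shell
# Hypothetical helper: MB of paging space in use, given the total size in MB
# and the percent-used figure (as reported by lsps -a).
paging_used_mb() {
    total_mb=$1
    pct_used=$2
    # Integer arithmetic is fine for a ballpark figure.
    echo $(( total_mb * pct_used / 100 ))
}

paging_used_mb 3072 12   # 3GB of paging space at 12% used; prints 368
```

That lands within a few MB of the 384MB dsmserv is reported to be using.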

# lslpp -l bos.rte.libc
# vmstat 1 120
# iostat 1 120
# lsps -a
# ps -efl

are the usual stuff I'd run first. I'd like to see the output of these
five commands -- you're more than welcome to email it my way if you'd
like (iostat output may be a little large).
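
If it helps, something along these lines would capture each command's
output to its own file for mailing. This is only a sketch: save_output and
the /tmp/paging directory are names I picked, not anything standard.

```shell
# Hypothetical helper: run a command and save its stdout and stderr
# to <dir>/<name>.txt.
save_output() {
    dir=$1
    name=$2
    shift 2
    mkdir -p "$dir" || return 1
    "$@" > "$dir/$name.txt" 2>&1
}

# Intended use on the AIX box (commented out so the block is harmless
# to paste elsewhere):
#   save_output /tmp/paging lslpp  lslpp -l bos.rte.libc
#   save_output /tmp/paging vmstat vmstat 1 120
#   save_output /tmp/paging iostat iostat 1 120
#   save_output /tmp/paging lsps   lsps -a
#   save_output /tmp/paging ps     ps -efl
```

The vmstat and iostat runs each take about two minutes, so running them in
two separate sessions at the same time gives samples that line up.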

There are certain known bugs, such as the one in bos.rte.libc version
4.3.3.77, which has a very nasty memory leak that, over time, triggers this
sort of behavior. Probably not the case here, though. Doesn't hurt to check,
anyway.
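
A minimal sketch of that check, assuming the fileset name and level show up
as the first two fields of a row in the lslpp -l output (libc_is_suspect is
a name I made up):

```shell
# Reads lslpp -l output on stdin. Exit status is 0 if bos.rte.libc is
# listed at level 4.3.3.77, and 1 otherwise.
# Assumption: fileset name is field 1 and its level is field 2.
libc_is_suspect() {
    awk '$1 == "bos.rte.libc" && $2 == "4.3.3.77" { hit = 1 }
         END { exit !hit }'
}

# Intended use on the AIX box:
#   lslpp -l bos.rte.libc | libc_is_suspect && echo "leaky libc level installed"
```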

Also, you're characterizing this as excessive paging, but I need to see
evidence of that, since the evidence should also point at the culprit.

Just to verify: neither you nor anyone else has modified the default vmtune
settings? (vmtune is a special utility used to modify the low-level kernel
policy for various VMM tunables.) The default vmtune settings say that 80%
of physical memory is reserved for the unified buffer cache (program+file
data) and 20% is reserved for 'pinnable' (can't move) memory.

If these were altered, it's possible to run into excessive paging if the
wrong values were chosen. (That's also why IBM doesn't publicize or
encourage use of this utility by the uninitiated. It's usually altered only
for large databases.)

Even if you see 'nothing' in the vmstat output, it *does* give hints in
certain fields that tell us whether it's actually paging or just a
"false alarm" (such as the context switches/sec, free page searches,
I/O wait, etc.).
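
As a rough first pass over the vmstat output, the sketch below counts how
many intervals show any paging-space page-ins or page-outs (the pi and po
columns). The function name is made up, and it finds the columns by their
header labels rather than by position, so it reports zero intervals if the
output has no pi/po headers. Keep in mind the first data row is normally a
since-boot average rather than a one-second sample.

```shell
# Hypothetical helper: reads vmstat output on stdin and reports how many
# sample rows it saw and how many had nonzero pi or po.
count_paging_intervals() {
    awk '
        # Until the column header is found, look for the pi and po labels.
        !found {
            pi = 0; po = 0
            for (i = 1; i <= NF; i++) {
                if ($i == "pi") pi = i
                if ($i == "po") po = i
            }
            if (pi && po) found = 1
            next
        }
        # Data rows start with a number (the r column).
        $1 ~ /^[0-9]+$/ {
            total++
            if ($pi + $po > 0) paging++
        }
        END { printf "intervals=%d paging=%d\n", total, paging }
    '
}

# Intended use on the AIX box:
#   vmstat 1 120 | count_paging_intervals
```

A paging count at or near zero would point toward the "false alarm" case;
the other fields (sr, cs, wa) still need a look by eye.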

6H1s are real nice machines, incidentally. :) Same machine we're running
the TSM 4.2/5.1 stuff on. I've set up the service processor to watch for a
special 'halt NOW' string on the serial port so I can basically power it off
and on remotely, and use a number of the other nice 6H1 features such as
slot power on/off (plus TSM's enable/disable individual drives) to work on
specific cables/drives without halting the entire service and application.

Final question: exactly what commands did you run and what output did
you get that caused you to determine you had excessive paging? Are you
seeing I/O slowdowns? (I'm guessing this may be related to your other
recent mail on abnormally slow expiration issues.)

-Dan