Subject: Re: [nv-l] appl queue size question
From: James Shanks <jshanks AT us.ibm DOT com>
To: nv-l AT lists.us.ibm DOT com
Date: Wed, 25 May 2005 10:23:16 -0400

Scott,

I hesitate to say it, but the phrase, "Luke, you are messing with powers
you cannot possibly understand," comes to mind.
(Guess which movie we saw recently?)   And I'll apologize now for that
feeble attempt at humor, while I attempt to answer your question.  Of
course, you can understand, once someone explains what you are actually
looking at.  So here goes.

Basically, you have a maximum application queue size of 5000 events, period.
The 55042 is a process id, and is irrelevant.  That's all the trace tells you
at this time, except that the queues are not backed up, since you are
seeing one event being added and then immediately deleted.  Running this
script when you actually have a problem with events being behind might tell
you how close the appl queues are to being full, but running it now, when
you don't have a problem, tells you nothing much.  By itself, this script
is not a performance analysis tool, but only a diagnostic aid.
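
As a rough illustration of reading those trace lines, here is a small shell
sketch that pulls the bracketed id, the current queue size, and the maximum
out of each "appl queue size" line and prints how full the queue is.  The
function name appl_queue_usage is made up for this example, and it assumes
each trace entry sits on a single line in the trace file, in the same
layout as the output quoted below.

```shell
# Summarize "appl queue size" lines from a trapd trace read on stdin.
# Output, one line per entry: <id> <queued> <maximum> <percent full>
# appl_queue_usage is a hypothetical helper name, not part of NetView.
appl_queue_usage() {
    awk '/appl queue size/ {
        id = ""; size = ""; max = ""
        for (i = 1; i <= NF; i++) {
            # The bracketed field, e.g. [55042], identifies the application
            if ($i ~ /^\[[0-9]+\]$/) { id = substr($i, 2, length($i) - 2) }
            # The numbers follow the words "size" and "maximum"
            if ($i == "size")        { size = $(i + 1) }
            if ($i == "maximum")     { max = $(i + 1) }
        }
        if (max > 0) printf "%s %d %d %.1f\n", id, size, max, 100 * size / max
    }'
}
```

It could be run as "appl_queue_usage < /usr/OV/log/trapd.trace" while events
are actually backed up; on a healthy system it will just report queues that
are close to empty.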

Now, since the default application queue size in trapd is 2000 events,
yours has already been changed at least once and is more than double the
usual amount.  Apparently someone has been tuning this before.  So what
problem are you trying to solve?  What symptoms are you seeing?

This queue size determines how many events trapd will queue for a connected
application which is not responding (or responding too slowly) before he
closes its socket connection to him.  He does so in order to avoid his
own demise from lack of storage.  Usually, the only reason to alter this
size is that you have periodic trap storms, so the connected applications
get a whole bunch of traps all at once and, after the storm subsides,
have a lot to do to catch up.  So you raise the size of the
queues to hold more events so they can do that.  Otherwise, they get forced
off and all the events in the queue for them are discarded.  Sometimes that
really is the best thing to do, let them get forced off, and sometimes not.
It's a trade-off.  If they don't get forced off, then they get backed up,
and it may take a while for them to catch up.
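
To make the trade-off concrete, here is a back-of-the-envelope sketch in
shell.  It is not a sizing tool, and every rate in it is a made-up
assumption: it supposes events arrive at a constant rate during a storm and
that the application drains them at a constant, slower rate.  Real rates are
rarely known in advance.  The function name storm_backlog is hypothetical.

```shell
# Hypothetical model of one application's queue during a trap storm.
# Args: storm_rate  events per second arriving during the storm (assumed)
#       drain_rate  events per second the application consumes (assumed)
#       storm_secs  how long the storm lasts, in seconds (assumed)
#       queue_max   configured application queue size
storm_backlog() {
    storm_rate=$1; drain_rate=$2; storm_secs=$3; queue_max=$4
    # Backlog grows by the difference between arrival and drain rates
    backlog=$(( (storm_rate - drain_rate) * storm_secs ))
    if [ "$backlog" -lt 0 ]; then
        backlog=0
    fi
    if [ "$backlog" -gt "$queue_max" ]; then
        echo "peak backlog $backlog exceeds $queue_max: appl would be forced off"
    else
        echo "peak backlog $backlog fits in $queue_max: appl must catch up"
    fi
}
```

For example, "storm_backlog 300 100 20 2000" reports a peak backlog of 4000
events, which would overflow a 2000-event queue, while the same storm with a
queue size of 5000 would fit and leave the application to work through the
backlog afterwards.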

Unfortunately, there is no tool I know of which can tell you how big you
should make the application queue size if you don't want the appls forced
off.  And I should know.  I'm responsible for trapd maintenance.  Like
most tuning issues, picking an application queue size other than the
default is a trial-and-error business.

James Shanks
Level 3 Support  for Tivoli NetView for UNIX and Windows
Tivoli Software / IBM Software Group


                                                                           
From:     "Bursik, Scott {PBSG}" <Scott.Bursik@pbsg.com>
Sent by:  owner-nv-l@lists.us.ibm.com
Date:     05/25/2005 09:15 AM
To:       "'Nv-L (nv-l AT lists.us.ibm DOT com)'" <nv-l AT lists.us.ibm DOT com>
Subject:  [nv-l] appl queue size question
Please respond to nv-l
                                                                           




All,

I am having some performance issues with my production NetView server and
in an effort to diagnose the issue I ran a script that someone from the
forum contributed a while back. It checks the appl queue.

When I run the script I get the following output:

Turning on trapd tracing
Starting tracing now....
Toggling trace mode of SNMP trap daemon
Waiting for one minute---------------------------|
.................................................
Stopping Tracing....
Toggling trace mode of SNMP trap daemon
Getting trapd status from /usr/OV/log/trapd.trace
Wed May 25 08:04:17 2005 send_to_all_appls: [0] appl queue size 1 of maximum 5000 events
Wed May 25 08:04:17 2005 send_to_all_appls: [55042] appl queue size 1 of maximum 5000 events

Should I be concerned with the last line? If I am reading this correctly I
am configured for a max of 5000 events and I have a queue size of 55042. I
would say that the appl queue size needs to be changed. We have a very
large environment.


Here is the script so you can see what it is doing:

#!/usr/bin/ksh
clear
echo > /usr/OV/log/trapd.trace
echo "Turning on trapd tracing"
        echo ""
        echo ""
echo "Starting tracing now...."
/usr/OV/bin/trapd -T
        echo ""
        echo ""
# Progress indicator
while :; do
        sleep 1
        echo ".\c"
done &
Progress=$!
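# Stop the progress indicator if the script is interrupted or terminated.
# The action must come before the signal list, and signal 9 (KILL) cannot
# be trapped, so INT and TERM are used here.
trap "kill $Progress; exit 1" INT TERM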
echo "Waiting for one minute---------------------------|"
sleep 50
kill $Progress
        echo ""
        echo ""
echo "Stopping Tracing...."
/usr/OV/bin/trapd -T
        echo ""
        echo ""
echo "Getting trapd status from /usr/OV/log/trapd.trace"
tail /usr/OV/log/trapd.trace | grep "appl queue size" 

Thank You!

Scott Bursik




