
Re: [nv-l] T netmon-related Application reached maximum number of outstanding events, disconnecting from trapd.

2003-10-27 17:33:30
Subject: Re: [nv-l] T netmon-related Application reached maximum number of outstanding events, disconnecting from trapd.
From: James Shanks <jshanks AT us.ibm DOT com>
To: nv-l AT lists.us.ibm DOT com
Date: Mon, 27 Oct 2003 17:24:34 -0500

Oh yes.  The "don't log" occurs only after the trap has been completely processed.  The "don't display" happens only in nvevents -- the event window still gets that trap, as do any other trap receivers (nvcorrd, nvserverd, etc.) that are registered for it (in fact, you can open a dynamic window to display all those you marked as "don't log or display"!).  All the overhead is incurred before they are suppressed.  Everybody sees them except the humans.

If you want to filter traps out so that they don't get processed by trapd or any other standard NetView daemons, then you should implement MLM.  midmand has a trap destination table and filter mechanism that lets you toss away anything you don't want there and forward only what you do want to trapd.
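
To make the difference concrete, here is a toy model of the two approaches. It is only a sketch: the function names and trap names are hypothetical, and it does not use any NetView or MLM API. It just counts how many traps get fully processed in each case.

```python
def process_without_filter(traps, suppressed_ids):
    """Model of "don't log or display": every trap is fully processed
    and only afterwards hidden from the human-visible output."""
    processed = 0
    shown = []
    for trap_id in traps:
        processed += 1                  # processing cost is paid for every trap
        if trap_id not in suppressed_ids:
            shown.append(trap_id)       # suppression is decided only at the end
    return processed, shown


def process_with_upstream_filter(traps, dropped_ids):
    """Model of filtering before trapd: unwanted traps are never forwarded,
    so they are never processed at all."""
    forwarded = [t for t in traps if t not in dropped_ids]
    return process_without_filter(forwarded, suppressed_ids=set())


# Hypothetical flood: 3 interesting traps and 97 noisy ones.
flood = ["linkDown"] * 3 + ["noisy"] * 97
print(process_without_filter(flood, {"noisy"}))        # 100 traps processed
print(process_with_upstream_filter(flood, {"noisy"}))  # 3 traps processed
```

In both cases the humans see the same three traps; only the upstream filter reduces the work done downstream.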

James Shanks
Level 3 Support  for Tivoli NetView for UNIX and Windows
Tivoli Software / IBM Software Group



"Alan E. Hennis" <Hennis_Alan_E AT cat DOT com>
Sent by: owner-nv-l-digest AT lists.us.ibm DOT com

10/27/2003 04:56 PM
Please respond to nv-l

       
        To:        nv-l AT lists.us.ibm DOT com
        cc:        
        Subject:        Re: [nv-l] T netmon-related Application reached maximum number of outstanding events, disconnecting from trapd.




James

Again, another very clear and concise explanation.

Since I am not running anything other than a standard NetView install right
out of the box, I am going to assume that my box is being flooded with traps
from some rogue device on my network. Would it be a correct guess that even
though I have some traps set to "don't log or display", they still need
to be processed and therefore can cause the queue to overrun?


Thanks
Alan E. Hennis
Caterpillar Inc.
Systems+Process Division
309.494.3308
hennis_alan_e AT cat DOT com


                                                                                                                                     
James Shanks <jshanks AT us.ibm DOT com>
Sent by: owner-nv-l-digest@lists.us.ibm.com

10/27/2003 02:43 PM
Please respond to nv-l

        To:        nv-l AT lists.us.ibm DOT com
        cc:
        Subject:        Re: [nv-l] T netmon-related Application reached maximum number of outstanding events, disconnecting from trapd.
                                                                                                                                     








The first thing to notice is that a netmon-related application may or may
not be netmon himself.  It might be another trap receiver using the same
API as netmon.
But what it means is the same no matter whom it is about.

That trap is issued by trapd whenever he forces a connected application to
disconnect.  Every connected application gets an internal queue for trapd
to put events on when they need to be sent.   When an application cannot
process the traps sent to him fast enough to keep up with the rate at which
they are being processed by trapd, the queue will grow.  When the queue
reaches the maximum size, trapd forcibly disconnects that application to
save himself from running out of storage.  The queue is emptied when the
application is disconnected.  The default size of this queue in current
code is 2000.    You can increase this in serversetup  (application queue
buffer size).   How big you should make it is a tuning and performance
issue.  All connected applications (snmpcollect, ipmap, and so on) get the
same size, whatever it is.  So you are using more system memory by raising
the limit.
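
The queueing behaviour described above can be modelled with a small simulation. This is only a sketch under stated assumptions: the names `AppQueue`, `deliver`, `consume` and `simulate` are hypothetical and are not part of any NetView API, and the per-tick rates are made up. Only the default limit of 2000 and the disconnect-and-empty behaviour are taken from the explanation above.

```python
from collections import deque

DEFAULT_MAX_QUEUE = 2000  # default application queue size quoted above


class AppQueue:
    """Hypothetical model of the per-application queue kept by trapd."""

    def __init__(self, name, max_size=DEFAULT_MAX_QUEUE):
        self.name = name
        self.max_size = max_size
        self.events = deque()
        self.connected = True

    def deliver(self, event):
        """Queue one event; return False if the application is, or has just
        been, disconnected."""
        if not self.connected:
            return False
        self.events.append(event)
        if len(self.events) >= self.max_size:
            # Queue reached the limit: force a disconnect and empty the queue.
            self.connected = False
            self.events.clear()
            return False
        return True

    def consume(self, count):
        """The application reads up to `count` events; return how many it took."""
        taken = min(count, len(self.events))
        for _ in range(taken):
            self.events.popleft()
        return taken


def simulate(arrivals_per_tick, reads_per_tick,
             max_size=DEFAULT_MAX_QUEUE, ticks=1000):
    """Return the tick at which the application is disconnected, or None."""
    app = AppQueue("slow-receiver", max_size)
    for tick in range(1, ticks + 1):
        for n in range(arrivals_per_tick):
            if not app.deliver((tick, n)):
                return tick
        app.consume(reads_per_tick)
    return None
```

With these made-up rates, an application that reads 50 events per tick while 100 arrive is disconnected at tick 39 with the default limit (`simulate(100, 50)`), and at tick 399 with a limit of 20000. An application that keeps up (`simulate(50, 50)`) is never disconnected.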

If this disconnection is a common occurrence, then you should increase the
queue size.  How big can you go?  Well, I have seen people run with sizes
ten times as high (20000), but this has its own disadvantages.  A larger
queue size will allow the application to stay connected and possibly
recover from whatever is slowing him down.  But the application will then
still have to process all of those traps, sooner or later.  He could be
behind for a very long time.
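
A rough way to see how long "a very long time" can be is to divide the backlog by the rate at which the application gains on the incoming traps. The helper below is a hypothetical back-of-the-envelope sketch; the rates used in the example are invented for illustration and are not measurements from NetView.

```python
def catch_up_seconds(backlog, arrival_rate, processing_rate):
    """Seconds needed to drain `backlog` queued traps.

    Rates are in traps per second. Returns None when the application
    processes traps no faster than they arrive, i.e. it never catches up.
    """
    surplus = processing_rate - arrival_rate
    if surplus <= 0:
        return None
    return backlog / surplus


# Invented rates: 40 traps/s arriving, 50 traps/s processed.
print(catch_up_seconds(2000, 40, 50))    # full default-size queue
print(catch_up_seconds(20000, 40, 50))   # full queue at ten times the default
print(catch_up_seconds(20000, 50, 50))   # never catches up
```

At those rates a full queue of 2000 takes 200 seconds to clear, while a full queue of 20000 takes 2000 seconds, so the larger queue trades a disconnect for a longer period of stale events.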

You can see how all this is working by running the trapd trace.  You can
toggle that on and off from the command line using "trapd -T" and you'll
see the application queues being written to and deleted from.  That will
give you some idea of what normal processing is for you.  The PID of the
external process is given in the trace so you can see who the players are
and who is getting behind before the disconnect occurs.

James Shanks
Level 3 Support  for Tivoli NetView for UNIX and Windows
Tivoli Software / IBM Software Group

"Alan E. Hennis" <Hennis_Alan_E AT cat DOT com>
Sent by: owner-nv-l-digest@lists.us.ibm.com

10/27/2003 11:01 AM
Please respond to nv-l

        To:        nv-l AT lists.us.ibm DOT com
        cc:
        Subject:        [nv-l] T netmon-related Application reached maximum number of outstanding events, disconnecting from trapd.
                                                                         






NV 7.1.3 FP1 RedHat 7.2

Has anyone ever seen this trap?

Mon Oct 27 09:57:50 2003 <none>          T netmon-related Application
reached maximum number of outstanding events, disconnecting from trapd.


Here is the description from trapd.conf

This event is generated by IBM Tivoli NetView when
it detects a fatal error

The data passed with the event are:
  1) ID of application sending the event
  2) Name or IP address
  3) Formatted description of the event


Thanks
Alan E. Hennis
Caterpillar Inc.
Systems+Process Division
309.494.3308
hennis_alan_e AT cat DOT com