[nv-l] Ruleset Correlation

2004-05-28 10:29:19
Subject: [nv-l] Ruleset Correlation
From: "Barr, Scott" <Scott_Barr AT csgsystems DOT com>
To: <nv-l AT lists.us.ibm DOT com>
Date: Fri, 28 May 2004 09:08:52 -0500
Greetings - NetView 7.1.3 & Solaris 2.8
 
I am working through some automation performance issues and have observed something disturbing. I have automation that receives SNA mainframe events, parses and formats each trap, and writes it to a log. It also uses snmptrap to generate a pseudo "node down" trap. When a corresponding up event is received for the same SNA device, I use snmptrap to send an "up" event. A second ruleset correlates the up and down events: if the time between the down and the up is less than 10 minutes, the pair gets tossed; otherwise a notification script is called that wakes up the help desk.
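
To make the intent concrete, here is a minimal Python sketch of the correlation logic described above. It only illustrates the behavior I expect, and says nothing about how NetView rulesets or nvcorrd work internally; the names (OutageCorrelator, on_down, on_up, outstanding) are hypothetical and timestamps are plain seconds.

```python
from typing import Dict, List, Optional

THRESHOLD_SECONDS = 10 * 60  # the 10-minute window described above


class OutageCorrelator:
    """Pairs each "down" with the next "up" for the same device (illustrative only)."""

    def __init__(self, threshold: float = THRESHOLD_SECONDS) -> None:
        self.threshold = threshold
        self.pending: Dict[str, float] = {}  # device -> time its "down" arrived
        self.notified: List[str] = []        # devices that would page the help desk

    def on_down(self, device: str, when: float) -> None:
        # Keep the earliest down time if a duplicate down arrives.
        self.pending.setdefault(device, when)

    def on_up(self, device: str, when: float) -> Optional[float]:
        down_at = self.pending.pop(device, None)
        if down_at is None:
            return None  # an up with no outstanding down
        duration = when - down_at
        if duration >= self.threshold:
            # Outage lasted at least 10 minutes: notify.
            self.notified.append(device)
        # Shorter outages are simply dropped.
        return duration

    def outstanding(self) -> List[str]:
        # Devices still waiting for their "up" event.
        return sorted(self.pending)
```

Under this model, 34 downs followed by 34 ups 15 minutes later would produce 34 notifications and leave nothing outstanding.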
 
What disturbs me is the behavior I see when we have a significant outage. In my sample case, 34 SNA devices dropped at one time. When the corresponding up messages occurred, everything worked properly except the notifications. The duration of the outage exceeded the time in the pass on match/reset on match timers, but only 12 up notifications occurred. According to my application log and trapd.log, all 34 "up" events were generated, but the notifications were not. What I am wondering is whether there is a limit to the number of outstanding correlated events, i.e. how many devices can be waiting for a node up? Is it possible that only 12 pairs of node down/ups can be outstanding? Is there a way to look at what events automation still has outstanding (and I'm not sure whether it's nvcorrd, actionsvr or ovactiond that's involved)?
 