nv-l

Re: Ruleset Development: First time, complex MIB

2000-03-03 10:17:19
Subject: Re: Ruleset Development: First time, complex MIB
From: James Shanks <James_Shanks AT TIVOLI DOT COM>
To: nv-l AT lists.tivoli DOT com
Date: Fri, 3 Mar 2000 10:17:19 -0500
Jon -

I will comment, but you may not be pleased with what I have to say.  Please bear
with me and I will try to help.

My first question may seem pedantic.  Have you seen these traps before?  Do you
know for a fact what will be in all these variables you mention?  I have two
reasons for asking.

First, traps do not come from a MIB, though they may be defined in one.  Instead
they come from an agent, and agents do not always do what their MIBs say they
do.   You need to make certain that the variables really do contain what you
think they do and that your matches will work.

Second, I ask because you must also define all of these traps to trapd, or they
will all show up on the NetView operator's screen with the message "NO FMT FOUND
for ...".   But the ruleset will still handle them even if the NetView operators
are puzzled.  You should plan on running mib2trap against the MIB which defines
all these traps, and then running the script output of that to configure trapd.
It will also produce a baroc file for use with TEC (and you can modify that if
you like as well as the slot mappings of the traps themselves).

Next, the processing you have described will be cpu intensive and you say it
will be added to your current TEC forwarding ruleset, yet you do not describe
that.  It will be impossible to tell what the impact will be of adding all this
additional code without knowing what you are currently doing and, just as
important, what your trap rate is.  But my guess is that, like all
programmers (and make no mistake, the ruleset editor is a programming tool), you
will have to write your code first and check its performance later.  You should
plan on running the nvcorrd trace, nvcdebug -d all, after you are done, so that
you can see just how long it takes for a trap to pass through this ruleset (and
indeed all the rulesets you have running).  In the nvcorrd log, after the
tracing is turned on, you will see a line which reads "Received a trap"  and
then there will be many, many lines of processing, and then "Finished with the
trap".  These will be time-stamped, so you will be able to tell about how many
events you can handle per second until you start falling behind.  From the sound
of things it will not be many.
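
To turn those time stamps into a rough rate, something like the following
Python sketch could be used.  Only the two marker strings come from the trace
described above.  The line layout (a "YYYY/MM/DD HH:MM:SS" stamp at the start
of each line) is an assumption for illustration, not the documented nvcorrd
log format, so adjust the pattern and format string to what your log shows.

```python
import re
from datetime import datetime

# ASSUMPTION: each log line starts with a stamp like "2000/03/03 10:17:19".
# Check a real nvcorrd log and adjust both constants to match it.
STAMP_PATTERN = re.compile(r"^(\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2})")
STAMP_FORMAT = "%Y/%m/%d %H:%M:%S"

START_MARKER = "Received a trap"
END_MARKER = "Finished with the trap"


def trap_durations(lines):
    """Return seconds elapsed between each start/end marker pair."""
    durations = []
    started = None
    for line in lines:
        match = STAMP_PATTERN.match(line)
        if not match:
            continue  # skip lines without a recognizable stamp
        stamp = datetime.strptime(match.group(1), STAMP_FORMAT)
        if START_MARKER in line:
            started = stamp
        elif END_MARKER in line and started is not None:
            durations.append((stamp - started).total_seconds())
            started = None
    return durations


def traps_per_second(durations):
    """Rough sustainable rate: one over the mean per-trap time."""
    if not durations:
        return None
    mean = sum(durations) / len(durations)
    return float("inf") if mean == 0 else 1.0 / mean


# Made-up lines in the ASSUMED layout, only to exercise the functions.
# They are not real nvcorrd output.
sample = [
    "2000/03/03 10:17:19 Received a trap",
    "2000/03/03 10:17:19 ...ruleset processing...",
    "2000/03/03 10:17:21 Finished with the trap",
    "2000/03/03 10:17:21 Received a trap",
    "2000/03/03 10:17:23 Finished with the trap",
]
durations = trap_durations(sample)
rate = traps_per_second(durations)
```

With one-second stamps this only gives a coarse figure, so it is worth
averaging over a long stretch of the log rather than a handful of traps.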

 Third, I would advise you to take an incremental approach to implementing this
new ruleset.  I would not start by trying to code up the final product all at
once.  Rather I would code this up for a couple of Up/Down pairs and implement
that, and then test it with snmptrap to make sure it works.  Choose some traps
you can make happen on a test piece of hardware and see how it goes.  Then start
adding more and more traps.  Divide and conquer, Jon.  I have seen too many
rulesets fail because they were too complex and their designers didn't test them
along the way.

 Now, these remarks are actually in answer to your question #5, so let me take a
shot at #1 through #4.

>>1) Should this work as far as forwarding the down traps? We do this already
>>for other event sources, so I have high confidence in that.

I agree.  This sounds like it should work, if I understand you correctly.

>>2) Will the Up traps that arrive later than 10 minutes be forwarded? If not,
>>how's the best way to handle this? I'm really trying to avoid having
>>to code all of these timer requirements in T/EC.

Jon, the Up traps will not be forwarded AT ALL.  You have not written any code
to forward them.  The slot 2 trap in a ruleset is not passed on unless you pass
it on.  It is simply used as a go/no go filter for the trap in slot 1, which is
held in a cache.  You will have to connect the slot 2 Up traps to the forward to
have them be sent, and then yes, all of them, whether they match an event in the
cache or not, will be forwarded.
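
A toy Python model may make the slot behavior easier to see.  This is only a
sketch of the semantics described above, not NetView code: the class name, the
trap dictionaries, the sample values, and the tick method are all hypothetical.
It assumes a held slot 1 trap is released to the forward once its delay expires
without a match.

```python
class ResetOnMatchModel:
    """Toy model: hold slot 1 traps, drop them if a slot 2 trap matches in time."""

    def __init__(self, delay_seconds, match_attrs, forward_slot2=False):
        self.delay = delay_seconds
        self.match_attrs = match_attrs
        self.forward_slot2 = forward_slot2  # models wiring slot 2 to the forward
        self.cache = []       # (arrival_time, trap) for held slot 1 traps
        self.forwarded = []   # what would reach the forward node

    def _key(self, trap):
        # Missing variables become None here.  How the real node treats a
        # missing variable is NOT something this sketch can tell you.
        return tuple(trap["vars"].get(n) for n in self.match_attrs)

    def tick(self, now):
        """Release held slot 1 traps whose delay has expired."""
        still_held = []
        for arrived, trap in self.cache:
            if now - arrived >= self.delay:
                self.forwarded.append(trap)
            else:
                still_held.append((arrived, trap))
        self.cache = still_held

    def slot1(self, now, trap):
        self.tick(now)
        self.cache.append((now, trap))

    def slot2(self, now, trap):
        self.tick(now)
        key = self._key(trap)
        # A matching slot 2 trap cancels the held slot 1 trap.
        self.cache = [(t, held) for t, held in self.cache if self._key(held) != key]
        # The slot 2 trap itself goes nowhere unless explicitly connected.
        if self.forward_slot2:
            self.forwarded.append(trap)


def run(forward_slot2):
    model = ResetOnMatchModel(600, [1, 2, 3, 7, 8], forward_slot2)
    down_a = {"name": "down", "vars": {1: "hostA", 2: "disk", 3: "0"}}
    up_a = {"name": "up", "vars": {1: "hostA", 2: "disk", 3: "0"}}
    down_b = {"name": "down", "vars": {1: "hostB", 2: "fan", 3: "1"}}
    up_b = {"name": "up", "vars": {1: "hostB", 2: "fan", 3: "1"}}
    model.slot1(0, down_a)
    model.slot2(300, up_a)     # within 10 minutes: down_a is cancelled
    model.slot1(1000, down_b)
    model.slot2(1700, up_b)    # 700 s later: down_b was already released
    return [t["name"] for t in model.forwarded]


as_designed = run(forward_slot2=False)        # only the late Down gets out
with_up_connected = run(forward_slot2=True)   # every Up gets out as well
```

In the first run the late Up trap never leaves the ruleset.  In the second run
both Up traps are forwarded, including the one that cancelled its Down.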

>>3) Will reset on match blow up if a trap variable for comparison doesn't
>>exist? The trap pairs have 6, 7, or 8 variables, and I'd rather not
>>have to break it out into separate groups if possible.

Hmmm.  I don't know.  I think this will work but I can't say I have tried it or
seen it done.

>>4) Am I making a bad assumption grouping all of Down traps and Up traps
>>together? My justification is that all the matching attributes should
>>only succeed  for the Up trap, and not the 19 other Up traps should they
>>happen to come in. Am I playing it too fast and loose here?

This is why I asked you if you had actually seen the traps you were coding for.
If you have not, then you cannot be sure what they look like, and it is risky
business trying to code a processing routine for data you haven't seen.  Oh
sure, big projects are often done this way, but then you have to be prepared to
alter one side or the other (the data generating side or the data reception
side) to match if the specs were not right.  Since you cannot control what the
agent puts in the trap, I would suggest you try to force the agent to generate
them before you code for them.  Then you know what you are dealing with.

Good luck.  This is certainly ambitious.

Now I would like to propose what may be a simpler alternative.  If I read this
right, you will still have to add code to TEC to close the down traps which had
no match when the up traps arrive after ten minutes.  So the whole point of this
code in NetView is just to reduce the number of up/down pairs sent to TEC.  But
then I have to ask, is each of these 20 pairs of up/down conditions equally
likely?    If not, then you may be building a very complex NetView ruleset for
little gain.  Suppose that only three of these conditions are likely to occur and
the other 17 are very infrequent?  Why not just let TEC handle the 17 and leave
them out of the ruleset altogether?  I would only add as much code to your
ruleset as was needed to have a significant impact on TEC.  It's the old 80/20
rule -- the last 20 per cent of the conditions you will encounter will take 80
per cent of the code to handle.  Therefore, I advise caution and stepwise
development.
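
One way to decide which pairs earn a place in the ruleset is to count how often
each condition actually fires.  The Python sketch below assumes you already
have a list of observed condition names from some history.  The names, the
counts, and the 80 per cent threshold are illustrative assumptions, not data
from any real system.

```python
from collections import Counter


def pairs_worth_coding(observed, coverage=0.8):
    """Return the most frequent conditions that together reach `coverage` of volume."""
    counts = Counter(observed)
    total = sum(counts.values())
    chosen, covered = [], 0
    for name, count in counts.most_common():
        # Stop as soon as the conditions chosen so far cover enough volume.
        if total == 0 or covered / total >= coverage:
            break
        chosen.append(name)
        covered += count
    return chosen


# Hypothetical history: two frequent conditions and a few rare ones.
history = (
    ["pairA"] * 50 + ["pairB"] * 35 + ["pairC"] * 10 + ["pairD"] * 3 + ["pairE"] * 2
)
selected = pairs_worth_coding(history)
```

Here only pairA and pairB would go into the ruleset, and the rare conditions
would be left for TEC to handle.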

James Shanks
Tivoli (NetView for UNIX and NT) L3 Support



"Austin, Jon (FUSA)" <JonAustin AT FIRSTUSA DOT COM> on 03/03/2000 08:31:24 AM

Please respond to Discussion of IBM NetView and POLYCENTER Manager on NetView
      <NV-L AT UCSBVM.UCSB DOT EDU>

To:   NV-L AT UCSBVM.UCSB DOT EDU
cc:    (bcc: James Shanks/Tivoli Systems)
Subject:  Ruleset Development: First time, complex MIB




I've just subscribed to the list, so if anything I ask shows I'm a clueless
newbie, please tactfully correct me.

I've primarily worked on the Tivoli Framework/DM/TEC side of integrations,
but I'm integrating traps from a MIB end-to-end, from NetView to
Expert Advisor (TSD Problem Management). I've got a product MIB with 106
traps and I don't have control of the SNMP configuration at the
source. I do want to pass through to T/EC 42 of the traps. Of these 42, 2
are to be forwarded immediately, and the rest are pairs of Down/Up
traps for various components. The Down/Up pairs should clear themselves in
NetView if the Up trap arrives in 10 minutes. However, if the Up
trap arrives any later, it should be forwarded to T/EC where it will
downgrade the severity of the Down event and EA Ticket. The MIB is written
with 6 variables that are always present and the occasional 7th or 8th to
precisely identify the component and problem. Variables 1,2,3 can pretty
much uniquely identify a component in conjunction with Variables 7 & 8 when
they're present.

Here's my first idea for the ruleset:

1st Node is Event Attribute for the MIB
  3 Trap settings -
        Settings 1 - all 20 of the Down traps
        Settings 2 - all 20 of the Up traps
        Settings 3 - the 2 traps that should forward immediately
  A Reset on Match with Trap Settings 1 on Slot 1 and Trap Settings 2 on
  Slot 2. The delay is set to 10 minutes, and the attributes to match
  on are 1,2,3,7,8
  The Reset on Match and Trap Settings 3 connect to a Collection Query
  where we test for the origin being in an 'Important Nodes' collection.
  The Query connects to a forwarding
Here are my questions:

1) Should this work as far as forwarding the down traps? We do this already
for other event sources, so I have high confidence in that.
2) Will the Up traps that arrive later than 10 minutes be forwarded? If not,
how's the best way to handle this? I'm really trying to avoid having
to code all of these timer requirements in T/EC.
3) Will reset on match blow up if a trap variable for comparison doesn't
exist? The trap pairs have 6, 7, or 8 variables, and I'd rather not
have to break it out into separate groups if possible.
4) Am I making a bad assumption grouping all of Down traps and Up traps
together? My justification is that all the matching attributes should
only succeed  for the Up trap, and not the 19 other Up traps should they
happen to come in. Am I playing it too fast and loose here?
5) This ruleset will be imported into our current T/EC forwarding ruleset. Is
this an example of well-written code (for function/performance) or have
I missed something?

I would really appreciate any evaluation or expertise that anyone is able to
pass along. I'm on the Tivoli TME 10 Digest and there we have
some acknowledged experts in T/EC (i.e. I.V.Blankenship and Paul Roscoe) who
are very willing to nurture skills in T/EC rule writing that exceed
what's in the Tivoli manuals.  I hope the same exists for NetView....


Jon Austin
Technical Services - Enterprise Management
First USA Bank
Phone: 302-282-3498
Pager: 877-573-9542
JonAustin AT FirstUSA DOT com

