RE: [nv-l] 7.1.4 Is the hang in nvtecia or ruleset?
2004-08-31 15:47:59
Drew -
Test fixes are cumulative.
IY60528 is the last fix in a chain.
It supersedes two earlier ones, IY57383 and IY56279, both of which
implement changes to nvserverd and related code. So rather
than force you to get IY57383 and put it on, and then IY56279 and put it
on, and then apply IY60528, the earlier ones were removed and only the
latest and final one is available. That's because once the code is
committed, if you have a problem, I can only go back as far as the last
committed level. So there is no point in providing a fix I cannot
support. So, unless you'd rather wait for 7.1.4 FP02, IY60528 is
what you should put on.
There was no public APAR for the event
hang you have been seeing. It was found in Verification and corrected
internally, and the first fix to actually ship it had an internal number,
130890. That went only to a few select customers who reported
the same problem before we could build an official fix. And since
the code for IY57383 was ready at that time, the internal fix was replaced
by IY57383, but that was soon superseded by IY56279, and that by IY60528.
And the readmes for test fixes only
describe what that specific fix is for. The code is cumulative but
the readme text is not. So IY60528's readme doesn't mention the earlier
problems and enhancements it also contains. As for what IY60528
actually adds to nvserverd, you are reading way too much into the description.
IY60528 doesn't force you to do anything new. It does put
a new entry into the tecint.conf file, which if you uncomment it, will
cause new behavior from nvserverd, but otherwise not. IY56279 did
the same thing. All told, between them there are three new
entries in tecint.conf, all of which are commented out by default, which
you can use or not. And there is a brand-new, completely-rewritten
man page for tecint.conf which explains what those options do. But
whether you choose to use those new options or not, you need an nvserverd
built with an upgraded TEC EEIF library to solve the memory leak / hang
problem.
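To see which entries in a tecint.conf are active and which are still commented out, something like the following minimal sketch could help. It assumes the file uses `Key=Value` lines with `#` marking comments; the option names in the sample are hypothetical placeholders, not the real entries added by IY56279 or IY60528 (the tecint.conf man page is the authority on those).

```python
def split_tecint_entries(text):
    """Split tecint.conf-style text into (active, commented) dicts.

    Assumes 'Key=Value' entries and '#' as the comment marker.
    """
    active, commented = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        target = active
        if line.startswith("#"):
            # Strip the comment marker and treat the rest as a disabled entry.
            line = line.lstrip("#").strip()
            target = commented
        if "=" not in line:
            continue  # ordinary comment text or an unrecognized line
        key, _, value = line.partition("=")
        key = key.strip()
        # Skip prose comments that merely happen to contain '='.
        if key and " " not in key:
            target[key] = value.strip()
    return active, commented


# Hypothetical file contents, for illustration only.
sample = """\
# Options below are placeholders
ExistingOption=value
#NewOptionA=YES
"""

active, commented = split_tecint_entries(sample)
print("active:", active)
print("commented out:", commented)
```

An entry listed under "commented out" has no effect until it is uncommented, which matches the default state of the new options described above.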
I hope this helps.
Feel free to ask more questions.
But don't be afraid of putting on IY60528. Unless things change
in the next 4 weeks, it will be what we ship in FP02.
Verification is running with it on their
systems right now, dozens of them. So you are not the first, and
certainly not alone.
James Shanks
Level 3 Support for Tivoli NetView for UNIX and Windows
Tivoli Software / IBM Software Group
"Van Order, Drew (US - Hermitage)" <dvanorder AT deloitte DOT com>
Sent by: owner-nv-l AT lists.us.ibm DOT com
08/31/2004 02:51 PM
To: <nv-l AT lists.us.ibm DOT com>
Subject: RE: [nv-l] 7.1.4 Is the hang in nvtecia or ruleset?
Hope you're out there James--
I find myself stuck in the middle
between what you have stated and what the current support engineer is telling
me. I honestly think what you are telling me is more accurate, i.e. there
is an issue with EEIF and nvserverd that this efix resolves, even though
the abstract for the efix states it only enables NV by default sending
severity info to TEC via tecint.conf changes. I've asked support for clarification
but figure I can get an answer faster here.
I understand completely if you
are uncomfortable commenting on the current situation; my concern is putting
in a pretty big change that doesn't address the core issue of events hanging
up in the interface. If NV severities are not set and ready to go for
ALL traps feeding TEC_ITS, you are in for a surprise, as it turns that on
and baroc severity slots are ignored. The abstract does not state that
if you don't tweak tecint.conf immediately, events will not make it through
TEC.
Thank you!--Drew
-----Original Message-----
From: owner-nv-l AT lists.us.ibm DOT com [mailto:owner-nv-l AT lists.us.ibm DOT com]
On Behalf Of James Shanks
Sent: Monday, August 30, 2004 7:40 AM
To: nv-l AT lists.us.ibm DOT com
Subject: Re: [nv-l] 7.1.4 Is the hang in nvtecia or ruleset?
Drew,
Do you have any post-FixPack1 maintenance applied? There was a serious
problem with the TEC EEIF library uncovered in Verification after 7.1.4
FixPack1 went out the door. It had a memory leak which caused
it to hang after some extended period of time. TEC has since fixed
the problem. Every post-7.1.4 FP01 test fix which shipped nvserverd,
and there were several, has the new TEC EEIF library linked in. If
you haven't applied anything yet, I would just download the latest and
greatest, IY60528, and install that until FP02 is available at the end
of the quarter.
Also, I would stay away from using the nvtecia command for the time being,
and just use ovstop/ovstart for nvserverd (Sorry). The command "nvtecia
-stop" seems to work OK, but "nvtecia -reload" is
hanging or failing to complete on most platforms. There seems to
be a re-initialization problem with State Correlation Engine after you
stop it, but don't destroy the process. I have some TEC guys looking
at that now, so stay tuned.
HTH
James Shanks
Level 3 Support for Tivoli NetView for UNIX and Windows
Tivoli Software / IBM Software Group
"Van Order, Drew (US - Hermitage)" <dvanorder AT deloitte DOT com>
Sent by: owner-nv-l AT lists.us.ibm DOT com
08/29/2004 09:34 AM
To: <nv-l AT lists.us.ibm DOT com>
Subject: [nv-l] 7.1.4 Is the hang in nvtecia or ruleset?
Hi all,
I'm back with another NV to TEC integration question/problem. Everything
is centered around TEC in our organization; everything we tell NV to act
upon is designed to generate a TEC event. We are seeing intermittent delays
in receiving TEC events through our TEC_ITS ruleset. Sometimes it will
resolve itself within an hour, other times a full daemon cycle is needed,
and NV is fine afterwards. The ruleset in use is the core TEC_ITS ruleset
provided with 7.1.4 and we have added 8 trap settings nodes to generate
events from our routers/switches/eventually Compaq Insight Agents. We have
no timer or decision-making nodes; it's very basic.
The issue appears to be in ruleset processing or TEC forwarding. I say
this because when you ovstop/ovstart, events from the ruleset suddenly
appear at TEC. There are no trap storms occurring when these delays hit.
We are getting ready to manage considerably more devices and will have
to add even more trap nodes to TEC_ITS.rls, so we've got to get a handle
on why this is occurring. Any troubleshooting ideas or places where we
may have a configuration problem?
Thanks much--Drew
This message (including any attachments) contains confidential
information intended for a specific individual and purpose, and is protected
by law. If you are not the intended recipient, you should delete this message.
Any disclosure, copying, or distribution of this message, or the taking
of any action based on it, is strictly prohibited.