I am going to assume that the rulesets you have in the 6.0.3 system are
identical to those in the 7.1 system. If not, all bets are off.
The nvserverd "not running" message is the result of a timeout. If
nvevents, the event window, cannot talk to the nvserverd daemon, he issues
that message. If ovstatus shows the daemon up, and ps -ef does too, then
he is stalled doing something else. If you are forwarding to TEC, the
thing to do might be to check and see if your TEC server has gone down,
since nvserverd is the guy who forwards to TEC when you use the internal
adapter. I would also check the /etc/Tivoli/tec/cache file and see if it
is growing. If it is, then that means we cannot contact the TEC server
for some reason, and nvserverd is having to try to reconnect with him on
every event it gets, vastly slowing things down.
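
As a rough illustration of the "see if it is growing" check, here is a minimal Python sketch that samples a file's size a few times and reports whether it increased. The function names, the sampling interval and the sample count are my own choices for illustration, not anything NetView or TEC provides; only the cache path comes from the message above.

```python
import os
import time


def file_size(path):
    """Return the size of path in bytes, or 0 if it does not exist."""
    try:
        return os.stat(path).st_size
    except FileNotFoundError:
        return 0


def is_growing(path, interval=30.0, samples=3, sleep=time.sleep):
    """Sample the file size `samples` times, `interval` seconds apart.

    Returns True if the size increased between any two consecutive samples.
    The `sleep` parameter exists only so the wait can be replaced in tests.
    """
    previous = file_size(path)
    grew = False
    for _ in range(samples - 1):
        sleep(interval)
        current = file_size(path)
        if current > previous:
            grew = True
        previous = current
    return grew


# Example call (path taken from the message above, timing is an assumption):
#   is_growing("/etc/Tivoli/tec/cache", interval=30.0, samples=3)
```

A True result would suggest that events are being cached rather than delivered, which is the situation described above where nvserverd keeps trying to reconnect to the TEC server.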
James Shanks
Level 3 Support for Tivoli NetView for UNIX and NT
Tivoli Software / IBM Software Group
Jorge Jiles <Jorge.Jiles AT ualberta DOT ca>
Sent by: owner-nv-l AT tkg DOT com
12/14/2001 10:50 AM
Please respond to IBM NetView Discussion
To: IBM NetView Discussion <nv-l AT tkg DOT com>
cc:
Subject: RE: [NV-L] Nvcorrd error checking
I have seen the same problem on my system: NetView 7.1 on Solaris 8. What
is really weird is that at times I also get the message that nvserverd is
not running when, according to ovstatus and the logs, it is working OK. The
only way to get the events (and the scripts called by rulesets) going again
is to stop and start nvcorrd and nvserverd. More than once I have had to stop
all the daemons and restart them. If I find any explanation for this, I will
let you know.
I don't think the rulesets are the problem, as the same ones are running in
a production environment on AIX with NetView 6.02, and they work properly.
At 01:55 PM 12/14/2001 +1100, you wrote:
>Thanks for the tip James, I will look into it.
>
>We are running 24X7 for some boxes, so I am thinking of some sort of
>heartbeat to check trapd, nvcorrd and nvactiond, e.g. (wsnmptrap to NV -->
>script to postemsg to TEC --> TEC touches a local file --> script to check
>touch time of file)
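
The last step of that chain (a script that checks the touch time of the file) could look roughly like the Python sketch below. It only covers the freshness check; the trap, the ruleset action and the TEC rule that touches the file are not shown. The function names and the idea of passing a maximum age in seconds are assumptions made for illustration.

```python
import os
import time


def heartbeat_age(path, now=None):
    """Seconds since path was last modified, or None if it is missing."""
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        return None
    if now is None:
        now = time.time()
    return now - mtime


def heartbeat_ok(path, max_age, now=None):
    """True if the heartbeat file exists and was touched within max_age seconds."""
    age = heartbeat_age(path, now)
    return age is not None and age <= max_age


# Example call (the file name and threshold are hypothetical):
#   heartbeat_ok("/tmp/nv_heartbeat", max_age=600)
```

If the heartbeat trap is sent every few minutes, a max_age of two or three intervals leaves room for one missed cycle before the check reports a failure.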
>
>How do NetView experts out there check their daemons? Thanks!
>
>Regards,
>
>Jack
>
>-----Original Message-----
>From: James Shanks [mailto:jshanks AT us.ibm DOT com]
>Sent: Friday, 14 December 2001 2:21 p.m.
>To: IBM NetView Discussion
>Subject: Re: [NV-L] Nvcorrd error checking
>
>
>Jack -
>
>Try looking in the logs. nvcorrd writes to an alog and a blog in
>/usr/OV/log. Errors are always written there. nvcorrd always starts in the
>alog, writes 1000 lines, switches to the blog, writes another 1000, and
>switches back. When there is an action to be run, he hands that off to
>actionsvr, who also has a pair of logs, nvaction.alog and blog, that
>work the same way. If you still don't see anything, then you can turn on
>tracing, using the command "nvcdebug -d all". There are man pages on all
>this stuff, as well as lengthy discussions in the Admin Guide about how it
>works.
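
Because the daemon alternates between two log files, a script that wants the latest entries first has to work out which of the pair is current. The Python sketch below picks whichever file was modified most recently and returns its last few lines. The file names nvcorrd.alog and nvcorrd.blog are an assumption based on the nvaction.alog naming mentioned above; check /usr/OV/log for the actual names on your system.

```python
import os
from collections import deque


def active_log(log_dir, base):
    """Return whichever of base.alog / base.blog was modified most recently.

    Returns None if neither file exists.
    """
    candidates = [os.path.join(log_dir, base + ext) for ext in (".alog", ".blog")]
    existing = [p for p in candidates if os.path.exists(p)]
    if not existing:
        return None
    return max(existing, key=os.path.getmtime)


def tail(path, count=20):
    """Return the last `count` lines of a text file, without newlines."""
    with open(path, errors="replace") as f:
        return [line.rstrip("\n") for line in deque(f, maxlen=count)]


# Example call (the base name "nvcorrd" is an assumption):
#   log = active_log("/usr/OV/log", "nvcorrd")
#   if log is not None:
#       print("\n".join(tail(log)))
```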
>
>If you read the NetView Diagnosis Guide, you may find more hints. There
>you will learn that "Well-behaved" does not mean that the daemon is working.
>It is a static condition which reflects how it was built, not whether it is
>running correctly at this particular time. A well-behaved daemon goes down
>when you do ovstop. One that is "non-well-behaved" stays up even after the
>others go away.
>
>James Shanks
>Level 3 Support for Tivoli NetView for UNIX and NT
>Tivoli Software / IBM Software Group
>
>"Chan, Jack"
>Sent by: owner-nv-l AT tkg DOT com
>12/13/01 06:45 PM
>Please respond to IBM NetView Discussion
>To: "'IBM NetView Discussion'" <nv-l AT tkg DOT com>
>cc:
>Subject: [NV-L] Nvcorrd error checking
>
>Hello List,
>
>I am having a problem with the nvcorrd daemon. The problem is as follows:
>
>I have a NV rule to execute a script upon receiving a trap.
>I checked the trapd.log for the trap, it is there, but the script did not
>execute.
>
>ovstatus shows nvcorrd (and all the daemons) are RUNNING and well behaved.
>Another symptom I see is that the control desktop is not updating (through
>Exceed and the Linux console as well). After I ovstop and ovstart, the
>script executes again.
>
>I have a DM profile to check that the daemons are up, and scripts that do
>ovstatus |grep RUNNING and ovstatus |grep OVs_WELL_BEHAVED. But neither of
>these checks picks up that nvcorrd is not working as it is supposed to
>(because it is still reported as RUNNING and well behaved).
>
>How can I check that nvcorrd is REALLY running? Some sort of heartbeat using
>a ruleset, maybe?
>
>regards,
>
>Jack.
>
>
>_________________________________________________________________________
>NV-L List information and Archives: http://www.tkg.com/nv-l
>
Jorge A Jiles
Network Analyst
Computing & Network Services
University of Alberta
Edmonton, Alberta
Canada
_________________________________________________________________________
NV-L List information and Archives: http://www.tkg.com/nv-l