I'm curious: since you're holding your node downs for 14 minutes, are you
still doing 3 retries before considering an IP to be down?
Also, what's the delay on your name resolution like? You've got thousands
of those to do in a 5-minute poll cycle, right?
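If you want a rough number for that, something along these lines would do. It is only a sketch: the function name time_lookups, the RESOLVER variable and the one-name-per-line input file are my own inventions, and it assumes "host" is the lookup command on your box (swap in whatever resolver command you actually use). Timing is in whole seconds, so feed it a few hundred names to get a meaningful figure.

```shell
#!/bin/sh
# Rough timing of name resolution for a list of hosts.
# time_lookups and RESOLVER are illustrative names, not part of NetView.
# RESOLVER must be a command that takes the name to look up as its
# last argument; "host" is assumed as the default.

time_lookups() {
    # $1 = file with one hostname or IP address per line
    resolver=${RESOLVER:-host}
    count=0
    failed=0
    start=$(date +%s)
    while read -r name; do
        # skip blank lines
        [ -z "$name" ] && continue
        count=$((count + 1))
        # count lookups that return a non-zero exit status
        $resolver "$name" >/dev/null 2>&1 || failed=$((failed + 1))
    done < "$1"
    end=$(date +%s)
    echo "lookups=$count failed=$failed seconds=$((end - start))"
}
```

Run it as "time_lookups /tmp/names.txt" against a sample of the names netmon has to resolve, then divide the seconds by the lookup count to get an average per-lookup delay.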
"Bursik, Scott {PBSG}"
<Scott.Bursik AT pbsg DOT com>
Sent by: owner-nv-l AT lists.us.ibm DOT com
03/15/2006 11:35 AM
Please respond to: nv-l AT lists.us.ibm DOT com
To: <nv-l AT lists.us.ibm DOT com>
cc:
Subject: RE: [nv-l] Ping is falling behind
My CPU is running pretty high constantly. Here is what I see in topas:
Name       PID    CPU%  PgSp   Owner
nvcorrd    17328  32.5    9.6  root
netmon     54682  30.5  151.1  root
ipmap      45592  28.8  245.6  root
ovwdb      26948   4.5  310.6  root
ovw_binar  43258   1.5  319.9  root
Scott Bursik
From: owner-nv-l AT lists.us.ibm DOT com
[mailto:owner-nv-l AT lists.us.ibm DOT com] On Behalf Of Bursik, Scott
{PBSG}
Sent: Wednesday, March 15, 2006 10:28 AM
To: nv-l AT lists.us.ibm DOT com
Subject: RE: [nv-l] Ping is falling behind
Thanks for the replies.
I don’t see anything obvious here. Do you?
ovwdb cache size?
[pbsxsn00001][/usr/OV/bin]>ps -ef | grep ovwdb | grep -v grep
root 26948 32324 4 Mar 13 - 192:07 /usr/OV/bin/ovwdb -O -n130000 -t
ovobjprint -S
Number of objects defined in the database: 93939
Total number of fields defined in the database is: 306.
Total number of field values in the database: 1958450
Number of Integer fields: 511561.
Number of Boolean fields: 754239.
Number of String fields: 521397.
Number of Enum fields: 171253.
netmon status polling by ICMP or SNMP or both?
ICMP Polling only
[pbsxsn00001][/usr/OV/conf]>/usr/OV/bin/ovtopodump -l|grep ^NUMBER
NUMBER OF NETWORKS: 11118
NUMBER OF SEGMENTS: 12594
NUMBER OF NODES: 22457
NUMBER OF INTERFACES: 37879
NUMBER OF GATEWAYS: 1337
[pbsxsn00001][/usr/OV/bin]>/usr/OV/bin/nvUtil e "('isNode' = 'TRUE') && ('IP Status' = 'Unmanaged')"|wc -l
9639
Scott Bursik
Enterprise Systems Management
PepsiCo Business Solutions Group
(972) 963-1400
scott.bursik AT pbsg DOT com
From: owner-nv-l AT lists.us.ibm DOT com
[mailto:owner-nv-l AT lists.us.ibm DOT com] On Behalf Of SCHIFFINGER Ralph
Sent: Wednesday, March 15, 2006 9:31 AM
To: 'nv-l AT lists.us.ibm DOT com'
Subject: AW: [nv-l] Ping is falling behind
ovwdb cache size?
netmon status polling by ICMP or SNMP or both?
let's compare some data ...
/usr/OV/bin/ovobjprint -S
Number of objects defined in the database: 51039
Total number of fields defined in the database is: 554.
Total number of field values in the database: 964960
Number of Integer fields: 285226.
Number of Boolean fields: 363786.
Number of String fields: 238995.
Number of Enum fields: 76953.
ps -ef | grep ovwdb | grep -v grep
root 3448850 4296824 0 Mar 14 - 5:28 /usr/OV/bin/ovwdb -O -n180000 -t
/usr/OV/bin/ovtopodump -l|grep ^NUMBER
NUMBER OF NETWORKS: 11235
NUMBER OF SEGMENTS: 10668
NUMBER OF NODES: 3682
NUMBER OF INTERFACES: 19857
NUMBER OF GATEWAYS: 2179
/usr/OV/bin/nvUtil e "('isNode' = 'TRUE') && ('IP Status' = 'Unmanaged')"|wc -l
19
HTH Ralph S.
http://www.it-austria.com/
-----Original Message-----
From: Bursik, Scott {PBSG} [mailto:Scott.Bursik AT pbsg DOT com]
Sent: Wednesday, March 15, 2006 16:14
To: nv-l AT lists.us.ibm DOT com
Subject: [nv-l] Ping is falling behind
NetView 7.1.4 AIX 5.2
* I am using the script below to check NetView's ping status, and at
times I am seeing numbers around 30,000. Nodes that go down for a reboot
and come back up within about 5 minutes are still producing node down
events. I have a timer rule in place that holds a node down event for
14 minutes. If a node up event arrives in that window, the node down is
dropped. If the 14 minutes expire with no node up event, the node down
event is sent.
* The nodes are coming back well before the 14 minutes are up, yet we
are still getting the node down. I believe NetView falling behind in
polling is causing this. Are there any suggestions for fixing the
polling backlog?
* Within the last year SNMP was enabled on our workstation machines,
resulting in about 7000 new objects being discovered by NetView. Since
they have the same OID as the Windows servers, they are brought into
discovery. I want to exclude them from discovery by name. They all have
a common set of characters in the name. The naming convention is
***wu****** or ***WU******, always with the same number of characters.
If I read the documentation in the seed file correctly, I could use this
logic, correct?
!???wu??????
!???WU??????
Am I interpreting the question marks correctly from the documentation?
# Negative Entries
# e.g. !10.1.1.2               Specific entry
#      !10.*.1.1-100           Ranges using * or -
#      !router*.tivoli.com     Wildcards using * and ?(single char)
#      !@oid 1.3.6.1.4.1.9.*   Wildcards as final char using *
#                              (Note space after the prefix "@oid ")
#      !@oid 0                 This entry will filter out all
#                              non-SNMP supported devices
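One way to sanity-check the patterns before putting them in the seed file is to run them against a few real names with a shell "case" statement, where ? also matches exactly one character. This is only a sketch: is_workstation is a made-up helper name, the sample hostnames are invented, and it assumes netmon's ? behaves like the shell's, as the seed file comment suggests. It also assumes the names are compared as short names; if NetView sees fully qualified names, the patterns would need a trailing part for the domain.

```shell
#!/bin/sh
# Hypothetical helper: does a name fit the workstation naming convention
# (three characters, then "wu" or "WU", then six characters)?
is_workstation() {
    case "$1" in
        ???wu??????|???WU??????) return 0 ;;   # would be excluded
        *)                       return 1 ;;   # would still be discovered
    esac
}

# Example with invented names:
for name in dalwu123456 dalWU123456 dalsv123456 dalwu12345; do
    if is_workstation "$name"; then
        echo "$name: matches"
    else
        echo "$name: no match"
    fi
done
```

The last name has only five characters after "wu", so it does not match. A pattern that is one question mark short would miss every workstation in the same way.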
* I have unmanaged as many devices as I can that aren't being monitored
at the moment and have restricted discovery to several OIDs. My DNS is
on another machine, but on the same local subnet. Has anyone here written
a script to test the speed of resolution from DNS?
Here is the result of the ovobjprint -S command:
ovobjprint -S
Number of objects defined in the database: 93939
Total number of fields defined in the database is: 306.
Total number of field values in the database: 1958450
Number of Integer fields: 511561.
Number of Boolean fields: 754239.
Number of String fields: 521397.
Number of Enum fields: 171253.
##############################script##############################################
#!/bin/ksh
#set -x
# Empty the trace file so only fresh output is counted
cat /dev/null > /usr/OV/log/netmon.trace
# Ask the running netmon to write its status-poll list to the trace file
/usr/OV/bin/netmon -a 12
sleep 3
if [ -f /usr/OV/log/netmon.trace ]; then
    # Pattern is quoted so the shell does not expand it as a filename glob
    echo "Netmon is $(grep -c '[-].*[:]' /usr/OV/log/netmon.trace) behind in status pinging"
else
    echo "Netmon is too busy to report now. Try later."
fi
exit
##############################script##############################################
I know this is a lot of information
and I appreciate any and all
feedback I get.
Thank you,
Scott Bursik
PepsiCo