Re: [nv-l] netfmt

2004-05-10 16:26:42
Subject: Re: [nv-l] netfmt
From: Mahesh Tailor <mahesh.tailor AT network.carilion DOT com>
To: NetView User List <nv-l AT lists.us.ibm DOT com>
Date: 10 May 2004 16:13:14 -0400
Hi, James!

Here's the output of my ps -ef:

root@netview [/usr/OV/log] # ps -ef | grep netfmt
root     28067     1  0 15:45 ?        00:00:00 netfmt -CF
root     23113     1  0 15:48 ?        00:00:00 netfmt -CF
root     23748     1  0 15:48 ?        00:00:00 netfmt -CF
root     24536     1  0 15:48 ?        00:00:00 netfmt -CF
root      2472  2471  0 15:49 ?        00:00:00 netfmt -CF
root      8020  9132  0 15:53 pts/0    00:00:00 grep netfmt
root@netview [/usr/OV/log] # ps -ef | grep 2471
root      2471     1  0 15:49 ?        00:00:00 /usr/OV/bin/ntl_reader 0 1 1 1 1
root      2472  2471  0 15:49 ?        00:00:00 netfmt -CF
root      8018  9132  0 15:53 pts/0    00:00:00 grep 2471

And these have all started since I had to restart my machine 50 minutes ago.

I ran nettl -stop, and the netfmt processes whose parent is PID 1 were
still running, so I killed them and then restarted nettl.
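
For anyone chasing the same thing, here is a minimal sketch for picking out
the netfmt processes that have been re-parented to PID 1. It assumes the
Linux ps -ef column layout shown above (UID PID PPID C STIME TTY TIME CMD);
the function name orphaned_netfmt_pids is made up for illustration and is
not part of NetView.

```shell
# Print the PIDs of netfmt processes whose parent is PID 1.
# Reads `ps -ef`-style lines on stdin; assumes the command name is field 8.
orphaned_netfmt_pids() {
    awk '{
        # Take the basename of the command so /usr/OV/bin/netfmt also matches.
        n = split($8, parts, "/")
        if ($3 == 1 && parts[n] == "netfmt") print $2
    }'
}
```

Typical use would be "ps -ef | orphaned_netfmt_pids" after nettl -stop.
Review the list before passing any of those PIDs to kill.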

Here are some of the nettl log messages:


************************************ NetView *******************************@#%
  Timestamp            : Mon May 10 2004 10:06:07.308834
  Process ID           : 9774               Subsystem        : SECURITY
  User ID ( UID )      : 0                  Log Class        : ERROR
  Device ID            : -1                 Path ID          : -1
  Connection ID        : -1                 Log Instance     : 0
                                                                                
                                                                       
  Software             : /usr/OV/bin/ovw
  Hostname             : netview.carilion.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
OVwUserSecurity() error 4 on waitpid
                                                                                
                                                                       
************************************ NetView *******************************@#%
  Timestamp            : Mon May 10 2004 15:08:45.118009
  Process ID           : 1609               Subsystem        : OVW
  User ID ( UID )      : 0                  Log Class        : ERROR
  Device ID            : -1                 Path ID          : -1
  Connection ID        : -1                 Log Instance     : 0
                                                                                
                                                                       
  Software             : /usr/OV/bin/ipmap
  Hostname             : netview.carilion.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
IPMap error in symbolMgr::flushSymbols - OVwCreateSymbols - (OVwError = 80): Object not found.
                                                                                
                                                                       
************************************ NetView *******************************@#%
  Timestamp            : Mon May 10 2004 15:08:45.118101
  Process ID           : 1609               Subsystem        : OVW
  User ID ( UID )      : 0                  Log Class        : ERROR
  Device ID            : -1                 Path ID          : -1
  Connection ID        : -1                 Log Instance     : 0
                                                                                
                                                                       
  Software             : /usr/OV/bin/ipmap
  Hostname             : netview.carilion.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Failed to create symbol: 172.23.6.25.  OVwError =80: Object not found.
                                                                                
                                                                       
************************************ NetView *******************************@#%
  Timestamp            : Mon May 10 2004 15:08:45.118763
  Process ID           : 1609               Subsystem        : OVW
  User ID ( UID )      : 0                  Log Class        : ERROR
  Device ID            : -1                 Path ID          : -1
  Connection ID        : -1                 Log Instance     : 0
                                                                                
                                                                       
  Software             : /usr/OV/bin/ipmap
  Hostname             : netview.carilion.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
IPMap error in symbolMgr::flushSymbols - OVwCreateSymbols - (OVwError = 80): Object not found.
                                                                                
                                                                       
************************************ NetView *******************************@#%
  Timestamp            : Mon May 10 2004 15:08:45.118822
  Process ID           : 1609               Subsystem        : OVW
  User ID ( UID )      : 0                  Log Class        : ERROR
  Device ID            : -1                 Path ID          : -1
  Connection ID        : -1                 Log Instance     : 0
                                                                                
                                                                       
  Software             : /usr/OV/bin/ipmap
  Hostname             : netview.carilion.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Failed to create symbol: 10.10.10.10.  OVwError =80: Object not found.
                                                                                
                                                                       
************************************ NetView *******************************@#%
  Timestamp            : Mon May 10 2004 15:17:38.349803
  Process ID           : 1394               Subsystem        : OVS
  User ID ( UID )      : 0                  Log Class        : ERROR
  Device ID            : -1                 Path ID          : -1
  Connection ID        : -1                 Log Instance     : 0
                                                                                
                                                                       
  Software             : /usr/OV/bin/ovspmd
  Hostname             : netview.carilion.com
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Object manager kronos.carilion.com is not registered.  See ovaddobj(1m).
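
With longer formatted logs it can help to see which program is producing
most of the records. The following is only a sketch: it assumes the
"Software : <path>" line layout shown in the records above, and the
function name count_by_software is hypothetical.

```shell
# Count formatted nettl records per reporting program.
# Reads netfmt-formatted text on stdin and prints "<count> <program>",
# highest count first.
count_by_software() {
    awk '$1 == "Software" && $2 == ":" { n[$3]++ }
         END { for (s in n) print n[s], s }' | sort -rn
}
```

For example: count_by_software < formatted.LOG00, where formatted.LOG00 is
a file produced with netfmt -f as described in the quoted steps below.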


kronos.carilion.com is 10.10.10.10, which is a Win2K cluster address, and
it is excluded as !kronos.carilion.com in netmon.seed.

If you see something obvious, could you please drop me a reply?  If not, I
will submit a PMR.

Thanks.

Mahesh



On Mon, 2004-05-10 at 15:41, James Shanks wrote:
> Well, I don't have a clue what is wrong, but on Linux, it is the nettl
> process itself which spawns the netfmt -CF.  But only one of those is
> spawned on my system and it stays active only so long as nettl is
> active.  When I do a  "/usr/OV/bin/nettl -stop"  both nettl and the
> netfmt go away.
> 
> You should be able to chase ownership of the process via ps -ef.   Who
> is | are the parents of these rogue netfmts?  Your current nettl or
> some other long gone?  What happens when or if you do nettl -stop?  
> Once the main nettl goes away, you should be able to kill those netfmt
> processes with impunity, though that will not tell you why they are
> being created.  But you can stop and restart nettl any time you wish.
> Normally it is just started once and keeps running until stopped.   If
> you stop nettl and kill all the remaining netfmts, if any, and then
> restart nettl with nettl -start, try looking with "ps -ef  |grep
> netfmt".  How many do you see? Should be just one.  Try looking again
> every few minutes.
> 
> Offhand I see nothing in your status that looks out of line.  Where
> would you look for a source of the problem?  Well, I'm not sure, since
> I've never seen anything like this before, but here's what I'd do:
> (1) /usr/OV/bin/nettl -stop
> (2) ps -ef  | grep netfmt.  Kill any you find.
> (3) cd /usr/OV/log
> (4) ls nettl*    and see how many you have, just nettl.LOG00 or also
> nettl.LOG01
> (5) for each nettl.LOG0n you have, issue
>         /usr/OV/bin/netfmt -f  nettl.LOG0n  >  formatted.LOG0n 
>         This creates ASCII files you can read.  
> (6) Look in the formatted logs for interesting error messages
> (7) Call Support with what you find.
> 
> James Shanks
> Level 3 Support  for Tivoli NetView for UNIX and Windows
> Tivoli Software / IBM Software Group
> 
> 
> Mahesh Tailor <mahesh.tailor AT network.carilion DOT com>
> Sent by: owner-nv-l AT lists.us.ibm DOT com
> 05/10/2004 03:01 PM
> Please respond to: nv-l
> To: NetView User List <nv-l AT lists.us.ibm DOT com>
> cc:
> Subject: [nv-l] netfmt
> 
> Hi!
> 
> Running NetView 7.1.3 fp 2 on RedHat Linux AS 2.1.
> 
> I am having a problem with hundreds of netfmt -CF processes running and
> eventually disabling the system because of too many open files [the
> system default for open files has been set to 32K].  How can I figure
> out what is causing all these processes to start?  Here's my nettl
> status output:
> 
> Logging Information:
> Log Filename:                   /usr/OV/log/nettl.LOG0x
> User's ID:              0       Buffer Size:            8192
> Messages Dropped:       0       Messages Queued:        0
> 
> Subsystem Name:   Log Class:
> NON_IP                                  ERROR  DISASTER
> DISTMAN                        WARNING  ERROR  DISASTER
> SECURITY                       WARNING  ERROR  DISASTER
> COLLECTION                     WARNING  ERROR  DISASTER
> SNMP                                    ERROR  DISASTER
> CMOT                                    ERROR  DISASTER
> OVE                                     ERROR  DISASTER
> OVC                                     ERROR  DISASTER
> OVW                                     ERROR  DISASTER
> OVD                                     ERROR  DISASTER
> OVS               INFORMATIVE           ERROR  DISASTER
> OVCAPI                                  ERROR  DISASTER
> OVEXTERNAL                              ERROR  DISASTER
> OVWAPI                                  ERROR  DISASTER
> TEST_ID_1                                      DISASTER
> TEST_ID_2                                      DISASTER
> FORMATTER                                      DISASTER
> 
> 
> Tracing Information:
> 
> Trace Filename:
> No Subsystems Active
> 
> 
> In addition to NetView the server also has the following running:
> 
> - MySQL DB
> - Apache w/PHP and Perl.
> - Some ksh scripts that run /usr/OV/bin/nvUtil on various smartsets
>   once every 30 minutes.
> 
> That is essentially it.
> 
> Also, what does the netfmt -C option do?  It is not in the man page.
> 
> Thanks.
> 
> Mahesh
-- 
Mahesh Tailor
WAN/TSM/NetView Administrator
Carilion Health System
Information Services
37 Reserve Avenue
Roanoke, VA 24016
Phone: 540.224.3929
Fax: 540.224.3954


