nv-l

Re: [NV-L] TEC Event Forwarding Intermittently Failing on 7.1.5

2006-12-14 18:31:22
Subject: Re: [NV-L] TEC Event Forwarding Intermittently Failing on 7.1.5
From: James Shanks <jshanks AT us.ibm DOT com>
To: Tivoli NetView Discussions <nv-l AT lists.ca.ibm DOT com>
Date: Thu, 14 Dec 2006 18:29:56 -0500

You need to open a problem to TEC and have them help you. Or you can open that problem to NetView and have them get TEC involved. But one way or another it appears to me that you'll need to get a TEC internal trace. You do that by installing what's called an ed_diag_config file. This file is similar to the tecad_nv6k.err file. It lists trace levels and trace points that the TEC internal code will respond to when it executes.

You install that file somewhere, then you add an entry to tecad_nv6k.conf describing its location, and then you restart the adapter. TEC Level 2 will tell you how to do it, but basically, if you put the file in \usr\ov\conf, then you'd add this to tecad_nv6k.conf:
ed_diag_config_file=\usr\ov\conf\ed_diag_config
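
Before restarting the adapter, it can be worth confirming that the entry really is in the conf file as one clean key=value line. The following is only a minimal sketch in Python, not a TEC or NetView tool. The function name diag_config_path is made up for illustration, and it assumes that lines starting with # are comments. The only keyword it relies on is the ed_diag_config_file entry shown above.

```python
def diag_config_path(conf_text, key="ed_diag_config_file"):
    """Return the value configured for `key`, or None if it is not set.

    Assumes a simple key=value file where '#' starts a comment line.
    """
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            return value.strip()
    return None


# Example conf text; the comment line is hypothetical, the entry is the one above.
example = r"""
# tracing for the TEC library code
ed_diag_config_file=\usr\ov\conf\ed_diag_config
"""
print(diag_config_path(example))
```

If this prints None for your real tecad_nv6k.conf, the entry is missing or misspelled, and the TEC library has nothing telling it where the trace settings are.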

Since the code you are tracing is TEC's and not NetView's, it is the TEC people who will have to read the trace.

Let me try to explain why this is and how this works. All TEC event adapters, no matter who writes 'em, work the same way. There's an API described in the TEC Reference manual. First you establish a session with the TEC server by calling tec_create_handle. When you call this routine the TEC library code opens a persistent session to the server. Then your adapter code formats the event and stuffs it into a buffer. Then it calls tec_put_event with the handle it got from the create and the address of that buffer. tec_put_event is another TEC library routine: it's the API to the sending function. If it does not give a bad return code, the event is now out of the adapter's control and the low-level TEC library routines take over. They do the actual sending. If they can't get through, they cache the event. And with the advent of the SCE, they first hand it off to the Java engine, who hands it back when he's done with it, and then they actually try to send it.
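
To make the hand-off concrete, here is a toy model in Python. It is an illustration only: it does not call the real TEC library, and the class ToyTecLibrary, the function adapter_send, and the True/False return value are all invented for this sketch. The real routines are tec_create_handle and tec_put_event as described above, and the SCE step is left out. The point it shows is that the adapter logs success as soon as the library accepts the event, whether or not the event has reached the server.

```python
from collections import deque


class ToyTecLibrary:
    """Stand-in for the TEC library layer. NOT the real API."""

    def __init__(self, server_reachable):
        self.server_reachable = server_reachable
        self.cache = deque()   # events waiting because the server was unreachable
        self.delivered = []    # events that actually reached the (imaginary) server

    def put_event(self, event):
        """Accept an event from the adapter; True means 'accepted', not 'delivered'."""
        if self.server_reachable:
            self.delivered.append(event)
        else:
            self.cache.append(event)
        return True

    def flush_cache(self):
        """Retry cached events once the server can be reached again."""
        while self.cache and self.server_reachable:
            self.delivered.append(self.cache.popleft())


def adapter_send(lib, event, log):
    """What the adapter side sees: it only knows the library's return value."""
    log.append("Sending event to T/EC ...")
    if lib.put_event(event):
        log.append("Event sent to T/EC")
    else:
        log.append("Error sending event")


lib = ToyTecLibrary(server_reachable=False)
log = []
adapter_send(lib, "TEC_ITS_NODE_STATUS;source=NV6K;END", log)
print(log)                                # adapter trace looks successful
print(len(lib.delivered), len(lib.cache))  # yet nothing was delivered
```

In this model the adapter's trace reads the same whether the server is reachable or not, which is why the adapter-side log alone cannot show where an event got stuck.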

Do you see where I'm going with this?

The nvtecad.log has a NetView TECIO trace that says that the events were given to the TEC library code. That's what happens between the "Sending event to T/EC ..." and the "Event sent to T/EC" messages. If we got a bad return code you'd see a different message, other than "Event sent to T/EC". There would be some error or failure message instead. So the trace basically says that NetView gave the events to the TEC library and now it's up to the TEC code to send them along. If they aren't going to the server, then you'll need a TEC trace of some kind for TEC support to tell you why.
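
If the log gets long, pairing each "Sending event to T/EC ..." line with whatever message follows it can be scripted. This is a rough sketch in Python, not a supported tool. The line layout in the regular expression is inferred only from the nvtecad.log excerpt quoted further down in this thread, and the function name summarize and the outcome labels are my own.

```python
import re

# Layout inferred from the quoted excerpt, e.g.
# "Wed Dec 13 15:27:31 2006 LOW: TECIO , err 00, .\tec_io.c line 0276: Sending event to T/EC ..."
TRACE = re.compile(
    r"^\w{3} \w{3} [ \d]\d (?P<time>\d\d:\d\d:\d\d) \d{4} "
    r"\w+: TECIO\s*,.*?line \d+: (?P<msg>.*)$"
)


def summarize(lines):
    """Return (time, event class, outcome) for each 'Sending event' trace entry."""
    results = []
    current = None  # the event currently being handed to the TEC library
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = TRACE.match(line)
        if match is None:
            # Not a trace line: event text, possibly wrapped over several lines.
            if current is not None:
                current["event"] = (current["event"] + " " + line).strip()
            continue
        msg = match.group("msg")
        if msg.startswith("Sending event to T/EC"):
            current = {"time": match.group("time"), "event": "",
                       "outcome": "no outcome logged"}
            results.append(current)
        elif current is not None:
            if msg.startswith("Event sent to T/EC"):
                current["outcome"] = "handed to TEC library"
            elif msg.startswith("Event filtered"):
                current["outcome"] = "filtered, not sent"
            else:
                current["outcome"] = "other: " + msg
            current = None
    return [(r["time"], r["event"].split(";", 1)[0], r["outcome"]) for r in results]


if __name__ == "__main__":
    import sys
    # Usage: python summarize_tecio.py /usr/ov/log/nvtecad.log
    if len(sys.argv) > 1:
        with open(sys.argv[1], errors="replace") as handle:
            for row in summarize(handle):
                print(*row, sep="  ")
```

Anything reported as "handed to TEC library" that never shows up at the server is in TEC's hands, which is the case the TEC trace is meant to explain.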

Best I can do.


James Shanks
Level 3 Support for Tivoli NetView for UNIX and Windows
Network Availability Management
Network Management - Development
Tivoli Software, IBM Corp


          "Cooprider, Eric" <ecooprider AT ycsd.york.va DOT us>
          Sent by: nv-l-bounces AT lists.ca.ibm DOT com

          12/14/2006 04:22 PM
          Please respond to
          Tivoli NetView Discussions <nv-l AT lists.ca.ibm DOT com>


To: <nv-l AT lists.ca.ibm DOT com>
cc:
Subject: [NV-L] TEC Event Forwarding Intermittently Failing on 7.1.5

OK, this got bounced yesterday for size; here is a resend:

Well, here are the results after editing the TECIO section of tecad_nv6k.err to this:

#
# MODULE = TECIO
#
TECIO MINOR /usr/ov/log/nvtecad.log
TECIO MAJOR /usr/ov/log/nvtecad.log
TECIO FATAL /usr/ov/log/nvtecad.log
TECIO LOW /usr/ov/log/nvtecad.log
TECIO NORMAL /usr/ov/log/nvtecad.log
TECIO VERBOSE /usr/ov/log/nvtecad.log

The tecad_nv6k process again jumped to 50% usage after the first TEC event was forwarded and never dropped back down (it has been 10 minutes), so the closing event when the node came back up was never forwarded…

The nvtecad.log file after restarting tecad_nv6k svc:

Wed Dec 13 15:26:26 2006 LOW: TECIO , err 00, .\tec_io.c line 0110: Initializing T/EC interface ...
Wed Dec 13 15:26:26 2006 LOW: TECIO , err 00, .\tec_io.c line 0180: T/EC interface initialization complete
Wed Dec 13 15:27:31 2006 LOW: TECIO , err 00, .\tec_io.c line 0276: Sending event to T/EC ...
TEC_ITS_INTERFACE_STATUS;source=NV6K;sub_source=NET;origin=x.x.x.101;adapter_host=NetviewServer;hostname=TestServerNode;msg='Interface Intel down. CRITICAL';category=2;ifstatus=2;ifname=762;hostaddr=x.x.x.101;nvhostname=y.y.y.30;END
Wed Dec 13 15:27:31 2006 LOW: TECIO , err 00, .\tec_io.c line 0313: Event sent to T/EC
Wed Dec 13 15:27:31 2006 LOW: TECIO , err 00, .\tec_io.c line 0276: Sending event to T/EC ...
TEC_ITS_NODE_STATUS;source=NV6K;sub_source=NET;origin=x.x.x.101;adapter_host=NetviewServer;hostname=TestServerNode;msg='Node Down. ';category=2;nodestatus=2;nvhostname=y.y.y.30;END
Wed Dec 13 15:27:31 2006 LOW: TECIO , err 00, .\tec_io.c line 0313: Event sent to T/EC
Wed Dec 13 15:27:31 2006 LOW: TECIO , err 00, .\tec_io.c line 0276: Sending event to T/EC ...
TEC_ITS_SEGMENT_STATUS;source=NV6K;sub_source=NET;origin=y.y.y.30;adapter_host=NetviewServer;category=2;msg='Segment x.x.Segment1 Marginal.';segstatus=3;nvhostname=y.y.y.30;END
Wed Dec 13 15:27:31 2006 LOW: TECIO , err 00, .\tec_io.c line 0302: Event filtered by tec_put_event()
Wed Dec 13 15:27:31 2006 LOW: TECIO , err 00, .\tec_io.c line 0303: No event sent to T/EC
Wed Dec 13 15:27:31 2006 LOW: TECIO , err 00, .\tec_io.c line 0276: Sending event to T/EC ...
TEC_ITS_NETWORK_STATUS;source=NV6K;sub_source=NET;origin=y.y.y.30;adapter_host=NetviewServer;category=2;msg='Network x.x Marginal.';netstatus=3;nvhostname=y.y.y.30;END
Wed Dec 13 15:27:31 2006 LOW: TECIO , err 00, .\tec_io.c line 0313: Event sent to T/EC

Again, thanks for any help anyone can give; I’m long since out of ideas…

Eric

      -----Original Message-----
      From: nv-l-bounces AT lists.ca.ibm DOT com [mailto:nv-l-bounces AT lists.ca.ibm DOT com] On Behalf Of James Shanks
      Sent: Wednesday, December 13, 2006 1:24 PM
      To: Tivoli NetView Discussions
      Subject: RE: [NV-L] TEC Event Forwarding Intermittently Failing on 7.1.5

      You can edit tecad_nv6k.err to get some tracing. The first thing to do is to route all messages for TECIO to the log file and see what you get after the thing restarts. If NetView isn't sending the event on to the TEC library code, you should notice that here. But if you see the event was sent, then the TEC library has it, and what happens after that is a TEC issue.

      The TEC folks would have to give you a tec_diag_config file (or you can get one from the TEC SDK) and have you run their trace.


      James Shanks
      Level 3 Support for Tivoli NetView for UNIX and Windows
      Network Availability Management
      Network Management - Development
      Tivoli Software, IBM Corp

              _______________________________________________
              NV-L mailing list
              NV-L AT lists.ca.ibm DOT com
              Unsubscribe:NV-L-leave AT lists.ca.ibm DOT com
              http://lists.ca.ibm.com/mailman/listinfo/nv-l (Browser access limited to internal IBM'ers only)
