Sorry, I didn't see this earlier.
Sun has always had problems with their DNS support. Anyone who used SunOS
4 or early versions of Solaris 2 knows what a pleasure it was to get their
systems to use DNS. Sun basically assumed for a long while that everyone
would want to use NIS.
I'm not sure whether it was a gethostbyname or gethostbyaddr call that
caused us problems. The actual exception (which we found using truss) was
caused by /etc/.name_service_door. Solaris 7 and 8 (and maybe 2.6) have
this new thing called a "door" for services. Doors can be identified by a
"D" (not a "d") when doing an ls -l. Anyway, this door is managed/used by
nscd, which is how we got to the root cause: nscd was not running.
Recently, we ran into a second problem where the conf file for nscd was not
configured correctly for caching host names. The truss output was very
similar.
To answer why nscd was turned off, it was driven by some security people
using the "turn off all unnecessary services" philosophy. And as to
whether it has caused us other problems, perhaps it has. The null pointer
return was something we read about on the Sun managers list, and may not be
totally accurate.
-----Original Message-----
From: James Shanks [mailto:SHANKS AT us.tivoli DOT com]
Sent: Friday, July 20, 2001 9:52 AM
To: 'IBM NetView Discussion'
Subject: RE: [NV-L] TIPN Inventory - NV Nodes Problem
Chris -
I'm curious, and like the cat, that will probably get me killed someday :-)
But I just have to ask about this. I don't work on TIPN nor know very much
about it, but my real ignorance is the internals of Solaris.
On AIX, where TIPN was born, "gethostbyname" is an operating system call,
there is no daemon or service involved, and it never, ever returns a null
pointer so far as I know. So what is this nscd service that you turned
off, and why did you do that? I am curious because it seems to me that the
TIPN guys could just say, "Well, sorry, but we can't run effectively in
that environment, so what you are doing is not supported."
The reason I ask is that NetView proper, especially netmon and the
event processing daemons (trapd, nvcorrd, nvserverd, actionsvr), does
"gethostbyname" all over the place. If that is failing, I'm surprised that
you aren't having serious NetView problems too. Or have you?
James Shanks
Team Leader, Level 3 Support
Tivoli NetView for UNIX and NT
---------------------- Forwarded by James Shanks/Raleigh/IBM on 07/20/2001
07:55 AM ---------------------------
"Cowan, Chris" <Chris.Cowan AT 2ndWaveinc DOT com>@tkg.com on 07/19/2001 05:44:13 PM
Please respond to IBM NetView Discussion <nv-l AT tkg DOT com>
Sent by: owner-nv-l AT tkg DOT com
To: "'IBM NetView Discussion'" <nv-l AT tkg DOT com>
cc:
Subject: RE: [NV-L] TIPN Inventory - NV Nodes Problem
Well, we discovered the root cause of the problem, using truss.
We had turned off the nscd service, as part of our hardening procedure.
Starting nscd solved the problem. The key was noticing an open64() call
against /etc/.name_service_door just before a SIGSEGV was thrown.
There is an application code adjustment that should be made to handle a
null pointer return from gethostbyname, under these circumstances, on
Solaris. (There is quite a bit of discussion about this on the Sun
Managers list). It appears that the TIPN application code doesn't
account for this.
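
The kind of adjustment described above might look roughly like the
following C sketch. It is only an illustration: resolve_ipv4 is a
hypothetical helper name, the TIPN source is not available here, and the
code only assumes the standard gethostbyname() behaviour of returning a
null pointer (with h_errno set) when the lookup fails.

```c
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper (not taken from TIPN or NetView): resolve a host
 * name to a dotted-quad IPv4 string. Returns 0 on success and -1 on any
 * failure, instead of dereferencing a null hostent pointer.
 */
int resolve_ipv4(const char *name, char *out, size_t outlen)
{
    struct hostent *he;
    struct in_addr addr;

    if (name == NULL || out == NULL || outlen == 0)
        return -1;

    he = gethostbyname(name);
    if (he == NULL) {
        /* Lookup failed; report h_errno rather than crashing later. */
        fprintf(stderr, "lookup of %s failed (h_errno=%d)\n", name, h_errno);
        return -1;
    }

    /* Guard against a non-IPv4 answer or an empty address list. */
    if (he->h_addrtype != AF_INET || he->h_addr_list[0] == NULL)
        return -1;

    memcpy(&addr, he->h_addr_list[0], sizeof addr);
    if (inet_ntop(AF_INET, &addr, out, (socklen_t)outlen) == NULL)
        return -1;

    return 0;
}
```

A caller would check the return value before using the buffer, so a
failed lookup becomes an ordinary error path rather than a SIGSEGV.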
We have forwarded this information to Tivoli support.
-----Original Message-----
From: Cowan, Chris [mailto:Chris.Cowan AT 2ndwaveinc DOT com]
Sent: Monday, July 16, 2001 10:13 AM
To: IBM NetView Discussion (E-mail)
Subject: [NV-L] TIPN Inventory - NV Nodes Problem
Since we've had a problem with TIPN Inventory that has been open for 3
months with support, I thought I'd share it with the list, and see if
anyone has run into this. In a nutshell, we noticed that the nv_nodes
table was not being loaded. (nv_interfaces, nv_segments, and
nv_networks load just fine.) Our platform is Solaris 2.7 with recent
kernel patches, and NV 6.0.1 or 6.0.2.
We had about 6 production setups of NetView of which about 4 were
malfunctioning. The ones that were working were running TMF 3.6.2 with
patches. We believe we have isolated the problem in our testing to the
libtmf.so in TMF 3.6.4 (and also 3.7.x). When we perform this upgrade, we
get an "UNHANDLED EXCEPTION LOOKING UP FIELDS" message and the nv_nodes
table is not loaded. Tracing the RIM does not help, BTW. This exception
is thrown before the upload through the RIM is attempted.
Support claims that they are unable to reproduce this problem by
applying the TMF patch. So, most recently we tried several things
including multiple orders of installation for the TMF, Inv, NV, and TIPN
components required. In all cases, we are consistently able to cause the
failure with the TMF 3.6.4 patch.
Has anyone else seen this problem, or does this ring a bell with you in
any way? If so, feel free to contact me.
_________________________________________________________________________
NV-L List information and Archives: http://www.tkg.com/nv-l