Veritas-bu

[Veritas-bu] Very ODD problem with NB5.1MP3a + 2 Solaris Clients

2005-11-08 13:45:02
Subject: [Veritas-bu] Very ODD problem with NB5.1MP3a + 2 Solaris Clients
From: Mark.Donaldson AT cexp DOT com (Mark.Donaldson AT cexp DOT com)
Date: Tue, 8 Nov 2005 11:45:02 -0700

(RESENDING due to oversize)

Try using the traceroute facility from master to client and the reverse.
It'll tell you which interface the traffic is leaving on.

If there's any doubt about the route being taken, then use the
REQUIRED_INTERFACE entry in the bp.conf file to force traffic from the
client to the master over a specific interface.
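
As a complement to traceroute, here is a rough sketch of one way to see which local address (and so which interface) the kernel would choose for traffic to a given peer. It is not part of NetBackup. The helper name `source_address_for` is hypothetical, and the port number is arbitrary, since connecting a UDP socket only consults the routing table and sends no packets.

```python
import socket


def source_address_for(peer_ip, port=9):
    """Return the local IP the kernel would use as the source for peer_ip.

    Connecting a UDP socket only performs a route lookup; nothing is sent.
    The port value is arbitrary (an assumption made for illustration).
    Raises OSError if there is no route to peer_ip.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.connect((peer_ip, port))
        return sock.getsockname()[0]
```

Run on a client, `source_address_for("172.16.0.2")` should return that client's address on the backup network. Any other address would suggest that replies to the master are leaving through a different NIC.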

-M
-----Original Message-----
From: veritas-bu-admin AT mailman.eng.auburn DOT edu
[mailto:veritas-bu-admin AT mailman.eng.auburn DOT edu]On Behalf Of Piszcz, 
Justin
Sent: Tuesday, November 08, 2005 10:35 AM
To: Paul Keating
Cc: veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] Very ODD problem with NB5.1MP3a + 2 Solaris
Clients


They are multi-homed, but with completely different subnet masks and routes.
Again, there are 5 machines with only 1 path to the backup server; there are
other NICs, but on separate interfaces, routes, and subnets.

I see what you are saying, but this is a completely VLAN- and
server-isolated backup network.

I've checked with the network guys; they are telling me it is a box problem.
 
Any other ideas?
 
Justin.
 



From: Paul Keating [mailto:pkeating AT bank-banque-canada DOT ca] 
Sent: Tuesday, November 08, 2005 12:16 PM
To: Piszcz, Justin
Cc: veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] Very ODD problem with NB5.1MP3a + 2 Solaris
Clients
 
Ummm... are any of your systems multi-homed?

Looks like an asymmetrical routing issue we had here recently.

Backup server talks to client on NIC A, but client replies back to server
from NIC B via an alternate path.

The ARP table on the switch that the backup server and NIC A are connected
to doesn't get populated with the MAC of the client's NIC A until the client
ARPs or tries to talk via NIC A. Then communication works for 5-10
minutes, or until your ARP cache refreshes, and then it is broken again
until the client next tries to talk to the server.
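
The asymmetry described above can be checked on paper with a small sketch like the following. It is a simplification under stated assumptions: it only considers directly connected subnets plus one default interface and ignores static routes. The function names, the interface names, the /24 mask, and every address outside 172.16.0.x are hypothetical.

```python
import ipaddress


def egress_interface(dest_ip, interfaces, default_if):
    """Pick the interface whose connected subnet contains dest_ip.

    interfaces maps an interface name to its address in CIDR form.
    The longest matching prefix wins; with no match, fall back to
    default_if, the interface assumed to hold the default route.
    """
    dest = ipaddress.ip_address(dest_ip)
    best_name, best_len = default_if, -1
    for name, cidr in interfaces.items():
        net = ipaddress.ip_interface(cidr).network
        if dest in net and net.prefixlen > best_len:
            best_name, best_len = name, net.prefixlen
    return best_name


def interface_owning(local_ip, interfaces):
    """Return the name of the interface configured with local_ip, or None."""
    for name, cidr in interfaces.items():
        if str(ipaddress.ip_interface(cidr).ip) == local_ip:
            return name
    return None


def is_asymmetric(server_ip, client_ip, client_interfaces, default_if):
    """True if the client would reply from a different NIC than the one
    the server addressed."""
    arrived_on = interface_owning(client_ip, client_interfaces)
    replies_on = egress_interface(server_ip, client_interfaces, default_if)
    return arrived_on != replies_on
```

For example, with `{"nicA": "172.16.0.135/24", "nicB": "10.1.1.135/24"}` and the default route on `nicB`, a server at 172.16.0.2 is reached symmetrically over `nicA`, while a server on some other subnet would be answered via `nicB`.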
 
I'm thinking it's likely not an issue if your clients are all single-NIC or
connected to a single switch along with the backup server.

Check with your network guys to see if they're seeing any broadcasting on
the switches your backup server and clients are attached to.
 
Paul
-----Original Message-----
From: veritas-bu-admin AT mailman.eng.auburn DOT edu
[mailto:veritas-bu-admin AT mailman.eng.auburn DOT edu] On Behalf Of Piszcz, 
Justin
Sent: November 8, 2005 11:59 AM
To: Piszcz, Justin; Patrick Whelan
Cc: veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] Very ODD problem with NB5.1MP3a + 2 Solaris
Clients
Name resolution is set up through /etc/hosts on the backup server; no
external DNS servers are used.
The subnet is the same, yes.
The bp.conf is the same for the most part; the error occurs on both systems.
 
MASTER -> CLIENTS:
jpiszcz@backup01# telnet 172.16.0.135 bpcd
Trying 172.16.0.135...
^C
jpiszcz@backup01# telnet 172.16.0.136 bpcd
Trying 172.16.0.136...
 
It HANGs and HANGs... How would I debug this problem?
 
CLIENTS -> MASTER:
box1# telnet backup01 bpcd
Trying 172.16.0.2...
Connected to backup01.
Escape character is '^]'.
 
box2# telnet backup01 bpcd
Trying 172.16.0.2...
Connected to backup01.
Escape character is '^]'.
 
Now the EXTREMELY WEIRD PART!!! Now I can connect MASTER -> CLIENTS with NO ISSUE:
jpiszcz@backup01# telnet 172.16.0.135 bpcd
Trying 172.16.0.135...
Connected to 172.16.0.135.
Escape character is '^]'.
^]
telnet> Connection to 172.16.0.135 closed.
jpiszcz@backup01# telnet 172.16.0.136 bpcd
Trying 172.16.0.136...
Connected to 172.16.0.136.
Escape character is '^]'.
 
Any freaking clue what is going on here?
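
The telnet tests above can be repeated without hanging by using a TCP connect with a timeout. This is only a sketch: `can_connect` and `probe` are hypothetical helper names, and the port constant is an assumption, since the thread only uses the service name `bpcd` (check /etc/services on your hosts for the actual number).

```python
import socket

# Assumption: 13782 is the port usually registered for bpcd; the thread
# itself only refers to the service by name.
BPCD_PORT = 13782


def can_connect(host, port, timeout=5.0):
    """Attempt a TCP connect, giving up after timeout seconds.

    Returns True on success and False on refusal, timeout, or any other
    socket-level error.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe(hosts, port=BPCD_PORT, timeout=5.0):
    """Return a dict mapping each host to its connect result."""
    return {host: can_connect(host, port, timeout) for host in hosts}
```

Running `probe(["172.16.0.135", "172.16.0.136"])` on the master before and after a client-initiated connection would show whether the master-to-client direction only works once the client has talked first.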
 
