Veritas-bu

[Veritas-bu] Very ODD problem with NB5.1MP3a + 2 Solaris Clients

2005-11-13 03:11:09
Subject: [Veritas-bu] Very ODD problem with NB5.1MP3a + 2 Solaris Clients
From: Eric.Ljungblad AT CopleyPress DOT com (Eric Ljungblad)
Date: Sun, 13 Nov 2005 00:11:09 -0800
Just use a VPN.

________________________________

From: veritas-bu-admin AT mailman.eng.auburn DOT edu on behalf of Mark.Donaldson AT cexp DOT com
Sent: Tue 11/8/2005 10:44 AM
To: jpiszcz AT servervault DOT com; Mark.Donaldson AT cexp DOT com; pkeating AT bank-banque-canada DOT ca
Cc: veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] Very ODD problem with NB5.1MP3a + 2 Solaris Clients



Once you do something successfully, the addresses may be stored for a
while in the ARP cache, making later tests behave differently.

You might want to flush the ARP cache entries for these boxes' addresses
between tests.

-M
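
A minimal sketch of the "flush between tests" step, assuming the standard
`arp -d <host>` command is available on the box and is run as root. The
helper names (`arp_flush_commands`, `flush_arp`) are made up for
illustration; by default nothing is executed and the commands are only
printed.

```python
import subprocess


def arp_flush_commands(hosts):
    """Build one `arp -d <host>` command per host (nothing is executed)."""
    return [["arp", "-d", host] for host in hosts]


def flush_arp(hosts, dry_run=True):
    """Print each command; run it only when dry_run is False (needs root).

    Returns the exit codes of the commands that were actually run.
    """
    results = []
    for cmd in arp_flush_commands(hosts):
        print(" ".join(cmd))
        if not dry_run:
            # check=False: deleting an entry that is not cached exits
            # non-zero, which is harmless for this purpose.
            results.append(subprocess.run(cmd, check=False).returncode)
    return results


# Example (addresses taken from the thread):
# flush_arp(["172.16.0.135", "172.16.0.136"], dry_run=False)
```

This only clears the host's own ARP cache; it does not touch any tables
held on the switch.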
-----Original Message-----
From: Piszcz, Justin [mailto:jpiszcz AT servervault DOT com]
Sent: Tuesday, November 08, 2005 11:31 AM
To: Mark.Donaldson AT cexp DOT com; pkeating AT bank-banque-canada DOT ca
Cc: veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] Very ODD problem with NB5.1MP3a + 2 Solaris
Clients


Good suggestion. I tried the REQUIRED_INTERFACE, but no luck there, as it
is still not able to "reach" the open port where bpcd is listening.
The traceroute works fine, however, and I can ping the box for hours with
0% packet loss. It seems to affect only that port. Hmm, I wonder if I
changed the bpcd port?




From: Mark.Donaldson AT cexp DOT com [mailto:Mark.Donaldson AT cexp DOT com]
Sent: Tuesday, November 08, 2005 1:25 PM
To: Piszcz, Justin; pkeating AT bank-banque-canada DOT ca
Cc: veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] Very ODD problem with NB5.1MP3a + 2 Solaris
Clients

Try using the traceroute facility from master to client and the reverse.
It'll tell you what interface the traffic is leaving on.

If there's any doubt about the route being taken, then use the
REQUIRED_INTERFACE entry in the bp.conf file to force traffic from the
client to the master over a specific interface.

-M
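
Before changing anything, it may help to confirm what a client's bp.conf
currently says. The sketch below assumes bp.conf consists of
`KEY = value` lines and that keys such as SERVER may repeat. The function
name `parse_bp_conf`, the sample contents, and the interface name
`box1-bkp` are hypothetical, not taken from the poster's systems.

```python
def parse_bp_conf(text):
    """Parse `KEY = value` lines into {KEY: [values...]}.

    Values are kept in a list because a key may appear more than once.
    Lines without an "=" are ignored.
    """
    entries = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        entries.setdefault(key.strip(), []).append(value.strip())
    return entries


# Hypothetical client configuration, for illustration only.
SAMPLE = """\
SERVER = backup01
CLIENT_NAME = box1
REQUIRED_INTERFACE = box1-bkp
"""

conf = parse_bp_conf(SAMPLE)
print(conf.get("REQUIRED_INTERFACE", ["<not set>"]))
```

On a UNIX client the file usually lives at /usr/openv/netbackup/bp.conf,
so the same function can be fed `open(path).read()` instead of SAMPLE.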
-----Original Message-----
From: veritas-bu-admin AT mailman.eng.auburn DOT edu
[mailto:veritas-bu-admin AT mailman.eng.auburn DOT edu] On Behalf Of Piszcz, Justin
Sent: Tuesday, November 08, 2005 10:35 AM
To: Paul Keating
Cc: veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] Very ODD problem with NB5.1MP3a + 2 Solaris
Clients
They are multi-homed, but with completely different subnet masks and
routes. Again, there are 5 machines with only 1 path to the backup
server; there are other NICs, but with separate interfaces, routes, and
subnets.

I see what you are saying but this is a completely vlan+server-isolated
backup network.

I've checked with the network guys; they are telling me it is a box problem.

Any other ideas?

Justin.




From: Paul Keating [mailto:pkeating AT bank-banque-canada DOT ca]
Sent: Tuesday, November 08, 2005 12:16 PM
To: Piszcz, Justin
Cc: veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] Very ODD problem with NB5.1MP3a + 2 Solaris
Clients

Ummm....are any of your systems multi homed?

looks like an asymmetrical routing issue we had here recently.

backup server talks to client on NIC A, but client replies back to
server from NIC B via an alternate path.

the ARP table on the switch that your backup server and NIC A are
connected to doesn't get populated with the MAC of the client NIC A until
the client ARPs or tries to talk via NIC A....then communication works
for 5-10 minutes or until your ARP cache refreshes, then communication is
broken till the client tries to talk to the server again.

I'm thinking it's likely not an issue if your clients are all single NIC
or connected to a single switch along with the backup server.

check with your network guys to see if they're seeing any broadcasting on
the switches your backup server and clients are attached to.

Paul
-----Original Message-----
From: veritas-bu-admin AT mailman.eng.auburn DOT edu
[mailto:veritas-bu-admin AT mailman.eng.auburn DOT edu] On Behalf Of Piszcz, Justin
Sent: November 8, 2005 11:59 AM
To: Piszcz, Justin; Patrick Whelan
Cc: veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] Very ODD problem with NB5.1MP3a + 2 Solaris
Clients
Name resolution is set up through /etc/hosts on the backup server; no
external DNS servers are used.
The subnet is the same, yes.
The bp.conf is the same for the most part; the error occurs on both
systems.

MASTER -> CLIENTS:
jpiszcz@backup01# telnet 172.16.0.135 bpcd
Trying 172.16.0.135...
^C
jpiszcz@backup01# telnet 172.16.0.136 bpcd
Trying 172.16.0.136...

It HANGs and HANGs. How would I debug this problem?

CLIENTS -> MASTER
box1# telnet backup01 bpcd
Trying 172.16.0.2...
Connected to backup01.
Escape character is '^]'.

box2# telnet backup01 bpcd
Trying 172.16.0.2...
Connected to backup01.
Escape character is '^]'.

Now the EXTREMELY WEIRD PART!!! (Now I can connect MASTER -> CLIENTS) - NO ISSUE
jpiszcz@backup01# telnet 172.16.0.135 bpcd
Trying 172.16.0.135...
Connected to 172.16.0.135.
Escape character is '^]'.
^]
telnet> Connection to 172.16.0.135 closed.
jpiszcz@backup01# telnet 172.16.0.136 bpcd
Trying 172.16.0.136...
Connected to 172.16.0.136.
Escape character is '^]'.

Any freaking clue what is going on here?
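
One way to make the telnet tests above repeatable is a small TCP probe
that tells a hang (timeout) apart from a refusal. This is only a sketch:
the function name `probe` is made up, and it assumes bpcd is on its usual
registered port, 13782, rather than looking the service name up in
/etc/services as telnet does.

```python
import socket

BPCD_PORT = 13782  # usual registered bpcd port; assumed, not read from /etc/services


def probe(host, port=BPCD_PORT, timeout=5.0):
    """Try one TCP connect and return 'open', 'refused', or 'timeout'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except socket.timeout:
        # No answer at all within the timeout: the "hangs and hangs" case.
        return "timeout"
    except OSError as exc:
        # Anything else (no route, name lookup failure, ...).
        return f"error: {exc}"


# Example, run from the master (addresses taken from the thread):
# for client in ("172.16.0.135", "172.16.0.136"):
#     print(client, probe(client))
```

Running it from the master before and after a client-initiated
connection would show whether the result flips from 'timeout' to 'open',
which is the behaviour described above.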

_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu


