Veritas-bu

[Veritas-bu] network errors

2000-11-14 14:40:40
Subject: [Veritas-bu] network errors
From: Ravi Channavajhala ravi.channavajhala AT csfb DOT com
Date: Tue, 14 Nov 2000 14:40:40 -0500 (EST)
For those who are interested, here is the real problem.
It so happened that I had defined a new interface, bkup02-e2,
plumbed it, and brought it online on my media server around
the time we were having problems.

When I added the new interface bkup02-e2, I didn't update the
clients' bp.conf.  I was under the impression that clients
would use only the servers defined in their local bp.conf.  However,
I had added the new interface to the master's bp.conf.
When a client backup request came in, the master tried to send
it out on bkup02-e2, which caused the simple authentication based
on hostnames to fail and, as a result, network errors (error 54).
When I ran a traceroute to a client from the media server, the
packets were going via bkup02-e2.  So I added bkup02-e2 to the
bp.conf of every failing client, and all the problems disappeared.
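
For anyone who wants to check their clients for the same gap, here is
a minimal sketch in Python.  It is not a NetBackup tool; it only assumes
that the client's server list is available as text in the
"SERVER = name" form shown in my earlier message.  The function names
and the sample data are made up for illustration.

```python
import re

# Matches lines such as "SERVER = bkup02-e0" (the form quoted in this thread).
SERVER_RE = re.compile(r"^\s*SERVER\s*=\s*(\S+)\s*$", re.IGNORECASE)


def server_entries(bp_conf_text):
    """Return the host names found on SERVER lines, in file order."""
    names = []
    for line in bp_conf_text.splitlines():
        match = SERVER_RE.match(line)
        if match:
            names.append(match.group(1))
    return names


def missing_servers(bp_conf_text, required):
    """Return the required names that have no SERVER line (case-insensitive)."""
    present = {name.lower() for name in server_entries(bp_conf_text)}
    return [name for name in required if name.lower() not in present]


# Hypothetical client configuration, written before bkup02-e2 existed.
client_conf = """SERVER = bkup02
SERVER = bkup02-e0
SERVER = bkup02-f0
"""

# Every interface name the media server may use to reach the client.
interfaces = ["bkup02", "bkup02-e0", "bkup02-f0", "bkup02-e2"]

print(missing_servers(client_conf, interfaces))  # ['bkup02-e2']
```

Any name this reports is an interface the client would not recognize
as a server, which is the situation I ran into with bkup02-e2.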

Although my problem is solved, I would like to hear from someone
why the client's bp.conf entries are not strictly followed, and how
the master decides to override the client's bp.conf
entries?

-ravi

On Fri, 3 Nov 2000, Ravi Channavajhala wrote:

ravi>Hi All,
ravi>
ravi>I am seeing a bunch of 41/54/57/59 errors from some of my NT clients
ravi>backing up to a Solaris NetBackup (3.2 GA) server.  Oftentimes, I
ravi>see the backup start, and then in the middle of the backup it times out
ravi>with error 54.  I have upped the CLIENT_CONNECT_TIMEOUT and
ravi>CLIENT_READ_TIMEOUT values in bp.conf to 1800 and extensively
ravi>troubleshot the network.
ravi>
ravi>My setup is one media server with several NICs, and each NIC
ravi>is defined in bp.conf, e.g.,
ravi>
ravi>SERVER = bkup02
ravi>SERVER = bkup02-e0
ravi>SERVER = bkup02-f0
ravi>
ravi>I enabled verbose (99) logging.  The client's bpcd log
ravi>shows that it connected to bkup02-f0, which is an interface
ravi>on the machine bkup02, and that it tries connecting to privileged ports
ravi>in the range of 600-900 and gives up after 20 tries with
ravi>
ravi>09:22:11 [258] <2> bpcd main: connection refused for port 882, try again 
ravi>09:22:14 [258] <2> bpcd main: connection refused for port 883, try again
ravi>and so on...it tries a range of ports and gives up.  Before this stage,
ravi>it actually tells me that
ravi>
ravi>09:04:28 [258] <2> bpcd main: new fork cmd = <whatever>
ravi>          <2> bpcd main: got socket for input 224
ravi>           ....
ravi>           <2> bpcd main: got socket for output 204, lport=883
ravi>and so on..
ravi>
ravi>I upped the TCP timeout parameters on the NT clients in the registry
ravi>and so on.  I ran snoop etc., without any luck.  What is interesting
ravi>is that sometimes the backup for the same client succeeds, and sometimes it
ravi>just plain and simple errors out with one of 41/54/57/59.
ravi>
ravi>Can someone give pointers on how to take the debugging and diagnosis of this
ravi>to the next level, beyond running 'bpclntcmd' from the client, verifying
ravi>the host name resolution, bp.conf, etc.?  Thanks.
ravi>
ravi>-ravi
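
To see at a glance which ports bpcd was refused on, the log lines quoted
above can be summarized with a short script.  This is only a sketch: it
assumes the log lines look exactly like the two "connection refused"
lines in the quoted message, and the function name is hypothetical.

```python
import re

# Pattern for lines like:
# 09:22:11 [258] <2> bpcd main: connection refused for port 882, try again
REFUSED_RE = re.compile(
    r"(\d{2}:\d{2}:\d{2}) \[(\d+)\] <\d+> bpcd main: "
    r"connection refused for port (\d+), try again"
)


def refused_ports(log_text):
    """Return (time, pid, port) for each 'connection refused' line."""
    hits = []
    for line in log_text.splitlines():
        match = REFUSED_RE.search(line)
        if match:
            time, pid, port = match.groups()
            hits.append((time, int(pid), int(port)))
    return hits


# The two sample lines from the quoted message.
sample = """09:22:11 [258] <2> bpcd main: connection refused for port 882, try again
09:22:14 [258] <2> bpcd main: connection refused for port 883, try again
"""

for time, pid, port in refused_ports(sample):
    print(time, pid, port)
```

Because the pattern is searched for anywhere in the line, it also works
on lines pasted from mail with a quote prefix in front of them.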



