Networker

Re: [Networker] NUL handshake - Firewall issue

2005-09-12 10:24:44
From: Stan Horwitz <stan AT TEMPLE DOT EDU>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Mon, 12 Sep 2005 10:23:33 -0400
On Sep 12, 2005, at 10:16 AM, Bart.Jespers AT FUJITSU-SIEMENS DOT COM wrote:

Hello,

Our backups of clients located in a DMZ fail (often, but not
always). This happens during full backups (high load on the network) and
incremental backups. It even happens sometimes when only one backup is
started as a test. We already had this problem some time ago, but we set
the NSR_KEEP_ALIVE variable and the problem (mostly) disappeared.
Nowadays we see it on 50% of our DMZ clients (seemingly at random).

The savegroup details show: no output!
The daemon.log on the client contains many entries like:
09/11/05 00:05:59 nsrexecd: failed to write NUL handshake on 508: errno 154, Connection reset by peer
09/12/05 05:50:48 nsrexecd: failed to write NUL handshake on 512: errno 154, Connection reset by peer
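
To see how often these failures occur, the entries can be counted per day. The following is only a sketch in Python: the pattern is inferred from the two sample lines above, the function name is made up for illustration, and the real daemon.log may contain variations it does not catch.

```python
import re
from collections import Counter

# Pattern inferred from the two sample lines in this message (an assumption,
# not a documented NetWorker log format).
HANDSHAKE_RE = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) nsrexecd: "
    r"failed to write NUL handshake on (?P<handle>\d+):"
)

def count_handshake_failures(lines):
    """Return a Counter mapping the date string to the number of failures."""
    per_day = Counter()
    for line in lines:
        match = HANDSHAKE_RE.match(line.strip())
        if match:
            per_day[match.group("date")] += 1
    return per_day

SAMPLE_LOG = [
    "09/11/05 00:05:59 nsrexecd: failed to write NUL handshake on 508: errno 154, Connection reset by peer",
    "C:\\ done saving to pool 'DISK'(backupsn1_disk5) 802 MB",
    "09/12/05 05:50:48 nsrexecd: failed to write NUL handshake on 512: errno 154, Connection reset by peer",
]

for day, failures in sorted(count_handshake_failures(SAMPLE_LOG).items()):
    print(day, failures)
```

On a real client the list of lines would come from the log file itself, for example by passing an open file object to count_handshake_failures.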

The last backup of one of the clients reports a successful backup of
SYSTEM STATE, DB and FILES. Drives C and D and the index are still unsuccessful,
even though the daemon.log has the following entry:
C:\ done saving to pool 'DISK'(backupsn1_disk5) 802 MB, (more than 1
hour ago)

So drive C is done, but the savegroup details don't know it yet.
Drive D will not be backed up...

On the backup server there is still an nsrexec process running for this client.


What could be the cause, and what can be done about it?
Do we need to reboot the client after setting NSR_KEEP_ALIVE, or
is restarting NetWorker enough?

Some background info:
Backup server: running Solaris 9, NW 7.1
Clients: Windows, Linux, Solaris, all 7.1.3. The problem is mainly with
Windows clients (75% of all clients are Windows).
Ports opened on the clients in the DMZ: 7937-7938 (both on the firewall and
using nsrports)
Ports opened to the server follow the rule 2+3+2T+P+C = 300 ports (which is far
more than needed when only one backup is started)
On the clients the NSR_KEEP_ALIVE variable is set to 30

Try setting NSR_KEEP_ALIVE to 15 instead of 30. The optimum setting depends on how your firewall is configured.
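
The general idea behind any keepalive setting is to put some traffic on an otherwise idle connection before the firewall's idle timer drops it. As an illustration only, here is a Python sketch using the operating system's own TCP keepalive options. This is not how NetWorker implements NSR_KEEP_ALIVE; the function name is made up, and the 15-second idle time is just an example value.

```python
import socket

def make_keepalive_socket(idle_seconds=15):
    """Create a TCP socket with OS-level keepalive probes enabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Ask the kernel to probe the peer when the connection sits idle.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # TCP_KEEPIDLE (idle seconds before the first probe) is not available
    # on every platform, so only set it where the constant exists.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_seconds)
    return sock

sock = make_keepalive_socket(15)
print("keepalive enabled:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
sock.close()
```

Whatever mechanism is used, the interval has to be shorter than the firewall's idle timeout, which is why the right value depends on the firewall.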

On your Windows clients, you might also set the registry value HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters\KeepAliveTime
to DWORD (decimal) 7200000 on the servers that need it. The value is in milliseconds, so 7200000 is two hours.
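
If several Windows clients need the same change, one option is to generate a .reg file and import it on each machine. The following Python sketch builds the file contents. The function name and output file name are made up for illustration, and the result should be checked on a test machine before it is imported anywhere that matters.

```python
TCPIP_PARAMS_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
)

def keepalive_reg_file(milliseconds=7200000):
    """Return .reg file text that sets KeepAliveTime to the given value."""
    if not 0 < milliseconds <= 0xFFFFFFFF:
        raise ValueError("KeepAliveTime must fit in a 32-bit DWORD")
    # .reg files store DWORD values as eight hexadecimal digits.
    return (
        "Windows Registry Editor Version 5.00\r\n"
        "\r\n"
        f"[{TCPIP_PARAMS_KEY}]\r\n"
        f'"KeepAliveTime"=dword:{milliseconds:08x}\r\n'
    )

if __name__ == "__main__":
    # Hypothetical output file name; newline="" keeps the \r\n line endings.
    with open("keepalivetime.reg", "w", newline="") as handle:
        handle.write(keepalive_reg_file(7200000))
```

Changes under Tcpip\Parameters generally take effect only after a restart of the machine.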
