Re: [Networker] Dropped Connection Ports

2006-08-11 09:21:54
Subject: Re: [Networker] Dropped Connection Ports
From: Stuart Whitby <swhitby AT DATAPROTECTORS.CO DOT UK>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Fri, 11 Aug 2006 14:11:55 +0100
Get hold of TCPView from www.sysinternals.com.  This will show you the state 
of the connections between client and server.  Run it on both sides, ideally 
from within Terminal Services so you can easily switch back and forth.  
My guess is that the connection isn't shutting down correctly (if I'm right in 
understanding that you can recover the entire file), which is likely to be an 
issue in the TCP stack that needs a patch from MS.  If the whole thing is 
recoverable, then it's not the inactivity timeout (unless you then go on to 
back up the rest of the filesystem with an incremental on a filesystem that 
doesn't change much).
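
If you'd rather script the check than watch a GUI, here is a rough Python 
sketch of the same idea: take `netstat -an` output from either side and pick 
out connections to the other host that are stuck part-way through closing. 
The addresses, ports and function names are made up for illustration, and it 
assumes the four-column Windows netstat layout.

```python
# States that suggest a TCP connection has been only partly closed.
LINGERING_STATES = {"FIN_WAIT_1", "FIN_WAIT_2", "CLOSE_WAIT", "LAST_ACK", "CLOSING"}


def parse_netstat(text):
    """Return (local, remote, state) tuples from Windows-style `netstat -an` text."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # TCP rows look like: TCP <local addr:port> <remote addr:port> <STATE>
        if len(fields) == 4 and fields[0].upper() == "TCP":
            rows.append((fields[1], fields[2], fields[3]))
    return rows


def lingering_connections(rows, peer_ip):
    """Keep connections to peer_ip that are neither fully open nor fully closed."""
    return [
        (local, remote, state)
        for local, remote, state in rows
        if remote.rsplit(":", 1)[0] == peer_ip and state in LINGERING_STATES
    ]


# Hypothetical capture taken on the client; 10.0.0.9 stands in for the server.
SAMPLE = """\
  Proto  Local Address          Foreign Address        State
  TCP    10.0.0.5:3021          10.0.0.9:7937          ESTABLISHED
  TCP    10.0.0.5:3044          10.0.0.9:8012          CLOSE_WAIT
  TCP    10.0.0.5:3050          10.0.0.77:445          FIN_WAIT_2
  TCP    0.0.0.0:7938           0.0.0.0:0              LISTENING
"""
```

Calling `lingering_connections(parse_netstat(SAMPLE), "10.0.0.9")` on the 
sample keeps only the CLOSE_WAIT row, which is the kind of entry to look for 
once the save has supposedly finished.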
 
The tcp_keepalive_interval only works on established connections, and only 
kicks in after the socket has been idle for <interval> seconds.  It attempts to 
contact the socket on the other side 10 times, and if there's no response in 
that time it will reset the local connection only.  I've seen firewalls shut 
down sockets in under this time and just block all further traffic, but that 
would break your save.
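
To make the timing concrete, here is a small Python sketch of per-socket 
keepalive. It is only an illustration: NetWorker does not expose its sockets 
like this, the function names and default numbers are invented, and the fixed 
gap between probes is an assumption of the sketch.

```python
import socket


def enable_keepalive(sock, idle_s=60, probe_interval_s=10, probes=10):
    """Turn on TCP keepalive for one socket and tune timers where possible."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    if hasattr(socket, "SIO_KEEPALIVE_VALS"):
        # Windows: (on/off, idle ms, probe interval ms); probe count is system-wide.
        sock.ioctl(socket.SIO_KEEPALIVE_VALS,
                   (1, idle_s * 1000, probe_interval_s * 1000))
    else:
        # Unix-like systems: set whichever per-socket options this platform offers.
        for name, value in (("TCP_KEEPIDLE", idle_s),
                            ("TCP_KEEPINTVL", probe_interval_s),
                            ("TCP_KEEPCNT", probes)):
            option = getattr(socket, name, None)
            if option is not None:
                sock.setsockopt(socket.IPPROTO_TCP, option, value)


def worst_case_detection_s(idle_s, probe_interval_s, probes):
    """Seconds from last traffic until a dead peer is given up on."""
    return idle_s + probe_interval_s * probes


def keepalive_beats_firewall(idle_s, firewall_idle_timeout_s):
    """True if the first probe goes out before the firewall drops the idle flow."""
    return idle_s < firewall_idle_timeout_s
```

With these example numbers a dead peer is noticed after 60 + 10 * 10 = 160 
seconds, and the keepalive only helps against a firewall whose idle timeout 
is longer than the 60 second idle period.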
 
If it's purely the index save which is failing to start, then it's likely 
because NW sees that socket as still open and therefore the save as not 
finished.
HTH,
 
Stuart.

________________________________

From: Legato NetWorker discussion on behalf of Riaan Louwrens
Sent: Thu 10-Aug-06 19:18
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Subject: [Networker] Dropped Connection Ports



Hi All,

Slow link, so no opportunity to go rummage through the archives. Please 
forgive me if this has been discussed before (which I know it has).

I have an issue where "large" file backups eventually time out (e.g. a 6-hour, 
200GB single-file SQL dump).

From memory I know that the "tcp\ip keepalivetime" parameter needs to be set. 
This has been done and the server (client) in question rebooted. The next 
backup failed in the same way.

I am sure this is a connection port only problem, as the saveset is there and 
a saveset recovery does work, so it is only the last bit (index entry) that is 
failing. Memory doesn't serve me so well anymore and I have forgotten whether 
this last bit gets initiated by the client or the server (i.e. where to try 
and look for the ports being closed down).

This is only a rather old client (7.1.1, which we are upgrading to 7.1.4); 
there is unfortunately no option to go to 7.2 or even 7.3 (their management 
wants a full proof-of-concept test ... etc etc ...).

It is Windows 2000 (across the board), latest SP - with NO funny hotfixes that 
"fix" the keepalive time to 300 milliseconds (been there, solved that).

It is on the same level 3 switch (same switch, different VLANs), with Gbit 
throughout. Large SQL module backups from other clients aren't an issue. This 
particular client HAS to back up via the dump while we convince their 
management of the wisdom of going BSM.

Any thoughts / suggestions?

Your help, as always, is appreciated!

Regards,
Riaan

To sign off this list, send email to listserv AT listserv.temple DOT edu and 
type "signoff networker" in the
body of the email. Please write to networker-request AT listserv.temple DOT edu 
if you have any problems
with this list. You can access the archives at 
http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER


