Get hold of TCPView from www.sysinternals.com. This will show you the state
of the connections between client and server. Set it running on both sides,
ideally from within Terminal Services so you can easily switch back and forth.
My guess is that the connection's not shutting down correctly (if I'm correctly
interpreting that you can recover the entire file), which is likely to be an
issue in the TCP stack that needs a patch from MS. If the whole thing's
recoverable then it's not the inactivity timeout (unless the save then goes on
to back up the rest of the filesystem as an incremental, on a filesystem which
doesn't change much).
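If you want something scriptable alongside TCPView, the same socket states can be read from `netstat -an` output captured on each machine. The following is only a rough sketch: the function names are made up for illustration, and it assumes the usual four-column TCP rows (protocol, local address, foreign address, state).

```python
def parse_netstat_tcp(text):
    """Return (local, remote, state) tuples for the TCP rows of `netstat -an` output."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # TCP rows have four columns: protocol, local address, foreign address, state.
        # Header lines and UDP rows do not match and are skipped.
        if len(fields) == 4 and fields[0].upper() == "TCP":
            rows.append((fields[1], fields[2], fields[3]))
    return rows


def connections_to(rows, peer_ip):
    """Keep only the rows whose remote address is on peer_ip, whatever the port."""
    return [row for row in rows if row[1].rsplit(":", 1)[0] == peer_ip]
```

Capture the output on client and server while the save is running and again after it fails. A socket that is still ESTABLISHED on one side but gone (or in a closing state) on the other would fit the theory that the shutdown is not being seen by both ends.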
The tcp_keepalive_interval only works on established connections, and only
kicks in after the socket has been idle for <interval> seconds. It attempts to
contact the socket on the other side 10 times, and if there's no response in
that time it will reset the local connection only. I've seen firewalls shut
down sockets in less than this time and then just block all further traffic,
but that would break the save itself.
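To put rough numbers on the keepalive behaviour described above, here is a small back-of-envelope sketch. The function names are hypothetical, and the spacing between probes is an assumption (it isn't stated above), so treat the result as an estimate rather than what any particular TCP stack will do.

```python
def keepalive_detection_seconds(idle_seconds, probe_count, probe_interval_seconds):
    """Approximate time from the last traffic until the local side resets a dead socket.

    idle_seconds: how long the socket must be idle before probing starts.
    probe_count: number of unanswered probes before giving up (10 in the text above).
    probe_interval_seconds: ASSUMED spacing between probes.
    """
    return idle_seconds + probe_count * probe_interval_seconds


def firewall_drops_first(firewall_idle_timeout_seconds, idle_seconds):
    """True if a firewall would drop the idle socket before the first probe is sent."""
    return firewall_idle_timeout_seconds < idle_seconds
```

For example, with a 7200 second idle time, 10 probes and an assumed 1 second between probes, `keepalive_detection_seconds(7200, 10, 1)` comes to 7210 seconds. A firewall with a one hour idle timeout would cut the socket well before the first probe, which is why the idle time has to be set below whatever the shortest timeout on the path is.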
If it's purely the index save which is failing to start, then it's likely
because NW sees that socket as still open and therefore the save as not
finished.
HTH,
Stuart.
________________________________
From: Legato NetWorker discussion on behalf of Riaan Louwrens
Sent: Thu 10-Aug-06 19:18
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Subject: [Networker] Dropped Connection Ports
Hi All,
Slow link, so no opportunity to go rummage through the archives, so please
forgive me if this has been discussed before (which I know it has).
I have an issue where "large" file backups eventually time out (e.g. a 6 hour,
200GB single-file SQL dump).
From memory I know that the "tcp\ip keepalivetime" parameter needs to be set.
This has been done and the server (client) in question rebooted. The next
backup failed in the same way.
I am sure this is a connection port only problem as the saveset is there and a
saveset recovery does work, so it is only the last bit (the index entry) that is
failing. Memory doesn't serve me so well anymore and I have forgotten whether
this last bit gets initiated by the client or the server (i.e. where to try and
look for the ports being closed down).
This is only a rather old client (7.1.1, which we are upgrading to 7.1.4);
there is unfortunately no option to go to 7.2 or even 7.3 (their management want
full proof of concept testing ... etc etc ... ).
It is Windows 2000 (across the board), latest SP - with NO funny hotfixes that
"fix" the keepalive time to 300 milliseconds (been there, solved that).
It is on the same level 3 switch (same switch, different VLANs), with Gbit
throughout (large SQL module backups from other clients aren't an issue). This
particular client HAS to back up via the dump while we convince their
management of the wisdom of going BSM...
Any thoughts / suggestions?
Your help, as always, is appreciated!
Regards,
Riaan
To sign off this list, send email to listserv AT listserv.temple DOT edu and
type "signoff networker" in the body of the email. Please write to
networker-request AT listserv.temple DOT edu if you have any problems with this
list. You can access the archives at
http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER