[Networker] Inactivity timeout (and timeout in general)

2005-02-23 13:08:29
From: Craig Ruefenacht <craig.ruefenacht AT US.USANA DOT COM>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Wed, 23 Feb 2005 11:07:27 -0700
Hi,

Late last week we made a change to a firewall that sits between our
Networker server (v6.1.3) and a few clients.  The change has to do with
established connection timeout: if an established TCP connection is
idle for X minutes, the firewall will drop the connection out of the
established table, without sending a reset packet to either endpoint.

What ends up happening is that the endpoints still think the TCP
connection is open, and when a packet is sent over it, the firewall
doesn't see it as part of an established connection, and it isn't a
SYN packet (meaning a new connection), so the firewall ignores the
packet.
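
To make the firewall behaviour described above concrete, here is a
minimal Python sketch of an established-connection table with an idle
timeout.  The class name, method name and connection label are made up
for illustration; this is a toy model, not how any particular firewall
is implemented.

```python
class StatefulFirewall:
    """Toy model of a stateful firewall's established-connection table."""

    def __init__(self, idle_timeout_s):
        self.idle_timeout_s = idle_timeout_s
        self.established = {}  # connection id -> time of last packet seen

    def handle(self, conn, now, syn=False):
        """Return "forward" or "drop" for a packet on `conn` at time `now`."""
        last_seen = self.established.get(conn)
        if last_seen is not None and now - last_seen > self.idle_timeout_s:
            # Idle too long: forget the connection silently (no RST is sent).
            del self.established[conn]
            last_seen = None
        if last_seen is None and not syn:
            # Not in the table and not a new connection: ignore the packet.
            return "drop"
        self.established[conn] = now
        return "forward"


fw = StatefulFirewall(idle_timeout_s=30 * 60)
print(fw.handle("client->server", now=0, syn=True))  # forward
print(fw.handle("client->server", now=31 * 60))      # drop
```

Neither endpoint is told about the drop, so both keep treating the
connection as open until something times out on their side.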

Back to the Networker issue.  The change on the firewall changed the
timeout for connections from 72 hours to 30 minutes.  Once that change
occurred, I started noticing that the clients that were on the other
side of the firewall from our Networker server would successfully send
their data to the Networker server, but at the end of the save, the
Networker server reports that the backup failed.

Here is a snippet from our daemon.log from last night:

02/23/05 02:33:22 nsrd: polaris:/data done saving to pool
'Unix' (TT0030) 151 MB
* polaris:/data ! no output
02/23/05 03:02:05 savegrp: polaris:/data will retry 1 more time(s)
02/23/05 03:02:40 nsrd: polaris:/data saving to pool 'Unix' (TT0029)

...

02/23/05 04:30:46 nsrd: polaris:/data done saving to pool
'Unix' (TT0029) 152 MB
* polaris:/data 1 retry attempted
02/23/05 05:03:59 savegrp: polaris:/data will retry 0 more time(s)


The amount of data saved from polaris:/data is about what I expect (152
MB).

Notice that Networker saved it once, then decided it had failed about
30 minutes after it finished writing to tape, so it saved it again.  On
the second try, the data got saved, then about 30 minutes later
Networker reported it as failed.  I've verified that the data made it
to tape by doing some test restores.
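
As a quick check on the "30 minutes" figure, the gaps between the
"done saving" and "will retry" lines in the daemon.log snippet above
can be computed with a few lines of Python.  The timestamps are copied
from the log; the helper name and format string are just for this
sketch.

```python
from datetime import datetime

FMT = "%m/%d/%y %H:%M:%S"  # timestamp format used in the daemon.log lines


def gap(done, retry):
    """Time between a 'done saving' line and the following 'will retry' line."""
    return datetime.strptime(retry, FMT) - datetime.strptime(done, FMT)


attempts = [
    ("02/23/05 02:33:22", "02/23/05 03:02:05"),  # first attempt
    ("02/23/05 04:30:46", "02/23/05 05:03:59"),  # retry
]
for done, retry in attempts:
    print(done, "->", retry, "gap:", gap(done, retry))
```

The two gaps come out at 0:28:43 and 0:33:13, so both are in the
neighbourhood of the new firewall timeout, although that by itself
doesn't prove the firewall is the cause.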

My question is, is there any TCP/UDP connection between the Networker
server and client that should remain open, even though the connection
could have no traffic on it, for the duration of the backup?

What I'm guessing is happening is that there is a TCP connection that
gets established at the beginning of a saveset which is expected to
still be open at the end of the save.  This TCP connection isn't used to
send the data-to-be-backed-up to the Networker server.  At the end of
the save, the client lets the Networker server know that it's done (or
something like that) using this TCP connection.  In our case now, if
that TCP connection has no traffic flowing across it for 30 minutes, the
firewall drops it as an established connection, but the Networker server
and client still think the connection is open, and try to use it at
the end of the save.
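
If that guess is right, the generic way an application keeps such an
idle TCP connection alive through a stateful firewall is TCP keepalive.
The Python sketch below only illustrates the mechanism on an ordinary
socket; it is not a Networker setting, and whether Networker enables
keepalive on its own sockets is not something assumed here.  The
function name and the default values are made up for the example; the
idle time just needs to be comfortably below the firewall timeout.

```python
import socket


def enable_keepalive(sock, idle_s=600, interval_s=60, probes=5):
    """Ask the OS to send keepalive probes on an otherwise idle TCP socket."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket tuning options below are platform-specific (Linux has
    # all three), so each one is only set when this platform exposes it.
    if hasattr(socket, "TCP_KEEPIDLE"):
        # Seconds of idleness before the first probe is sent.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        # Seconds between unanswered probes.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        # Unanswered probes before the connection is declared dead.
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)


s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(s)  # 10 minutes idle, well under a 30 minute timeout
s.close()
```

With probes flowing every few minutes, the firewall keeps seeing
traffic on the connection and has no reason to expire it.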

Can anyone shed any light on whether my thinking is correct, or whether
there may be something else at play here?

There have been no changes to either the Networker server (HP-UX 11i,
Networker 6.1.3) or the clients (Linux/Microsoft/Solaris).  I've restarted
the Networker server and client software several times, though I haven't
rebooted any of the boxes.  There were no changes on the firewall except
for the connection timeout value.  All of our other clients are (and
have been) backing up fine.  They, of course, do not have a firewall
between them and the Networker server.  No other problems with any other
connections have been noted, except that idle ssh/telnet/ftp connections
keep getting disconnected after 30 minutes.

Sorry for the long-winded explanation of our problem.


-- 
Craig Ruefenacht
UNIX Systems Administrator
USANA Health Sciences
http://www.usana.com

