Networker

Re: [Networker] aborted due to inactivity

2004-01-07 12:23:09
Subject: Re: [Networker] aborted due to inactivity
From: Matt Temple <mht AT RESEARCH.DFCI.HARVARD DOT EDU>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Wed, 7 Jan 2004 12:23:01 -0500
R C wrote:
I just dealt with a very similar problem.  In my case it appeared to be
dropped frames (as observed through Network Monitor on w2k).  Once I
reduced the LARGE number of dropped frames I was observing, everything was
back to normal.  You may want to monitor some of your network statistics at
the times when groupX begins to see if there is a dramatic increase in the
number of lost and/or dropped frames.
Ron
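
A minimal sketch of the comparison Ron describes: sample the dropped-frame
counters just before and just after the group starts, then look at the
increase per interface.  The interface names, counter values, and threshold
below are made up for illustration; how the counters are collected (Network
Monitor, netstat, SNMP) is not shown.

```python
from typing import Dict, List


def dropped_frame_deltas(before: Dict[str, int], after: Dict[str, int]) -> Dict[str, int]:
    """Return the per-interface increase in dropped frames between two samples."""
    return {nic: after[nic] - before.get(nic, 0) for nic in after}


def spiking(deltas: Dict[str, int], threshold: int) -> List[str]:
    """Return the interfaces whose increase exceeds the threshold, sorted by name."""
    return sorted(nic for nic, delta in deltas.items() if delta > threshold)


# Hypothetical counter samples taken before and after the savegroup starts.
before = {"nic0": 120, "nic1": 4}
after = {"nic0": 5320, "nic1": 6}

deltas = dropped_frame_deltas(before, after)
print(deltas)
print(spiking(deltas, threshold=1000))
```

A counter that resets between samples would show up as a negative delta,
so it is worth checking for that before trusting the result.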
On Tue, 6 Jan 2004 14:11:39 -0500, Joel Fisher <jfisher AT WFUBMC DOT EDU> 
wrote:



Just an additional comment here.   I wrote up an instance of inactivity
timeout here last week.   In addition to timing out, I found that the
client (a Linux machine with a Reiser FS) was effectively hung -- it
could be pinged but no more than that.  No error messages -- I'm
guessing the CPU was being entirely consumed.   After rebooting, I
looked at the Networker client version, which turned out to be old --
6.0.2.   After updating to 7.1, the backup went as expected.


                                       Matt Temple

Hey Robert,

When the backup is started from the server it does fail at 42GB, then
restarts and fails at 42GB again.  No firewall involved.  I've kind of
ruled out a network problem, because this is the only one of about 20
large (100GB) volumes on the cluster that fails on a consistent basis.

Still searching....

Thanks,

Joel

When you say it always fails at the 42GB mark, does it always fail after
the same amount of time?  If you have several retries set, do you see it
back up 42GB, then time out, then back up 42GB again?

If so, this would indicate an external severing of the control
connection.

First question from Legato will be "is there a firewall involved?"

Robert Maiello
Thomson Healthcare
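
A rough sketch of the time-versus-size question above.  Given the amount
written and the elapsed time for each failed attempt, it reports which of
the two stays constant.  The attempt figures and the 5% tolerance are
hypothetical, not taken from this thread; if only the elapsed time is
constant, that fits the "external severing" theory, while a constant size
with varying times points elsewhere.

```python
from statistics import mean, pstdev


def relative_spread(values):
    """Population standard deviation divided by the mean (0 means identical values)."""
    m = mean(values)
    return pstdev(values) / m if m else 0.0


def likely_trigger(attempts, tolerance=0.05):
    """attempts: list of (gb_written, elapsed_seconds), one pair per failed attempt."""
    size_fixed = relative_spread([gb for gb, _ in attempts]) <= tolerance
    time_fixed = relative_spread([secs for _, secs in attempts]) <= tolerance
    if size_fixed and time_fixed:
        return "both"
    if time_fixed:
        return "time"
    if size_fixed:
        return "size"
    return "neither"


# Hypothetical failed attempts: roughly the same size, very different durations.
attempts = [(42.0, 9000), (42.1, 12500), (41.9, 7600)]
print(likely_trigger(attempts))
```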

On Mon, 5 Jan 2004 14:17:35 -0500, Joel Fisher <jfisher AT WFUBMC DOT EDU>
wrote:


Hey All,



I'm wondering if anyone might be able to shed some light on a problem
I've been troubleshooting.



Server: Sun E450/Solaris2.6/SBU 6.1.3

Client: Compaq/Win 2K3 cluster/Legato 7.0 client



I get the below failure on full backups (incremental backups complete
successfully).



* client1.nt.wfubmc.edu:P: 1 retry attempted
* client1.nt.wfubmc.edu:P: 01/02/04 08:53:23 nsrexec: Attempting a kill on remote save
* client1.nt.wfubmc.edu:P: 01/02/04 08:58:23 nsrexec: Attempting a kill on remote save
* client1.nt.wfubmc.edu:P: write: Broken pipe
* client1.nt.wfubmc.edu:P: aborted due to inactivity
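
As a quick check on the timing, the two nsrexec kill attempts above can be
parsed for their timestamps.  This is only a sketch: the log lines are
copied from the output above with the wrapped lines joined, and the
MM/DD/YY date order is an assumption.

```python
import re
from datetime import datetime

# The two kill-attempt lines from the savegroup output, wrapped lines joined.
LOG = """\
* client1.nt.wfubmc.edu:P: 01/02/04 08:53:23 nsrexec: Attempting a kill on remote save
* client1.nt.wfubmc.edu:P: 01/02/04 08:58:23 nsrexec: Attempting a kill on remote save
"""

STAMP = re.compile(r"(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) nsrexec: Attempting a kill")


def kill_attempt_times(log):
    """Return the timestamps of every 'Attempting a kill' line, in log order."""
    # Assumes MM/DD/YY; the gap between attempts is the same under DD/MM/YY.
    return [datetime.strptime(m.group(1), "%m/%d/%y %H:%M:%S")
            for m in STAMP.finditer(log)]


times = kill_attempt_times(LOG)
gaps = [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]
print(gaps)  # [300.0]
```

The two attempts are exactly five minutes apart.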



I've increased the savegroup timeout to 360 and it made no difference.



I can run the backup directly from client1 successfully, but when I
start it from the backup server it fails.  When it fails it always seems
to fail at 42GB.  So I thought maybe there was a bad file or something
like that causing it to hang, but when it ran successfully directly from
the client that killed that theory.  I'm kind of stumped right now.



Anyone ever have a similar problem?



Thanks,



Joel










--
Note: To sign off this list, send a "signoff networker" command via email
to listserv AT listmail.temple DOT edu or visit the list's Web site at
http://listmail.temple.edu/archives/networker.html where you can
also view and post messages to the list.
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=



--
=============================================================
Matthew Temple                Tel:    617/632-2597
Director, Research Computing  Fax:    617/582-7820
Dana-Farber Cancer Institute  mht AT research.dfci.harvard DOT edu
44 Binney Street, LG300/300   http://research.dfci.harvard.edu
Boston, MA 02115              Choice is the Choice!
