Re: [Networker] Saveset not completing normally

2007-12-21 12:32:57
Subject: Re: [Networker] Saveset not completing normally
From: Jon Fraley <jfraley AT glenraven DOT com>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Fri, 21 Dec 2007 12:29:11 -0500
We ran into a similar issue: we were hitting a file size limit, and
strace helped me track it down.  We put "ulimit -f unlimited" in
the /etc/init.d/networker file and have not had an issue since.
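
For anyone wanting to check the same thing, here is a minimal sketch of how
the limit can be inspected from a shell.  The function name show_fsize_limit
is made up for illustration and is not part of NetWorker.  Where exactly the
ulimit line belongs in the init script is also an assumption; the only
requirement is that it runs before the daemons are started, since child
processes inherit the limit.  In strace output, the usual sign of this limit
is a write failing with EFBIG or the process receiving SIGXFSZ.

```shell
#!/bin/sh
# Hypothetical helper (not part of NetWorker): print the file size limit
# that this shell, and any process it starts, would run under.
show_fsize_limit() {
    limit=$(ulimit -f)
    if [ "$limit" = "unlimited" ]; then
        echo "file size limit: unlimited"
    else
        # The value is in blocks; the block size depends on the shell.
        echo "file size limit: $limit blocks"
    fi
}

# Limits are inherited by child processes, so a cap set before the daemons
# start applies to them as well.  The subshell keeps this demo from
# changing the caller's own limit.
( ulimit -f 2048; show_fsize_limit )

# In the init script the fix would go before the daemons are launched
# (assumed placement, the exact spot is not specified here):
#   ulimit -f unlimited
```

Running show_fsize_limit in the environment that launches the daemons should
show whether a cap is in effect there.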

Jon

On Thu, 2007-12-20 at 17:33 -0500, Clark, Patti wrote:
> Apologies and please ignore previous message.
> 
> Networker v7.3.3; both server and client run Red Hat Enterprise Linux v4
> and have been working (not flawlessly, but well).
> 
> I have a saveset that has started having problems completing normally.
> The following lines are from the notification output:
> 
> * lxdwtprod-gb.osti.gov:/var 1 retry attempted
> * lxdwtprod-gb.osti.gov:/var lost connection to server, exiting
> 
> The daemon.log output with more detail is as follows:
> 
> 12/19/07 20:30:06 nsrd: lxdwtprod-gb.osti.gov:/var saving to pool
> 'Production FS' (PFS.308)
> 12/19/07 21:08:04 nsrd: index notice: checking index for
> 'lxdwtprod-gb.osti.gov'
> 12/19/07 21:08:04 nsrd: index notice: /nsr/index/lxdwtprod-gb.osti.gov
> contains 1088650 records occupying 173 MB
> 12/19/07 21:10:09 nsrd: lxdwtprod-gb.osti.gov:/var done saving to pool
> 'Production FS' (PFS.308) 16 GB
> 12/19/07 21:10:09 savegrp: command ' save -s lxclyde-gb.osti.gov -g
> "Production Linux FS" -LL -f - -m lxdwtprod-gb.osti.gov -t 1198027808 -l
> incr -q -W 78 -N /var /var' for client lxdwtprod-gb.osti.gov exited with
> return code 1
> 12/19/07 21:10:09 savegrp: lxdwtprod-gb.osti.gov:/var failed.
> 12/19/07 21:10:09 savegrp: lxdwtprod-gb.osti.gov:/var will retry 1 more
> time(s)
> 12/19/07 21:10:09 nsrd: lxdwtprod-gb.osti.gov:/var saving to pool
> 'Production FS' (PFS.308)
> 12/19/07 21:50:15 savegrp: command ' save -s lxclyde-gb.osti.gov -g
> "Production Linux FS" -LL -f - -m lxdwtprod-gb.osti.gov -t 1198027808 -l
> incr -q -W 78 -N /var /var' for client lxdwtprod-gb.osti.gov exited with
> return code 1
> 12/19/07 21:50:15 nsrd: lxdwtprod-gb.osti.gov:/var done saving to pool
> 'Production FS' (PFS.308) 16 GB
> 12/19/07 21:50:15 savegrp: lxdwtprod-gb.osti.gov:/var failed.
> 12/19/07 21:50:15 nsrd: lxclyde.osti.gov:index:lxdwtprod-gb.osti.gov
> saving to pool 'Production FS' (PFS.308)
> 12/19/07 21:50:18 nsrd: lxclyde.osti.gov:index:lxdwtprod-gb.osti.gov
> done saving to pool 'Production FS' (PFS.308) 201 KB
> 
> This saveset started failing over the weekend on incrementals and level
> 5s.  I first tried a manual incremental backup of only that saveset
> and had the same result.  Approx. 4 GB would back up and then it would
> sit for 1-2 hours before aborting.  I then changed it to do a full
> backup, which ran to completion, and that evening the normally
> scheduled incremental was successful.  Then again last night, the
> scheduled incremental failed.  Is this familiar to anyone?  Ideas on
> what to tackle?
> 
> 
> Patti Clark
> Unix System Administrator - RHCT, GSEC
> Office of Scientific and Technical Information
> 
> 
> 
> To sign off this list, send email to listserv AT listserv.temple DOT edu and 
> type "signoff networker" in the body of the email. Please write to 
> networker-request AT listserv.temple DOT edu if you have any problems with 
> this list. You can access the archives at 
> http://listserv.temple.edu/archives/networker.html or
> via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER
> 
