Re: [Networker] Networker 7.2.2 clients with large files problem

2009-06-17 13:01:34
Subject: Re: [Networker] Networker 7.2.2 clients with large files problem
From: Greg Etling <getling AT STERN.NYU DOT EDU>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Wed, 17 Jun 2009 12:55:42 -0400
Sorry about the lag with returning to this issue - other pressing projects...

I ran the group with the Inactivity timeout set to the maximum, which had no effect; the next step is turning it off completely. I am also working with our networking team to triple-check everything on that end. I'm also wondering about server system limitations: the server is running RHEL 3 with a 2.4.21-37.0.1.ELwithufs SMP kernel, two dual-core 2.4 GHz Xeon processors, and 4 GB RAM.

Clients exhibiting this issue are Windows (7.3.2.Build.364/7.2.Build.494), Linux (7.2.Build.494) and Solaris (7.2.2.Build.422) clients. The Solaris client seems to be the most egregious: I can't even get that one to back up manually from the server.

Any new hints or suggestions would be greatly appreciated.
Greg
---
Greg Etling <getling AT stern.nyu DOT edu>
Systems Administrator
Stern IT Enterprise Operations
NYU Stern School of Business

212-998-0746
------------------------------

Date:    Wed, 10 Jun 2009 17:05:28 -0400
From:    Greg Etling <getling AT STERN.NYU DOT EDU>
Subject: Re: Networker 7.2.2 clients with large files problem

Update: I ran the group against the closest server in question from the command line (savegrp -v -c ie-1 -l full GROUP), and received:

* ie-1:/yesterday lost connection to server, exiting

This is despite the fact that I'm seeing the keepalive notices in nwadmin every 5 minutes as expected.
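
When collecting verbose savegrp output across several clients, a small filter can pull out just the save sets that report a dropped connection. The following is a minimal sketch, not part of NetWorker: the function name and the regular expression are assumptions based only on the single failure line quoted above, and save set names containing spaces would need a looser pattern.

```python
import re

# Assumed line shape, taken from the one sample above:
#   "* <client>:<saveset> <message>"
FAILURE_LINE = re.compile(
    r"^\*\s+(?P<client>[^:\s]+):(?P<saveset>\S+)\s+(?P<message>.+)$"
)


def find_lost_connections(savegrp_output):
    """Return (client, saveset) pairs whose line reports a lost connection."""
    hits = []
    for line in savegrp_output.splitlines():
        match = FAILURE_LINE.match(line.strip())
        if match and "lost connection to server" in match.group("message"):
            hits.append((match.group("client"), match.group("saveset")))
    return hits


# The sample is the line quoted in the message above.
sample = "* ie-1:/yesterday lost connection to server, exiting\n"
print(find_lost_connections(sample))
```

Feeding it the saved output of each savegrp run makes it easier to see whether the same client and save set drop every time or whether the failures move around.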

Greg
---
Greg Etling <getling AT stern.nyu DOT edu>
Systems Administrator
Stern IT Enterprise Operations
NYU Stern School of Business

212-998-0746
------------------------------

Date:    Wed, 10 Jun 2009 12:21:38 -0400
From:    Greg Etling <getling AT STERN.NYU DOT EDU>
Subject: Re: Networker 7.2.2 clients with large files problem

James,

These are not NDMP. Some go through a firewall and some do not; the split exists because some are backing up over a VLAN from a datacenter ~1 mile away. However, the biggest issues have been with the server that is adjacent to the backup server.

To expand on NSR_KEEPALIVE_WAIT: it did appear to have an effect when I invoked savegrp manually (I see the "group running on client" messages in nwadmin), but not when the group runs automatically. The setting was introduced because of the large files problem, which predated me and predated the firewall hardware. It probably even predated the datacenter move that separated our backup hardware from some of our systems.

David,

Thanks for the tip - I had bumped the inactivity timeout to 90, but I'll crank it all the way out. The savesets are listed as aborted, and often have significantly divergent sizes (>5 GB) among the retries, even though the server mountpoint is static throughout the day.

Greg

---
Greg Etling <getling AT stern.nyu DOT edu>
Systems Administrator
Stern IT Enterprise Operations
NYU Stern School of Business

212-998-0746

------------------------------

Date:    Wed, 10 Jun 2009 11:01:21 -0500
From:    James T Proctor <jproctor AT USGS DOT GOV>
Subject: Re: Networker 7.2.2 clients with large files problem

Are these ndmp? Are they going through a firewall?

Jim Proctor
IT Specialist
USGS/NGTOC III
Rolla, Missouri
jproctor AT usgs DOT gov
(573)308-3521

------------------------------

Date:    Wed, 10 Jun 2009 09:00:57 -0700
From:    "Leiss, Jeffrey" <Jeff.Leiss AT NWDC DOT NET>
Subject: Re: Networker 7.2.2 clients with large files problem

Alternatively, you can set this to 0, and networker will ignore it altogether 
for those clients.
------------------------------

Date:    Wed, 10 Jun 2009 10:58:51 -0500
From:    "Browning, David" <DBrown AT LSUHSC DOT EDU>
Subject: Re: Networker 7.2.2 clients with large files problem

Check the "inactivity timeout" field in the group definitions.
For our large servers, we put this to 1000, and that seems to work for
us.
Also, check your indexes to make sure that it isn't really completing.
Sometimes you can get that message, but you will find that the backup
successfully finished, and is stored in the index.
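
One way to check whether a save set actually completed is to list its instances from the media database with mminfo and compare sizes and flags across retries. The sketch below only builds and runs that query; the helper names are hypothetical, the client and save set names come from this thread, and it looks at the media database rather than the client file index itself (nsrinfo is the tool for the latter). As I understand the mminfo documentation, an "a" in the sumflags column marks an aborted save set, but verify that against the man page for your release.

```python
import subprocess


def build_mminfo_command(client, saveset):
    """Build an mminfo argv listing every instance of one save set by time."""
    query = "client={},name={}".format(client, saveset)
    # Report columns chosen for comparing retries; adjust as needed.
    report = "client,name,savetime,level,totalsize,sumflags"
    return ["mminfo", "-avot", "-q", query, "-r", report]


def run_mminfo(client, saveset):
    """Run the query on the NetWorker server and return mminfo's raw output."""
    result = subprocess.run(
        build_mminfo_command(client, saveset),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Show the command that would be run for the save set discussed in this thread.
print(" ".join(build_mminfo_command("ie-1", "/yesterday")))
```

run_mminfo has to be called on a host where mminfo is installed and can reach the server; build_mminfo_command can be inspected anywhere.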
David M. Browning Jr.
IT Project Coordinator Enterprise Backups and Help Desk
------------------------------

Date:    Wed, 10 Jun 2009 11:45:37 -0400
From:    Greg Etling <getling AT STERN.NYU DOT EDU>
Subject: Networker 7.2.2 clients with large files problem

Hello,

I am running into a problem with scheduled backups on a Networker 7.2.2 server. With NSR_KEEPALIVE_WAIT=300 set on the clients, I am able to run full backups of the systems in question manually, but the scheduled backups still fail. The common thread for these systems is the existence of large files (> 1 GB), and they are failing their scheduled backups the same way they did before the keepalive was added:

--- Unsuccessful Save Sets ---

* ie-1:/yesterday 1 retry attempted


I'll run a verbose save tonight to get more details, but I'm curious what other next steps might be recommended given that it is only a problem for scheduled backups.

And yes, I know I need to upgrade - working on that as well. Thanks.

Greg
---
Greg Etling <getling AT stern.nyu DOT edu>
Systems Administrator
Stern IT Enterprise Operations
NYU Stern School of Business

212-998-0746

To sign off this list, send email to listserv AT listserv.temple DOT edu and type 
"signoff networker" in the body of the email. Please write to networker-request 
AT listserv.temple DOT edu if you have any problems with this list. You can access the 
archives at http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER
