Sorry about the lag in returning to this issue - other pressing projects...
I attempted to run the group with the inactivity timeout set to the max, which
had no effect; the next step is turning it off completely. I am also working
with our networking team to triple-check everything on that end. I'm also
wondering about server system limitations... this is running on RHEL 3 with a
2.4.21-37.0.1.ELwithufs SMP kernel, two dual-core Xeon 2.4 GHz processors, and
4 GB RAM.
Clients exhibiting this issue are Windows (7.3.2.Build.364/7.2.Build.494),
Linux (7.2.Build.494) and Solaris (7.2.2.Build.422) clients (the Solaris client
seems to be the most egregious - can't even get that one to back up manually
from the server).
Anyone with any new hints or suggestions would be greatly appreciated.
Greg
---
Greg Etling <getling AT stern.nyu DOT edu>
Systems Administrator
Stern IT Enterprise Operations
NYU Stern School of Business
212-998-0746
------------------------------
Date: Wed, 10 Jun 2009 17:05:28 -0400
From: Greg Etling <getling AT STERN.NYU DOT EDU>
Subject: Re: Networker 7.2.2 clients with large files problem
Update: I ran a backup of the closest server in question from the command
line (savegrp -v -c ie-1 -l full GROUP), and received:
* ie-1:/yesterday lost connection to server, exiting
This is despite the fact that I'm seeing the keepalive notices in
nwadmin every 5 minutes as expected.
Greg
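
When comparing manual and scheduled runs, it can help to pull the failure lines out of the savegrp output automatically. The following is a minimal sketch, assuming the report lines always look like the two excerpts quoted in this thread ("* client:saveset message"); the function name and the regular expression are illustrative and not part of Networker.

```python
import re

# Assumed line shape, taken from the excerpts in this thread:
#   * ie-1:/yesterday lost connection to server, exiting
#   * ie-1:/yesterday 1 retry attempted
FAILURE_LINE = re.compile(
    r"^\*\s+(?P<client>[^:\s]+):(?P<saveset>\S+)\s+(?P<message>.+)$"
)

def find_failures(output):
    """Return (client, saveset, message) for each '* client:saveset ...' line."""
    failures = []
    for line in output.splitlines():
        match = FAILURE_LINE.match(line.strip())
        if match:
            failures.append(
                (
                    match.group("client"),
                    match.group("saveset"),
                    match.group("message").strip(),
                )
            )
    return failures

if __name__ == "__main__":
    sample = "* ie-1:/yesterday lost connection to server, exiting\n"
    for client, saveset, message in find_failures(sample):
        print(f"{client} {saveset}: {message}")
```

Feeding it the saved output of both a manual and a scheduled run makes it easier to see whether the two fail with the same message.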
------------------------------
Date: Wed, 10 Jun 2009 12:21:38 -0400
From: Greg Etling <getling AT STERN.NYU DOT EDU>
Subject: Re: Networker 7.2.2 clients with large files problem
James,
These are not NDMP. Some go through a firewall and some do not; the ones
that do are backing up over a VLAN from a datacenter ~1 mile away.
However, the biggest issues have been with the server that is adjacent to
the backup server.
To expand on NSR_KEEPALIVE_WAIT: it does appear to have an effect when I
invoke savegrp manually (I see the "group running on client" messages in
nwadmin), but not when the group runs automatically. It was introduced
because of the large files problem, which predated me and predated the
firewall hardware. It probably even predated the datacenter move that
separated our backup hardware from some of our systems.
David,
Thanks for the tip - I had bumped the inactivity timeout to 90, but I'll
crank it all the way out. The savesets are listed as aborted, and often
have significantly divergent sizes (>5 GB) among the retries, even
though the server mountpoint is static throughout the day.
Greg
------------------------------
Date: Wed, 10 Jun 2009 11:01:21 -0500
From: James T Proctor <jproctor AT USGS DOT GOV>
Subject: Re: Networker 7.2.2 clients with large files problem
Are these NDMP? Are they going through a firewall?
Jim Proctor
IT Specialist
USGS/NGTOC III
Rolla, Missouri
jproctor AT usgs DOT gov
(573)308-3521
------------------------------
Date: Wed, 10 Jun 2009 09:00:57 -0700
From: "Leiss, Jeffrey" <Jeff.Leiss AT NWDC DOT NET>
Subject: Re: Networker 7.2.2 clients with large files problem
Alternatively, you can set the inactivity timeout to 0, and Networker will
ignore it altogether for those clients.
------------------------------
Date: Wed, 10 Jun 2009 10:58:51 -0500
From: "Browning, David" <DBrown AT LSUHSC DOT EDU>
Subject: Re: Networker 7.2.2 clients with large files problem
Check the "inactivity timeout" field in the group definitions.
For our large servers, we put this to 1000, and that seems to work for
us.
Also, check your indexes to make sure that it isn't really completing.
Sometimes you can get that message, but you will find that the backup
successfully finished, and is stored in the index.
David M. Browning Jr.
IT Project Coordinator Enterprise Backups and Help Desk
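
One way to make the index check above repeatable is to query the media database with mminfo. This is only a sketch: it assumes the mminfo binary is on the PATH of the machine where it runs, and the helper names are hypothetical. The -s, -q and -r options and the savetime, level, totalsize and sumflags report attributes are standard mminfo usage, but the exact flag letters shown for aborted save sets should be confirmed against the mminfo man page for your release.

```python
import subprocess

def mminfo_command(client, saveset, server=None):
    """Build an mminfo argument list that reports save sets for one client path."""
    cmd = ["mminfo"]
    if server:
        cmd += ["-s", server]  # query a specific Networker server
    cmd += [
        "-q", f"client={client},name={saveset}",     # which save sets to select
        "-r", "savetime,level,totalsize,sumflags",   # which columns to print
    ]
    return cmd

def list_savesets(client, saveset, server=None):
    """Run mminfo and return its report text (needs Networker installed)."""
    result = subprocess.run(
        mminfo_command(client, saveset, server),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

For the save set in this thread, list_savesets("ie-1", "/yesterday") would show whether any of the retries is recorded as complete, and how far the sizes of the attempts diverge.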
------------------------------
Date: Wed, 10 Jun 2009 11:45:37 -0400
From: Greg Etling <getling AT STERN.NYU DOT EDU>
Subject: Networker 7.2.2 clients with large files problem
Hello,
I am running into a problem with scheduled backups on a Networker 7.2.2
server. By setting NSR_KEEPALIVE_WAIT=300 on the clients, I am able to
run full backups of the systems in question manually, but the backup fails
when it runs on schedule. The common thread for these systems is the
existence of large files (> 1GB), and they are failing their scheduled
backups the same way as they used to before the keepalive was added:
--- Unsuccessful Save Sets ---
* ie-1:/yesterday 1 retry attempted
I'll run a verbose save tonight to get more details, but I'm curious
what other next steps might be recommended given that it is only a
problem for scheduled backups.
And yes, I know I need to upgrade - working on that as well. Thanks.
Greg
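
A possible check, offered as an assumption rather than a diagnosis: a variable exported in an interactive shell is not necessarily present in the environment of the daemon that launches scheduled saves (on Networker clients that daemon is nsrexecd). On a Linux client, the environment a process was started with can be read from /proc. The sketch below uses hypothetical helper names, takes the process ID from the caller (for example from ps), and needs permission to read that process's /proc entry.

```python
from pathlib import Path

def parse_environ(raw):
    """Decode the NUL-separated KEY=VALUE bytes found in /proc/<pid>/environ."""
    env = {}
    for entry in raw.split(b"\0"):
        if b"=" in entry:
            key, _, value = entry.partition(b"=")
            env[key.decode(errors="replace")] = value.decode(errors="replace")
    return env

def keepalive_for_pid(pid):
    """Return NSR_KEEPALIVE_WAIT as seen by a running process, or None if unset."""
    # Linux only: this file holds the environment the process was started with.
    raw = Path(f"/proc/{pid}/environ").read_bytes()
    return parse_environ(raw).get("NSR_KEEPALIVE_WAIT")
```

If keepalive_for_pid returns None for the daemon's PID while the interactive shell has the variable set to 300, that would be consistent with the difference between manual and scheduled runs.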
To sign off this list, send email to listserv AT listserv.temple DOT edu and type
"signoff networker" in the body of the email. Please write to networker-request
AT listserv.temple DOT edu if you have any problems with this list. You can access the
archives at http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER