[Networker] RPC error: Remote system error

2004-10-19 03:42:00
Subject: [Networker] RPC error: Remote system error
From: Oscar Olsson <spam1 AT QBRANCH DOT SE>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Tue, 19 Oct 2004 09:41:02 +0200
I get the following error after a few nightly runs with some clients that
are behind a FW-1 NG firewall:

* client.domain.prod:All 1 retry attempted
* client.domain.prod:All nsrexec: authtype nsrexec
* client.domain.prod:All savefs: RPC error: Remote system error
* client.domain.prod:All savefs: Cannot access nsr server `britt.qbranch.se'
  savefs client.domain.prod: failed.

This is from the group output. These clients always reach a point where
they start failing after a few days or weeks. Then they keep failing UNTIL
the NetWorker server software on the backup server is restarted. netstat
does not show any active connections for the hosts in question, so it's not
due to stale TCP connections.
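
For anyone who wants to repeat the netstat check, here is a minimal Python sketch of how it could be scripted. It assumes you have saved the output of `netstat -an` from the backup server as text, that the addresses are IPv4, and that endpoints appear either as `address:port` (Windows/Linux style) or `address.port` (Solaris style). The function name `connections_for_host` and the sample addresses are made up for illustration.

```python
def connections_for_host(netstat_output, address):
    """Return the netstat lines that have `address` as one of their endpoints.

    IPv4 only. Handles both 'a.b.c.d:port' and the Solaris 'a.b.c.d.port' form.
    """
    matches = []
    for line in netstat_output.splitlines():
        for token in line.split():
            # Normalise 'addr:port' to 'addr.port', then split off the port.
            host, sep, port = token.replace(":", ".").rpartition(".")
            if sep and port.isdigit() and host == address:
                matches.append(line.strip())
                break  # one hit per line is enough
    return matches


# Hypothetical usage: the file name and address are placeholders.
# with open("netstat.txt") as fh:
#     for hit in connections_for_host(fh.read(), "192.0.2.55"):
#         print(hit)
```

An empty result for a failing client would be consistent with the observation above that no connections, stale or otherwise, are left on the server.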

According to the Legato knowledge base, this might be the cause:

Solution Title: Error: 'savefs: RPC error: Remote system error'
Solution ID: legato12891

Here is the solution:
Schedule the backups for these systems during a quieter time.
Heavy network activity may cause the NetWorker backup server not to have
enough time to respond to the requests from the clients.


Here is the problem or goal:
Error: 'savefs: RPC error: Remote system error'

Receiving massive amounts of RPC errors

Cannot open nwadmin while they are getting the RPC errors

Errors appear mainly at night

Many other backups are running at the time when the RPC errors happen

Error: 'nsrexec: authtype nsrexec'

Error: 'savefs: Cannot access nsr server (server_name)'


Problem Environment:
Windows 2000 sp2

NetWorker for Windows/NT

Very heavy network

Hosts entries have been added to both machines


Causes of this problem:
Heavy load on the network


Obviously, I don't buy this explanation, since it's these clients, and
these clients only, that keep failing after a while. Restarting the
NetWorker server also solves the problem, which further indicates that
this is not network related. Since there are no active TCP sessions on the
backup server, this probably doesn't have anything to do with the
firewall. In general, a busy network shouldn't cause problems with
software unless the software is flawed, or you have heavy packet loss or
extreme amounts of jitter.

One scenario that MIGHT be the cause, as far as I can tell, is if Legato
somehow starts to use ports outside the range specified in the NetWorker
administrator's guide after a while, until the server is restarted. But I
haven't checked that yet. For security reasons, the firewall is
managed by another company. Has anyone seen similar problems?
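
One possible way to test the port theory from a client behind the firewall is a plain TCP connect probe against the backup server. The sketch below is only that, a sketch: the helper names `probe_ports` and `blocked_ports` are invented here, and the port range in the usage comment is a placeholder that should be replaced with whatever range the firewall rule and the server configuration actually specify. A successful connect only shows that the port is reachable and something is listening; it says nothing about whether NetWorker itself is healthy.

```python
import socket


def probe_ports(host, ports, timeout=2.0):
    """Try a TCP connect to each port; return {port: True if it connected}."""
    results = {}
    for port in ports:
        try:
            # create_connection raises OSError on refusal, timeout or no route.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            results[port] = False
    return results


def blocked_ports(results):
    """Return the sorted list of ports that could not be reached."""
    return sorted(port for port, ok in results.items() if not ok)


# Hypothetical usage from a failing client (host and range are placeholders):
# results = probe_ports("britt.qbranch.se", range(7937, 7941))
# print("unreachable:", blocked_ports(results))
```

Running the probe once while backups work and again after they start failing would show whether the set of reachable ports changes over time, which is what the theory predicts.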

Oh, the clients are all Windows 2003 servers running client 7.1.1, and the
backup server is running 7.1.2 on Solaris. There is no name resolution
issue either, since the hosts file is used on both ends.

Comments? Ideas? Solution suggestions? :)

//Oscar
