Networker

Re: [Networker] Backup of a storage node fails

2009-04-06 01:35:02
From: Preston de Guise <enterprise.backup AT GMAIL DOT COM>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Mon, 6 Apr 2009 15:30:05 +1000
On 06/04/2009, at 15:11 , kurianv wrote:

Hi,
We have NetWorker 7.2 running on Windows. We have a storage node located some distance from our site, with 3 servers there to back up. The network connection between our site and the storage node site is usually very slow and sometimes even breaks. The problem we notice is this: a backup is initiated from the backup server at our site and runs normally on the storage node, but when the network connection breaks or becomes slow, the backup fails and the tape in the storage node is marked full. I don't know why this happens so often. I know there is no problem with the tape library.

I don't understand the logic behind this: once the backup has been initiated from our backup server, why does the backup taking place on the storage node fail? Any ideas?

The errors I get are listed below:

1) Lost network connection to host


2) RPC Network connection not available

and as a result, media verification fails....

Your suggestions would be valuable...

First, I'd suggest it's time you upgrade away from 7.2. Yes, the 7.2.x tree was a good tree, but it's too old now. It's been unsupported by EMC since mid-2008, meaning you're trusting your data protection to a now untrustworthy configuration. Do you really want to do that?

Regardless of whether backups are running to a local storage node, you always have meta-data running between the clients and the server - control communications plus index-related data. Additionally, the server maintains a heartbeat (albeit a slow one) with the storage node nsrmmd processes.

You could start by tweaking the following settings on the backup server resource itself:

nsrmmd polling interval – number of minutes between checks done by the server to make sure nsrmmd is still running

nsrmmd restart interval – number of minutes NetWorker waits between restart attempts of a failed nsrmmd

nsrmmd control timeout – number of minutes NetWorker waits for storage node requests/updates

If you have a slow link, double these as a starting point (from 3, 2 and 5 respectively to 6, 4, and 10).
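
As a rough illustration of the doubling suggested above, here is a minimal Python sketch that scales the three values quoted in this message. The names `DEFAULTS` and `scale_intervals` are hypothetical and exist only for this example; the script does not talk to NetWorker, it only does the arithmetic, and the results still have to be entered on the server resource by hand.

```python
# Values (in minutes) as quoted in the message above; treat them as
# the starting point described there, not as verified product defaults.
DEFAULTS = {
    "nsrmmd polling interval": 3,
    "nsrmmd restart interval": 2,
    "nsrmmd control timeout": 5,
}


def scale_intervals(intervals, factor=2):
    """Return a new dict with every interval multiplied by `factor`."""
    return {name: minutes * factor for name, minutes in intervals.items()}


if __name__ == "__main__":
    # Print the suggested starting values for a slow link.
    for name, minutes in scale_intervals(DEFAULTS).items():
        print(f"{name}: {minutes}")
```

If doubling is not enough on a particularly slow link, the same helper can be called with a larger `factor`, though how far these values can sensibly be pushed is something to test in your own environment.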

However, if you have a link that outright fails, there's only so much that can be done – if it fails while NetWorker needs it to not fail, it's going to cause you a problem no matter what. I'd suggest though that by increasing those nsrmmd intervals, you may be able to at least somewhat reduce the frequency of media being marked as full. If the backup is becoming slow, then unless you're using the NetWare v4.x client, it's likely to be because the link is slow enough that even index (meta-data) comms are being affected. If that's the case, you'll need to either increase link performance and stability or look at re-architecting the configuration.

Cheers,

Preston.


--
Preston de Guise


"Enterprise Systems Backup and Recovery: A Corporate Insurance Policy":

http://www.amazon.com/Enterprise-Systems-Backup-Recovery-Corporate/dp/1420076396

http://www.enterprisesystemsbackup.com

NetWorker blog: http://nsrd.wordpress.com

