Networker

Re: [Networker] Backup speed drop with 7.3.3 (nsmmd problem?)

2007-09-02 12:47:34
Subject: Re: [Networker] Backup speed drop with 7.3.3 (nsmmd problem?)
From: Yaron Zabary <yaron AT ARISTO.TAU.AC DOT IL>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Sun, 2 Sep 2007 19:44:17 +0300
My bet is that the file system of the adv_file (ext2/ext3) is having a problem allocating new blocks to the saveset. What do you see with vmstat? Try


Brian O'Neill wrote:
FYI, I backed out to 7.3.2 Jumbo (Build 11) (I don't have the later build available, and it's no longer available for download).

It seemed to resolve the problem of backing up this host, but when another group ran that was backing up several local/NFS filesystems, performance dropped again, generally to around 350KB/s, with nsrmmd using 99% CPU again. These filesystems backed up fine previously under 7.3.3.

Is there an adv_file limitation I'm not aware of that affects performance? Should I reduce the max targets?

I have about 10 Netapp filesystems that I will be backing up via NFS, and the intent is for that to happen on the Networker server. Should I be backing them up via other hosts instead?

No. You should invest some money in NDMP licenses and back up over the network. It works much better than NFS. You will need some processing power on the server to run this, but other than that, it is a better solution.


Brian O'Neill wrote:
[Argh...I spent time gathering info and typing this message, and I think I see the issue...more info at the bottom]

This is a new setup, and I'm at a loss as to what's happening - wondering if someone else has seen something similar.

Server is a Dell 1850 running Red Hat ES4 (32-bit) and Networker 7.3.3 with a PERC-connected RAID 5 array for adv_file storage.

Backup client is an older Dell 1750 running Red Hat ES4 (32-bit) with an Adaptec ASR-2130S-connected RAID 5 set with a filesystem containing 400GB of Oracle dump files. Only the dumps are being backed up. Client is 7.3.3.

Connection is via a private backup network, with a single gbit switch in between. Both hosts have their adapters at 1000/full, and are not showing any errors (I don't have insight into the switch yet - it's managed by the people I'm setting up the backup for, and it's the long weekend).

Things start off at a steady rate of 35MB/s, with perhaps peaks in the 40MB/s range. Reasonable enough.

But at some point - roughly an hour or about 140-150GB into the backup - speeds suddenly drop to a VERY steady 146 +- 2KB/s. And with 250GB to go, that is VERY unacceptable.
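
For scale, the figures above can be turned into a rough time estimate for the remaining 250GB. This is only a sketch: it assumes binary units (1GB = 1024MB, 1MB = 1024KB), which the post does not specify, and the function name is made up for illustration.

```python
# Rough transfer-time estimate from the rates quoted in the post.
# Assumption: 1 GB = 1024 MB and 1 MB = 1024 KB (the post does not say).

def hours_to_transfer(size_gb: float, rate_kb_per_s: float) -> float:
    """Hours needed to move size_gb at a constant rate_kb_per_s."""
    size_kb = size_gb * 1024 * 1024
    return size_kb / rate_kb_per_s / 3600

REMAINING_GB = 250

fast = hours_to_transfer(REMAINING_GB, 35 * 1024)  # the initial 35MB/s
slow = hours_to_transfer(REMAINING_GB, 146)        # the degraded 146KB/s

print(f"at 35 MB/s:  {fast:.1f} hours")
print(f"at 146 KB/s: {slow / 24:.1f} days")
```

Under those assumptions the remaining data would take about two hours at the original rate and roughly three weeks at the degraded one.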

Any thoughts as to what might be the cause of this degradation?

New: Now I'm noticing that on the backup server one of the nsrmmd processes is eating up CPU...is this the problem on 7.3.3 that I've been hearing about? Is the only solution to downgrade to 7.3.2?
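
One way to confirm which nsrmmd process is busy is to filter `ps` output, as in the sketch below. It is not a Networker tool: the function names and the 90% threshold are assumptions chosen for illustration, and it relies on the Linux procps `ps -eo pid,pcpu,comm` output format.

```python
import subprocess

def busy_processes(ps_output: str, name: str = "nsrmmd", threshold: float = 90.0):
    """Return (pid, cpu_percent) pairs for processes called `name`
    whose %CPU is at or above `threshold` (an arbitrary cut-off)."""
    hits = []
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue
        pid, pcpu, comm = fields
        if comm.strip() == name and float(pcpu) >= threshold:
            hits.append((int(pid), float(pcpu)))
    return hits

def snapshot() -> str:
    """Capture one `ps` listing: PID, %CPU and command name per process.
    Note: ps reports CPU time divided by the process's elapsed run time,
    so this is a lifetime average, not an instantaneous reading like top."""
    result = subprocess.run(
        ["ps", "-eo", "pid,pcpu,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling `busy_processes(snapshot())` on the backup server would list any nsrmmd PIDs above the threshold. Because of the averaging noted in the comment, a process that only recently started spinning may read lower here than it does in top.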

-Brian

To sign off this list, send email to listserv AT listserv.temple DOT edu and type "signoff networker" in the body of the email. Please write to networker-request AT listserv.temple DOT edu if you have any problems with this list. You can access the archives at http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER


--

-- Yaron.