Networker

Re: [Networker] Backup speed drop with 7.3.3 (nsmmd problem?)

2007-09-02 14:37:57
Subject: Re: [Networker] Backup speed drop with 7.3.3 (nsmmd problem?)
From: Yaron Zabary <yaron AT ARISTO.TAU.AC DOT IL>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Sun, 2 Sep 2007 21:33:16 +0300
Brian O'Neill wrote:

Yaron Zabary wrote:

My bet is that the file system of the adv_file (ext2/ext3) is having a problem allocating new blocks to the saveset. What do you see with vmstat? Try


I think this got truncated, but I didn't see anything particularly unusual. Filesystem is ext3, and was only about 11% full.

That was a typo. Anyhow, I would expect that your adv_file might have a problem with writing to a large file. How many sessions do you run? Come to think of it, we had a Dell machine which had some issues with PERC and RedHat (it crashed under heavy load). Did you try to strace nsrmmd?
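For reference, a minimal sketch of what "strace the nsrmmd" could look like. It assumes pgrep, strace and timeout are installed and that it runs as root; the function name and its arguments are made up for illustration and are not part of Networker.

```shell
#!/bin/sh
# Summarise the system calls made by each running process with a given name.
# Assumptions (not from the thread): pgrep, strace and timeout are available,
# and the caller is root so strace is allowed to attach.

summarise_syscalls() {
    name=$1        # process name to look for, e.g. nsrmmd
    seconds=$2     # how long to stay attached to each process
    pids=$(pgrep -x "$name") || { echo "no $name process found"; return 1; }
    for pid in $pids; do
        echo "== $name pid $pid =="
        # -c prints a per-syscall count/time table when strace detaches;
        # -f also follows any child processes.
        # timeout sends SIGINT after $seconds so strace detaches cleanly.
        timeout -s INT "$seconds" strace -c -f -p "$pid"
    done
}
```

Calling `summarise_syscalls nsrmmd 30` while the backup is crawling should show which system calls, if any, dominate the time. Running `vmstat 5` in a second terminal at the same time shows whether the box is waiting on I/O or burning CPU.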


I have about 10 Netapp filesystems I will be backing up via NFS, which is intended to happen on the Networker server. Should I be backing them up via other hosts instead?

No. You should invest some money in NDMP licenses and back up over the network. It works much better than NFS. You should have some processing power on the server for running this, but other than that, it is a better solution.

The client decided not to pursue using NDMP due to the added cost, and was not concerned with the performance issue of NFS - but it was not expected to run this poorly. And for reference I'm backing up the nightly.0 snapshots.

It is simply amazing to see people who buy a NetApp for $50K or more and then save a few thousand dollars on the NDMP license.


FYI, as I mentioned in my last e-mail, I backed out to 7.3.2 Build 11. The Oracle FS had no problem, but when another group kicked in, the nsrmmd hit 99% CPU again. Eventually I decided to kill the nsrmmd and let Networker restart it. It did so, but took a little while before it would use the adv_file volume again. Once it did, it kicked back to full speed again. But after a while, it slowed back into the sub-1KB/s range again, although nsrmmd was not using CPU. When the Oracle dump backups kicked in again, it went full speed and completed that pretty quickly, then slowed down again, with only two NFS filesystems being backed up.

At times it would get a speed burst, but it wouldn't last long. It did finally finish though.

My THEORY is that the slow speed is caused by the ext3 code trying to allocate more blocks to the saveset's file. Try to run four concurrent 'dd if=/dev/zero of=/advfs/somefile bs=1M count=200000' (each writing to a different file) and see if you can still see the performance decrease.
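A rough sketch of the concurrent dd test, so each writer gets its own file and the whole batch is timed. It assumes GNU dd and enough free space on the target for writers times count_mb megabytes; the function name and the ddtest.N file names are made up for illustration.

```shell
#!/bin/sh
# Run several dd writers at once against one directory and time the batch.
# Assumptions (not from the thread): GNU dd, and a target directory with
# enough free space for writers * count_mb MiB.

concurrent_write_test() {
    dir=$1        # directory on the file system under test, e.g. /advfs
    writers=$2    # number of parallel dd processes
    count_mb=$3   # size of each file in MiB
    start=$(date +%s)
    i=1
    while [ "$i" -le "$writers" ]; do
        # conv=fsync makes dd flush to disk before exiting, so the timing
        # covers real writes rather than just the page cache.
        dd if=/dev/zero of="$dir/ddtest.$i" bs=1M count="$count_mb" conv=fsync 2>/dev/null &
        i=$((i + 1))
    done
    wait    # block until every background dd has finished
    end=$(date +%s)
    echo "$writers writers x ${count_mb} MiB took $((end - start)) s"
}
```

Something like `concurrent_write_test /advfs 4 200000` matches the sizes suggested above (roughly 200 GB per writer), so a smaller count may be more practical for a first run. Remove the ddtest.* files afterwards.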


-Brian

To sign off this list, send email to listserv AT listserv.temple DOT edu and type "signoff networker" in the body of the email. Please write to networker-request AT listserv.temple DOT edu if you have any problems with this list. You can access the archives at http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER


--

-- Yaron.
