Re: [Networker] Backup speed drop with 7.3.3 (nsmmd problem?)
2007-09-02 14:37:57
Brian O'Neill wrote:
Yaron Zabary wrote:
My bet is that the file system of the adv_file (ext2/ext3) is having
a problem with allocating new blocks to the saveset. What do you see
with vmstat ? Try
I think this got truncated, but I didn't see anything particularly
unusual. Filesystem is ext3, and was only about 11% full.
That was a typo. Anyhow, I would expect that your adv_file might have
a problem with writing to a large file. How many sessions do you run ?
Come to think of it, we had a Dell machine which had some issues with
PERC and RedHat (it crashed under heavy load). Did you try to strace the
nsrmmd ?
I have about 10 Netapp filesystems I will be backing up via NFS,
which is intended to happen on the Networker server. Should I be
backing them up via other hosts instead?
No. You should invest some money in NDMP licenses and back up over the
network. It works much better than NFS. You should have some
processing power on the server for running this, but other than that,
it is a better solution.
The client decided not to pursue using NDMP due to the added cost, and
was not concerned with the performance issue of NFS - but it was not
expected to run this poorly. And for reference I'm backing up the
nightly.0 snapshots.
It is simply amazing to see people who buy a NetApp for $50k or more
and then save a few $k on the NDMP license.
FYI, as I mentioned in my last e-mail, I backed out to 7.3.2 Build 11.
The Oracle FS had no problem, but when another group kicked in, the
nsrmmd hit 99% CPU again. Eventually I decided to kill the nsrmmd and
let Networker restart it. It did so, but took a little while before it
would use the adv_file volume again. Once it did, it kicked back to full
speed again. But after a while, it slowed back into the sub-1KB/s range
again, although nsrmmd was not using CPU. When the Oracle dump backups
kicked in again, it went full speed and completed that pretty quickly,
then slowed down again, with only two NFS filesystems being backed up.
At times it would get a speed burst, but it wouldn't last long. It did
finally finish though.
My THEORY is that the slow speed is caused by the ext3 code trying
to allocate more blocks to the saveset's file. Try running four
concurrent 'dd if=/dev/zero of=/advfs/somefile bs=1M count=200000' and
see if you can still see the performance decrease.
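A minimal sketch of that test as a shell function, in case it helps. The function name run_writers, the ddtest.N file names and the three arguments are made up for illustration and are not anything Networker provides. It assumes each writer should get its own file, and it gives the block size in bytes so it does not depend on which size suffixes the local dd accepts.

```shell
# Sketch: start several dd writers in parallel against one directory and
# report how long the whole batch takes.
# Hypothetical helper; arguments: target directory, MiB per file, number of writers.
run_writers() {
    dir=$1
    mib=$2
    writers=$3
    start=$(date +%s)
    i=1
    while [ "$i" -le "$writers" ]; do
        # Each writer gets its own file, so all of them allocate new blocks.
        dd if=/dev/zero of="$dir/ddtest.$i" bs=1048576 count="$mib" 2>/dev/null &
        i=$((i + 1))
    done
    wait    # block until every background dd has exited
    end=$(date +%s)
    echo "wrote $writers x $mib MiB in $((end - start)) s"
}
```

Something like 'run_writers /advfs 200000 4' would match the sizes above; that is roughly 200 GB per file, so check free space first and delete the ddtest.* files afterwards. Watching vmstat in another terminal while it runs should show whether the writes stall.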
-Brian
To sign off this list, send email to listserv AT listserv.temple DOT edu and
type "signoff networker" in the body of the email. Please write to
networker-request AT listserv.temple DOT edu if you have any problems with this
list. You can access the archives at
http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER
--
-- Yaron.