Networker

Re: [Networker] Backup speed drop with 7.3.3 (nsmmd problem?)

2007-09-04 13:10:10
From: "Brian O'Neill" <oneill AT OINC DOT NET>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Tue, 4 Sep 2007 12:51:33 -0400
Yaron Zabary wrote:
That was a typo. Anyhow, I would expect that your adv_file might have a problem writing to a large file. How many sessions do you run? Come to think of it, we had a Dell machine which had some issues with PERC and RedHat (it crashed under heavy load). Did you try to strace the nsrmmd?

Well, the nsrmmd is pegged again during the Oracle backups (although this is occurring while I'm also running nsrstage, but that seems to be running fine). Here is an strace excerpt. There doesn't seem to be anything unusual in it, but I'm not sure what to expect right now:

write(9, "77\2\0", 4)                   = 4
_llseek(9, 0, [18376041636], SEEK_CUR)  = 0
write(9, "\301\3\1\0\200\7\0xk\2\32\25\23\t\21\0SeekerSourcesDat"..., 27620) = 27620
time([1188924190])                      = 1188924190
select(1024, [3 5 8], NULL, NULL, {5, 0}) = 1 (in [8], left {5, 0})
_llseek(9, 0, [18376069256], SEEK_CUR)  = 0
write(9, "\0\0\1\0\0\1\0\4\0\0\0\0nodeId=\'12863522\']\1\0"..., 5028) = 5028
time([1188924191])                      = 1188924191
select(1024, [3 5 8], NULL, NULL, {5, 0}) = 1 (in [8], left {5, 0})
read(8, "ReleaseNode\36\0ReleaseNode[nodeId="..., 32768) = 32768
_llseek(9, 0, [18376074284], SEEK_CUR)  = 0
write(9, "Rele", 4)                     = 4
_llseek(9, 0, [18376074288], SEEK_CUR)  = 0
write(9, "aseNode\36\0ReleaseNode[nodeId=\'128"..., 32764) = 32764
time([1188924191])                      = 1188924191
select(1024, [3 5 8], NULL, NULL, {5, 0}) = 1 (in [8], left {5, 0})
read(8, "927a-ae828c72ab77\n\0en_us     \7\0x"..., 32768) = 32768
_llseek(9, 0, [18376107052], SEEK_CUR)  = 0
write(9, "927a", 4)                     = 4
_llseek(9, 0, [18376107056], SEEK_CUR)  = 0
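
One rough way to judge an excerpt like this is to pair the file offsets reported by _llseek with the timestamps returned by time() and see how fast the offset on fd 9 advances. The following is only a sketch: the function name and the regular expressions are assumptions based on the trace format shown above, not anything NetWorker or strace provides, and with time() the resolution is a whole second, so a longer capture (or one taken with strace -tt) would be needed for a meaningful number.

```python
import re

# Matches lines such as: _llseek(9, 0, [18376041636], SEEK_CUR)  = 0
LLSEEK_RE = re.compile(r"_llseek\((\d+), 0, \[(\d+)\], SEEK_CUR\)")
# Matches lines such as: time([1188924190])  = 1188924190
TIME_RE = re.compile(r"\btime\(\[(\d+)\]\)")


def write_rate(lines, fd=9):
    """Estimate bytes/second written to fd from strace output lines.

    Each time() call is paired with the most recent _llseek offset seen
    on fd. Returns None if the trace is too short to give a rate.
    """
    samples = []   # (epoch_second, file_offset) pairs
    offset = None  # last offset reported for fd
    for line in lines:
        m = LLSEEK_RE.search(line)
        if m and int(m.group(1)) == fd:
            offset = int(m.group(2))
            continue
        m = TIME_RE.search(line)
        if m and offset is not None:
            samples.append((int(m.group(1)), offset))
    if len(samples) < 2:
        return None
    (t0, o0), (t1, o1) = samples[0], samples[-1]
    if t1 == t0:
        return None  # all samples fall within the same second
    return (o1 - o0) / (t1 - t0)
```

Feeding it a saved trace, for example write_rate(open("nsrmmd.strace")), where the file name is hypothetical, gives a bytes-per-second figure that can be compared between a fast run and a slow one.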

My THEORY is that the slow speed is caused by the ext3 code trying to allocate more blocks to the saveset's file. Try running four concurrent 'dd if=/dev/zero of=/advfs/somefile bs=1M count=200000' and see if you can still see the performance decrease.
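
If per-writer numbers are wanted, the same kind of test can be sketched in Python. This is an illustration under stated assumptions, not a replacement for dd: the function names, the ddtest_N.bin file names and the small default size are all made up for the example, and each writer gets its own file.

```python
import os
import time
from concurrent.futures import ThreadPoolExecutor


def write_zeros(path, total_bytes, block_size=1024 * 1024):
    """Sequentially write total_bytes of zeros to path; return MB/s."""
    block = bytes(block_size)  # one block of zero bytes, reused for each write
    written = 0
    start = time.monotonic()
    with open(path, "wb") as f:
        while written < total_bytes:
            chunk = block[: min(block_size, total_bytes - written)]
            f.write(chunk)
            written += len(chunk)
        f.flush()
        os.fsync(f.fileno())  # include the time needed to reach the disk
    elapsed = time.monotonic() - start
    return written / (1024 * 1024) / max(elapsed, 1e-9)


def run_writers(directory, writers=4, total_bytes=64 * 1024 * 1024):
    """Run several writers at once; return {path: MB/s} for each file."""
    paths = [os.path.join(directory, f"ddtest_{i}.bin") for i in range(writers)]
    # Threads are sufficient here: file writes release the GIL while blocked.
    with ThreadPoolExecutor(max_workers=writers) as pool:
        rates = list(pool.map(lambda p: write_zeros(p, total_bytes), paths))
    return dict(zip(paths, rates))
```

To approximate the dd command above, something like run_writers("/advfs", writers=4, total_bytes=200000 * 1024 * 1024) would write about 200,000 MB per file, so the filesystem needs room for all four files. The test files are left in place and have to be removed by hand.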


I'll have to wait until everything completes to try that. I had _thought_ that adv_file would still break up the savesets, but that doesn't appear to be the case - I see a 480GB file! Still, when I initially downgraded to 7.3.2 it created that file fine, and it did a 40GB backup in less than 19 minutes yesterday, but today it's only up to 18GB after 1:50 and crawling...
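
To put those two runs side by side, here is the arithmetic as a small sketch. It assumes that "1:50" means 1 hour 50 minutes and that 1GB is 1024MB; the helper name is made up for the example. On those assumptions yesterday's run works out to roughly 36 MB/s and today's to a little under 3 MB/s.

```python
def mb_per_s(gigabytes, minutes):
    """Average throughput in MB/s, taking 1 GB = 1024 MB (an assumption)."""
    return gigabytes * 1024 / (minutes * 60)


fast = mb_per_s(40, 19)        # yesterday: 40GB in just under 19 minutes
slow = mb_per_s(18, 60 + 50)   # today: 18GB after 1:50, read as 1h50m

print(f"yesterday: {fast:.1f} MB/s")
print(f"today:     {slow:.1f} MB/s")
print(f"slowdown:  {fast / slow:.1f}x")
```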

I might have to actually call support *shudder*

-Brian
