Networker

Re: [Networker] Cloning parallelism

2008-01-23 11:38:48
Subject: Re: [Networker] Cloning parallelism
From: Yaron Zabary <yaron AT ARISTO.TAU.AC DOT IL>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Wed, 23 Jan 2008 18:29:05 +0200
Stan Horwitz wrote:
On Jan 23, 2008, at 9:19 AM, Goslin, Paul wrote:

Hear, hear!

I'm glad to see we are not the only ones who have noticed this lack of
multi-processor support. It's like having a Hummer with a V8 but only
3 spark plugs: it looks impressive, but don't expect it to use all the power
that's available to it, or to accomplish as much as it could.

We've asked EMC about this, and the only suggestion they made was "try
increasing the server parallelism", which has made no measurable
difference that we can see. Then they closed the case without our
consent and without answering our initial question: Why doesn't NetWorker
utilize multiple processors?

Perhaps if enough customers who desire that capability ask EMC to imbue NetWorker with multi-processor capability, EMC's product managers might be convinced that a valid business case exists to include that feature.

The lack of multiple processor support is a major issue for me. I run NetWorker 7.4 on Solaris 10. I have several NDMP clients that I back up using the history feature. When the index data is being sorted, it saturates a processor on my 32-processor Sun T2000, which is where everything involving NetWorker runs. As a result, I have a fast server that slows to a crawl at least once a week because of NetWorker's uni-processor limitation.

I would be happy to compile the list of customers who want NetWorker to utilize multiple processors. I sure am one of them. Feel free to write to me privately via stan AT temple DOT edu and/or reply to this message publicly. There are several EMC people subscribed, so if you complain directly to this list, they will see your emails. What I will do is wait until next week and submit an RFE on this issue via PowerLink and I can attach a list of customers who need multi-processor capability if you speak up.

Funny, but NetWorker can and does utilize multiple processors. It runs a separate nsrmmd for each device, so if you have four LTO-3 drives I would expect each of them to be served by its own core on your typical quad core server. NDMP (DSA) has its own process for each save set, so it could use its own core as well. nsrindexd and nsrmmdbd are two more processes which could use their own cores as well.

On our system (a Sparc 280R with two 750 MHz US3 processors, Solaris 10, four LTO-3 drives and three disk devices on a single ZFS LUN from an EMC AX150), I can see the following:

# sar -q

SunOS legato.tau.ac.il 5.10 Generic_125100-10 sun4u    01/23/2008

00:00:00 runq-sz %runocc swpq-sz %swpocc
00:10:02     4.4      35     0.0       0
00:20:01     3.8      43     0.0       0
00:30:02     3.6      43     0.0       0
00:40:02     4.2      61     0.0       0
00:50:03     4.3      73     0.0       0
01:00:03     3.9      69     0.0       0
01:10:04     5.1      85     0.0       0
01:20:05     5.2      86     0.0       0
01:30:03     4.2      49     0.0       0
01:40:02     3.8      41     0.0       0
01:50:01     3.6      42     0.0       0
02:00:01     3.7      45     0.0       0
02:10:02     4.2      61     0.0       0
02:20:09     4.1      63     0.0       0
02:30:02     4.5      64     0.0       0
02:40:06     5.2      62     0.0       0
02:50:00     2.2      68     0.0       0
03:00:00     1.6      43     0.0       0
03:10:01     1.9      37     0.0       0
03:20:00     2.1      67     0.0       0
03:30:01     2.8      71     0.0       0
03:40:00     3.8      80     0.0       0
03:50:01     2.0      45     0.0       0
04:00:01     2.3      64     0.0       0
04:10:01     2.8      66     0.0       0
04:20:01     3.5      84     0.0       0
04:30:01     3.0      79     0.0       0
04:40:00     2.3      73     0.0       0
04:50:01     2.2      72     0.0       0
05:00:01     2.7      73     0.0       0
05:10:01     1.8      61     0.0       0

The runq-sz column is the average number of runnable threads queued for a CPU during each interval. It is essentially the same quantity that feeds the load average shown by uptime and top. In the above example the queue mostly holds between two and five runnable threads, so on a quad core machine the work would be distributed to all cores.
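
If you want to check this on your own server, the sar output above is easy to summarize with a few lines of script. What follows is only a rough sketch: the function names (parse_sar_q, summarize) and the four-core figure are my own assumptions for illustration, not anything NetWorker or sar provides, and the sample contains just the first three rows of the listing above.

```python
import re

# Matches data rows such as "00:10:02     4.4      35     0.0       0".
# The column header row does not match because "runq-sz" is not numeric.
ROW = re.compile(r"^(\d{2}:\d{2}:\d{2})\s+([\d.]+)\s+(\d+)\s+([\d.]+)\s+(\d+)\s*$")


def parse_sar_q(text):
    """Return (time, runq_sz, runocc_percent) tuples from `sar -q` output."""
    rows = []
    for line in text.splitlines():
        m = ROW.match(line.strip())
        if m:
            rows.append((m.group(1), float(m.group(2)), int(m.group(3))))
    return rows


def summarize(rows, cores):
    """Compare run-queue samples against an assumed (hypothetical) core count."""
    sizes = [r[1] for r in rows]
    return {
        "samples": len(rows),
        "mean_runq": sum(sizes) / len(sizes),
        "peak_runq": max(sizes),
        # Intervals in which there were at least as many runnable
        # threads as cores, i.e. every core could have been kept busy.
        "samples_at_or_above_cores": sum(1 for s in sizes if s >= cores),
    }


# First three data rows of the listing above, plus the column header.
SAMPLE = """\
00:00:00 runq-sz %runocc swpq-sz %swpocc
00:10:02     4.4      35     0.0       0
00:20:01     3.8      43     0.0       0
00:30:02     3.6      43     0.0       0
"""

if __name__ == "__main__":
    print(summarize(parse_sar_q(SAMPLE), cores=4))
```

A mean run queue close to the core count suggests there is enough runnable work to occupy every core. A value stuck near 1 while backups are running would point to a single busy process instead.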

Stan, about your T2000 problem. This is an 8-core/32-thread CPU, so although you can run 32 processes which are assigned to logical CPUs, it is unclear whether you are indeed going to see the equivalent of 32 separate 1.2 GHz CPUs. Unfortunately, this CPU has a single FPU shared by all 8 cores, which makes it an issue for floating point intensive applications. It might be that the nsrndmp_2fh process is FP intensive and saturates the single FP unit. If this is indeed the case, a multi-threaded version will not help. In that respect, the newer (Niagara-2, T5x20) processors are better as they have a separate FPU for each core.
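
To put a rough number on the shared FPU argument, here is an Amdahl's-law style estimate. It is a simplification under my own assumptions: FP work is treated as fully serialized on the one FPU, everything else as perfectly parallel, and the fp_fraction values are purely hypothetical, since nobody here has measured how FP intensive nsrndmp_2fh really is. It also ignores that four hardware threads share each core's integer pipeline.

```python
def shared_fpu_speedup(fp_fraction, threads):
    """Upper bound on speedup when all FP work shares one unit.

    fp_fraction: assumed share of single-threaded run time spent in the FPU.
    threads: number of hardware threads the non-FP work is spread over.
    """
    if not 0.0 <= fp_fraction <= 1.0:
        raise ValueError("fp_fraction must be between 0 and 1")
    if threads < 1:
        raise ValueError("threads must be at least 1")
    # Amdahl's law with the FP portion as the serial part.
    return 1.0 / (fp_fraction + (1.0 - fp_fraction) / threads)


if __name__ == "__main__":
    # Hypothetical FP shares, evaluated for the T2000's 32 hardware threads.
    for f in (0.0, 0.1, 0.5, 1.0):
        print(f, round(shared_fpu_speedup(f, 32), 2))
```

In this model, once half of the run time is floating point, the bound stays below 2x no matter how many threads are added, which is why multi-threading alone would not fix an FP-bound process on this chip.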


Regardless of my above statements, I will be happy to see certain NetWorker commands become multi-threaded where possible (and needed), so you can add my name to your list of customers asking for that RFE. I would suggest, however, that you provide EMC with a list of commands which you think will benefit from such an effort (nsrndmp_2fh for example).


To sign off this list, send email to listserv AT listserv.temple DOT edu and type "signoff networker" in the body of the email. Please write to networker-request AT listserv.temple DOT edu if you have any problems with this list. You can access the archives at http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER


--

-- Yaron.
