Networker

Re: [Networker] splitting up large directories/parallelizing

From: David Gold-news <dave2 AT CAMBRIDGECOMPUTER DOT COM>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Thu, 4 Mar 2010 16:11:56 -0500
Hi,

We wrote something like this a few years ago (c. 2005), after finding out that the official support stance was that wildcards ("*") in savesets were not supported. (Prior to that, it appeared that they were supported; IIRC, there was a change to the man page or documentation which prompted this.)

Our solution looked like the following:
Client object #1:
--saveset: the top-level directory (f:\data1); multiple top-level directories are handled by copying the client and changing the saveset, ending up with multiple clients in the same group, each with a different top-level directory as its saveset.
--backup command: savedas.exe -z10 (where the -z argument indicates the number of parallel streams to run concurrently)

Client object #2:
--saveset: All
--Directive: one that skips the savesets used in client object #1
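
For client object #2, the skip directive might look like the following. This is only a sketch: the exact path syntax and quoting depend on the client platform and NetWorker version, and the path reuses the f:\data1 example from client object #1 above.

```
<< "F:\" >>
skip: data1
```

The intent is to keep the "All" saveset from re-walking F:\data1, which client object #1 already covers.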

The logic was that we could get roughly linear increases in backup speed by increasing parallelism, up to a sensible parallelism limit.

Limitations that we ran into:

1. Stopping the group didn't stop the save processes on the client; mostly that was an effort/benefit calculation, and not a technical one.
2. Logging was mostly on the client, although we did pass return information back to savegrp in the proper format.
3. Restores required changing the browse time to the backup time of each specific subdirectory. This is because NetWorker shows the last saveset for the relevant drive, and because savedas.exe creates multiple subtree backups of a given saveset, it had problems figuring out the most recent time. (Once we found one iteration, "show versions" took care of finding the rest, though.)

I don't remember us having issues with incrementals, though. What kind of problems did you run into there?

--Dave

Date:    Wed, 3 Mar 2010 15:12:15 +0100
From:    Ronny Egner <RonnyEgner AT GMX DOT DE>
Subject: Splitting up large directory save jobs into smaller pieces (aka. "parallelizing")

Hi List,

A customer asked me if it is possible to speed up saves of large
directories with many sub-directories (or files). Unfortunately, NetWorker itself
cannot automatically parallelize the saving of a directory. That's quite odd.

So I wrote a simple bash script to split up large
directories into smaller (and parallel) save jobs.
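
The script itself isn't included in this message, but the splitting idea might boil down to something like the sketch below. The function name and the echoed command are illustrative only; a real version would invoke NetWorker's save command with the appropriate server and group options.

```shell
# Sketch: run one save job per immediate subdirectory of DIR, with at
# most N jobs running concurrently (the same idea as savedas.exe -z10).
# Usage: run_split_save DIR N CMD [ARGS...]
run_split_save() {
    dir=$1; parallel=$2; shift 2
    # -print0 / -0 keeps directory names with spaces intact;
    # -P caps the number of concurrent child processes.
    find "$dir" -mindepth 1 -maxdepth 1 -type d -print0 |
        xargs -0 -n 1 -P "$parallel" "$@"
}

# A real invocation would look something like:
#   run_split_save /data 10 save -s backupserver
```

Each subdirectory becomes its own save process, so a slow walk of one huge subtree no longer serializes the whole backup.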

===================================
David Gold
Sr. Technical Consultant
Cambridge Computer Services, Inc.
Artists in Data Storage
Tel: 781-250-3000
Tel (Direct): 781-250-3260
Fax: 781-250-3360
dave AT cambridgecomputer DOT com
www.cambridgecomputer.com

===================================
 ----------------------------------------------------------------------------
*Any ideas, suggestion or comments are mine alone, and are not of my company*
To sign off this list, send email to listserv AT listserv.temple DOT edu and type 
"signoff networker" in the body of the email. Please write to networker-request 
AT listserv.temple DOT edu if you have any problems with this list. You can access the 
archives at http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER
