Networker

Re: [Networker] splitting up large directories/parallelizing

From: David Gold-news <dave2 AT CAMBRIDGECOMPUTER DOT COM>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Thu, 4 Mar 2010 16:11:56 -0500
Hi,

We wrote something like this a few years ago (c. 2005), after finding out that the official support stance was that wildcards ("*") in savesets were not supported. (Prior to that, it appeared that they were supported; IIRC, there was a change to the man page or documentation which prompted this.)

Our solution looked like the following:
Client object #1:
--saveset: the top-level directory (f:\data1); multiple top-level directories are handled by copying the client and changing the saveset, ending up with multiple clients in the same group, each with a different top-level directory as its saveset.
--backup command: savedas.exe -z10 (where the -z argument indicates the number of parallel streams to run concurrently)

Client object #2:
--saveset: All
--Directive: one that skips the savesets used in client object #1
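
For client object #2, the skip directive might look like the following. This is only a sketch: the exact path syntax and quoting depend on the client platform and NetWorker version, and the path reuses the f:\data1 example from client object #1 above.

```
<< "F:\" >>
skip: data1
```

The intent is to keep the "All" saveset from re-walking F:\data1, which client object #1 already covers.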

The logic was that we could get roughly linear increases in backup speed by increasing parallelism, up to a sensible parallelism limit.

Limitations that we ran into:

1. Stopping the group didn't stop the save processes on the client; mostly that was an effort/benefit calculation, and not a technical one.
2. Logging was mostly on the client, although we did pass return information back to savegrp in the proper format.
3. Restores required changing the browse time to the backup time of each specific subdirectory. This is because NetWorker shows the last saveset for the relevant drive, and because savedas.exe creates multiple subtree backups of a given saveset, it had problems figuring out the most recent time. (Once we found one iteration, "show versions" took care of finding the rest, though.)

I don't remember us having issues with incrementals, though. What kind of problems did you run into there?

--Dave

Date:    Wed, 3 Mar 2010 15:12:15 +0100
From:    Ronny Egner <RonnyEgner AT GMX DOT DE>
Subject: Splitting up large directory save jobs into smaller pieces (aka. "parallelizing")

Hi List,

A customer asked me if it is possible to speed up saves of large
directories with many sub-directories (or files). Unfortunately, NetWorker itself
cannot automatically parallelize the saving of a directory. That's quite odd.

So I wrote a simple bash script to split up large
directories into smaller (and parallel) save jobs.
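
The script itself isn't included in this message, but the splitting idea might boil down to something like the sketch below. The function name and the echoed command are illustrative only; a real version would invoke NetWorker's save command with the appropriate server and group options.

```shell
# Sketch: run one save job per immediate subdirectory of DIR, with at
# most N jobs running concurrently (the same idea as savedas.exe -z10).
# Usage: run_split_save DIR N CMD [ARGS...]
run_split_save() {
    dir=$1; parallel=$2; shift 2
    # -print0 / -0 keeps directory names with spaces intact;
    # -P caps the number of concurrent child processes.
    find "$dir" -mindepth 1 -maxdepth 1 -type d -print0 |
        xargs -0 -n 1 -P "$parallel" "$@"
}

# A real invocation would look something like:
#   run_split_save /data 10 save -s backupserver
```

Each subdirectory becomes its own save process, so a slow walk of one huge subtree no longer serializes the whole backup.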

===================================
David Gold
Sr. Technical Consultant
Cambridge Computer Services, Inc.
Artists in Data Storage
Tel: 781-250-3000
Tel (Direct): 781-250-3260
Fax: 781-250-3360
dave AT cambridgecomputer DOT com
www.cambridgecomputer.com

===================================
 ----------------------------------------------------------------------------
*Any ideas, suggestion or comments are mine alone, and are not of my company*
To sign off this list, send email to listserv AT listserv.temple DOT edu and type 
"signoff networker" in the body of the email. Please write to networker-request 
AT listserv.temple DOT edu if you have any problems with this list. You can access the 
archives at http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER
