BackupPC-users

Re: [BackupPC-users] Parallelizing operations on the host to backup

2015-03-09 09:11:41
Subject: Re: [BackupPC-users] Parallelizing operations on the host to backup
From: Andrea Carpani <andrea.carpani AT dnshosting DOT it>
To: backuppc-users AT lists.sourceforge DOT net
Date: Mon, 09 Mar 2015 14:07:41 +0100
On 09/03/2015 08:45, Adam Goryachev wrote:
>
> On 9/03/2015 18:28, Andrea Carpani wrote:
>> Hello,
>>
>> is there an easy way to parallelize the rsync operations on the host being
>> backed up?
>>
>> Here's my problem: the host I need to back up has a /home of about 300 GB
>> full of really small files (maybe 10 million). Each scan of the file
>> system takes a really long time. The storage I use for it, though, is
>> quite fast and has a lot of disks, thus parallelizing the operations
>> (say, 1 rsync for each user under /home/* and 10 parallel threads) would
>> ideally shorten the scan time by a factor of 10 given a powerful enough
>> backup server.
>>
>> I know I could create several backuppc "hosts" all pointing to the same
>> hostname, and one backs up a subset of users, but maybe there's some
>> other parameter I could use.
>>
>> Regards,
>>
> I can't be sure, but I suspect that, if anything, you would increase the
> backup time by doing this, given that rsync will traverse the directory
> tree at disk speed (a combination of the backup server disk speed and the
> backup client disk speed). Trying to read from (or write to) multiple
> portions of a disk at the same time will take longer than a single read.

The storage setup is quite complex, and the storage at the bottom can
parallelize quite well. I know for sure that several tarballs in
parallel can run faster than a single tarball, for example.

>
> Unless, of course, you have /home/[a-d] on one disk, /home/[e-j] on a
> separate disk, etc.
>
> This assumes /home is a single RAID array composed of multiple spinning
> disks (not SSDs).
>
> You could always test this by manually running a couple of rsync
> processes against various directories, storing the output on the
> backuppc server.
>
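
A manual test along these lines could be sketched roughly as below. This is
only a minimal sketch under assumptions: the helper names (rsync_command,
timed_run), the /tmp/rsync-test destination and the worker counts are made
up for illustration, it uses a dry run (rsync -n) so it measures the tree
scan rather than the data transfer, and it is run locally on the client. It
is not how BackupPC itself drives rsync.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def rsync_command(src, dest_root, dry_run=True):
    """Build an rsync command for one source directory (hypothetical helper)."""
    src = src.rstrip("/")
    name = src.rsplit("/", 1)[-1]
    cmd = ["rsync", "-a"]
    if dry_run:
        # -n: walk and compare the tree without copying any data.
        cmd.append("-n")
    # Trailing slashes: copy the directory's contents into dest_root/<name>/.
    cmd += [src + "/", dest_root.rstrip("/") + "/" + name + "/"]
    return cmd


def timed_run(sources, dest_root, workers, runner=subprocess.run):
    """Run one rsync per source with `workers` in parallel; return (seconds, results)."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(
            pool.map(lambda s: runner(rsync_command(s, dest_root), check=False), sources)
        )
    return time.monotonic() - start, results


if __name__ == "__main__":
    # Assumption: one rsync per directory directly under /home.
    sources = sorted(str(p) for p in Path("/home").iterdir() if p.is_dir())
    for workers in (1, 10):
        # Note: the second pass may benefit from cached metadata, so compare
        # with care (e.g. use different subsets or drop caches in between).
        elapsed, _ = timed_run(sources, "/tmp/rsync-test", workers)
        print(f"{workers} worker(s): {elapsed:.1f} s")
```

Comparing the two timings on a subset of users should show whether the
storage really benefits from parallel scans before changing the BackupPC
setup.
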
> Other things that can have a big impact on this:
> 1) Check that your backup server and backup client do not modify the atime
> when doing the backup (e.g., mount with noatime if you can and it is
> relevant to your filesystem format, and won't cause any problems for
> your applications)

Ok, I have noatime,nodiratime (almost) everywhere.

> 2) Check you have sufficient RAM (not swap) for the entire file list on
> both the backuppc server and client
The number of files is huge (~10 million). How can I estimate the
memory I need?
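
For a first guess, a back-of-envelope calculation might look like the
sketch below. It assumes roughly 100 bytes per file-list entry, which is the
ballpark figure given in the rsync FAQ for a plain rsync that holds the
whole file list in memory; the real per-file cost on the BackupPC side may
well be higher, so treat the constant as an assumption to be replaced by
measurement. The function name is made up for illustration.

```python
# Assumption: ~100 bytes per file-list entry (rsync FAQ ballpark); the
# actual cost per file may differ, especially on the BackupPC server.
BYTES_PER_ENTRY = 100


def file_list_memory_mib(num_files, bytes_per_entry=BYTES_PER_ENTRY):
    """Estimate the memory (in MiB) needed to hold the full file list."""
    return num_files * bytes_per_entry / (1024 * 1024)


if __name__ == "__main__":
    # ~10 million files, as in this thread.
    print(f"{file_list_memory_mib(10_000_000):.0f} MiB")
```

With these assumptions, 10 million files come out at a little under 1 GiB
per rsync process, before counting anything else running on the machine.
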

> 3) Enable rsync checksum-seed so that backuppc does less work while
> checking which files need to be backed up

Ok. I'll check this.

Thanks.


>
> Hope the above helps, it is all limited to my experience/knowledge,
> don't take it as gospel.
>
> Regards,
> Adam
>
> ------------------------------------------------------------------------------
> Dive into the World of Parallel Programming The Go Parallel Website, sponsored
> by Intel and developed in partnership with Slashdot Media, is your hub for all
> things parallel software development, from weekly thought leadership blogs to
> news, videos, case studies, tutorials and more. Take a look and join the
> conversation now. http://goparallel.sourceforge.net/
> _______________________________________________
> BackupPC-users mailing list
> BackupPC-users AT lists.sourceforge DOT net
> List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
> Wiki:    http://backuppc.wiki.sourceforge.net
> Project: http://backuppc.sourceforge.net/

-- 
.a.c.
Andrea Carpani
andrea.carpani AT dnshosting DOT it
+39 329 722 1309


