BackupPC-users

Re: [BackupPC-users] Parallelizing operations on the host to backup

2015-03-09 03:46:47
Subject: Re: [BackupPC-users] Parallelizing operations on the host to backup
From: Adam Goryachev <mailinglists AT websitemanagers.com DOT au>
To: backuppc-users AT lists.sourceforge DOT net
Date: Mon, 09 Mar 2015 18:45:04 +1100

On 9/03/2015 18:28, Andrea Carpani wrote:
> Hello,
>
> is there an easy way to parallelize the rsync operations on the host to
> back up?
>
> Here's my problem: the host I need to back up has a /home of about 300 GB
> full of really small files (maybe 10 million). Each scan of the file
> system takes a really long time. The storage I use for it, though, is
> quite fast and has a lot of disks, thus parallelizing the operations
> (say, 1 rsync for each user under /home/* and 10 parallel threads) would
> ideally shorten the scan time by a factor of 10 given a powerful enough
> backup server.
>
> I know I could create several backuppc "hosts" all pointing to the same
> hostname, and one backs up a subset of users, but maybe there's some
> other parameter I could use.
>
> Regards,
>

I can't be sure, but I suspect that, if anything, you would increase the
backup time by doing this. rsync traverses the directory tree at disk
speed (limited by both the backup server's disks and the backup
client's disks), and reading from (or writing to) several parts of the
same disk at once generally takes longer than a single sequential pass.

That assumes /home is a single RAID array made up of multiple spinning
disks (not SSDs). The exception, of course, is if you have /home/[a-d]
on one disk, /home/[e-j] on a separate disk, etc.

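You could always test this by manually running a couple of rsync
processes in parallel against various directories, storing the output
on the backuppc server, and comparing the elapsed time with running the
same transfers one after another.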
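
A minimal sketch of such a test, meant to be run on the backup server.
It is only an illustration: the client name, the scratch directory and
the user directories are hypothetical placeholders, and it assumes rsync
can reach the client over ssh without a password prompt.

```python
import subprocess
import time
from concurrent.futures import ThreadPoolExecutor

CLIENT = "client.example.com"      # hypothetical backup client
DEST_ROOT = "/var/tmp/rsync-test"  # hypothetical scratch dir on the backup server


def build_cmd(user_dir, client=CLIENT, dest_root=DEST_ROOT):
    """Build an rsync command that pulls one directory from the client."""
    src = user_dir.rstrip("/")
    name = src.split("/")[-1]
    return ["rsync", "-a", client + ":" + src + "/", dest_root + "/" + name + "/"]


def timed_run(commands, workers, runner=subprocess.run):
    """Run the commands with `workers` concurrent processes; return seconds taken."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for every command to finish before the timer stops
        list(pool.map(lambda cmd: runner(cmd, check=False), commands))
    return time.monotonic() - start


def main():
    dirs = ["/home/alice", "/home/bob", "/home/carol", "/home/dave"]  # hypothetical
    cmds = [build_cmd(d) for d in dirs]
    print("serial:   %.1fs" % timed_run(cmds, workers=1))
    print("parallel: %.1fs" % timed_run(cmds, workers=len(cmds)))
```

Calling main() runs the four transfers one at a time and then all at
once. For a fair comparison, empty the scratch directory and drop the
file system caches on both machines between the two runs; otherwise the
second run finds everything already copied and cached and looks faster
than it really is.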

Other things that can have a big impact on this:
1) Check that neither your backup server nor your backup client
modifies the atime when doing the backup (e.g., mount with noatime if
you can, if it is relevant to your filesystem format, and if it won't
cause any problems for your applications)

2) Check you have sufficient RAM (not swap) for the entire file list on 
both the backuppc server and client

3) Enable rsync checksum caching (the --checksum-seed option) so that
backuppc does less work while checking which files need to be backed up
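
For point 1, a quick way to see how a file system is currently mounted
is to look at its options in /proc/mounts. The helper below is a rough
sketch, assuming a Linux host and the usual /proc/mounts layout (device,
mount point, type, options); the function name is made up for this
example.

```python
def atime_mode(mount_point, mounts_text):
    """Return 'noatime', 'relatime' or 'atime' for mount_point, or None if absent."""
    result = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            opts = fields[3].split(",")
            if "noatime" in opts:
                result = "noatime"
            elif "relatime" in opts:
                result = "relatime"
            else:
                result = "atime"  # neither option listed: every read updates atime
            # keep scanning: a later entry for the same mount point overrides
    return result
```

For example, atime_mode("/home", open("/proc/mounts").read()) on the
backup client shows whether /home is already mounted with noatime. The
same check on the backup server would use the mount point that holds the
backuppc pool.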

Hope the above helps. It is all limited to my own experience and
knowledge, so don't take it as gospel.

Regards,
Adam
