Subject: Re: [BackupPC-users] Backing up many, many files over a medium speed link - kills CPU and fails halfway. Any help?
From: GB <pseudo AT gmail DOT com>
To: backuppc-users AT lists.sourceforge DOT net
Date: Tue, 24 Nov 2009 21:21:05 -0500
Hi,

Thanks for the reply. The data is, in fact, "all time" in the sense that it goes back years, but it's sorted by filename rather than date; it's essentially equivalent to how BackupPC stores data in cpool/, i.e. the first 3 characters of the filename map to 3 levels of subdirectories. The best I've managed so far was to make ten shares, one per first-level directory, and back up ten separate backup trees. That worked back when I had about 100k files; when I tried it again recently, it seems to be what pushed the backup over the edge. So I guess I'd need to make TWO levels of shares, i.e. 1/0-1/9, 2/0-2/9, etc. Then, once I get through one full pass, future incrementals should be easier since the delta will be small.
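
If it helps to picture it, the two-level split would look roughly like this in the per-host config file (the host name and paths here are made up for illustration, not my real setup):

    # hypothetical pc/myhost.pl - one rsync share per second-level
    # directory, so each rsync run only walks a slice of the tree
    $Conf{XferMethod}     = 'rsync';
    $Conf{RsyncShareName} = [
        '/data/files/1/0', '/data/files/1/1',   # ... through /data/files/1/9
        '/data/files/2/0', '/data/files/2/1',   # ... and so on for 2/, 3/, etc.
    ];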

My BackupPC box doesn't swap much, and it doesn't behave like it's under heavy load at all; but then again, I'd hope my I/O subsystem (Dell PERC 6 + 4x WD Greens in RAID 5) outpaces the link plus any overhead :) I haven't tried stracing rsync on the remote server. Any suggestions on how to use it? I've never tried it before.

Thanks again.

-G

On Tue, Nov 24, 2009 at 8:05 PM, Chris Bennett <chris AT ceegeebee DOT com> wrote:
Hi there,

> Can anybody suggest some intelligent solutions for doing this? How can I
> speed up rsync / set it to use less CPU? I searched for some solutions, but
> everyone talks about limiting rsync _transfer_ speed, which is hardly the
> issue here - I'm backing up over a cable modem, heh.
>
> Any help appreciated. Thanks!

I haven't thought too much about your actual problem, but just in case
you hadn't considered it, can you change the backup source to suit
your backup system?  e.g. do your 370,000 files represent
days/weeks/years of information?  Could they be archived by month etc.,
so the file count is massively reduced?
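
For example, if the tree were laid out by date, a monthly cron job
along these lines (paths purely illustrative) would collapse thousands
of files into a single one:

    # hypothetical: roll a finished month's directory into one archive
    cd /data/files || exit 1
    tar czf archive/2009-10.tar.gz 2009/10 && rm -rf 2009/10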

I see customers with web developers with few sysadmin skills, and
they produce the most enormous trees of never-changing files spanning
days/months/years; BackupPC needs to check these files every time,
which hurts the overall throughput.

Similarly, have you observed excessive swap in/out on your BackupPC
server?  If you strace the rsync process on the remote system, does
its I/O activity look very 'bursty', with overall throughput that
just seems 'wrong'?
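
Something along these lines would show it (the pidof lookup is just
illustrative - pick the right rsync PID if several are running):

    # -tt prints timestamps, -T shows time spent inside each syscall,
    # so long stalls between reads and writes stand out clearly
    strace -tt -T -e trace=read,write,select -p "$(pidof rsync | awk '{print $1}')"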

Regards,

Chris Bennett
cgb
