BackupPC-users

Re: [BackupPC-users] # of files per backup job

2008-04-16 00:54:20
Subject: Re: [BackupPC-users] # of files per backup job
From: dan <dandenson AT gmail DOT com>
To: "Jonathan Dill" <jonathan AT nerds DOT net>
Date: Tue, 15 Apr 2008 22:54:12 -0600
Technically, you won't see any difference in backup speed by splitting up the job.  You might be running into a system resource issue on the server or client: rsync eats up large chunks of RAM as the file count increases (a rule of thumb is about 100 bytes per file), which may mean your system is pushing everything off to swap, making the server run very, very slowly.  Another likely issue is that with many, many files, I/O on EITHER the server or the client could be a big bottleneck.
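As a rough illustration of that rule of thumb, here is a back-of-the-envelope sketch (the helper name is mine, and the real per-file cost varies with rsync version and average path length):

```python
def rsync_filelist_memory_mb(file_count, bytes_per_file=100):
    """Estimate rsync file-list memory using the ~100 bytes/file
    rule of thumb.  Actual usage depends on rsync version and
    how long the file paths are."""
    return file_count * bytes_per_file / (1024 * 1024)

# One job with 1,000,000 files vs. ten jobs of 100,000 files each:
print(round(rsync_filelist_memory_mb(1_000_000), 1))  # 95.4 MB for one big job
print(round(rsync_filelist_memory_mb(100_000), 1))    # 9.5 MB per smaller job
```

So splitting the job mainly shrinks each rsync process's peak file list, which matters if the box is swapping, not because the bytes move any faster.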

I do a large rsync transfer over a T1 line nightly: I mirror my BackupPC server remotely each night.  This is a pretty long process even though I only have about 1 GB of new data each night.  The math says that can be transferred in about 1.5 hours, but the initial file list takes 10 minutes to compute and I/O performance is a bottleneck, so the transfer takes about 2.5 hours, more or less depending on the amount of new backup data.


On Tue, Apr 15, 2008 at 3:23 PM, Jonathan Dill <jonathan AT nerds DOT net> wrote:
On Apr 15, 2008, at 4:54 PM, Tim Hall wrote:

> Hi, can anyone comment on backup jobs with lots
> of files affecting transfer time?
>
> I have 2 big backup jobs which are taking too
> long over a WAN link.  Would it be advisable to
> break the jobs up into many smaller jobs with
> fewer files / job?  Would I be gaining anything?

How fast are the WAN links at both ends?  You may also want to look at
latency and any packet loss between the two endpoints.  If you are
backing up GB of data, even a T1 is pretty slow, especially if you have
other data going across the same link; you could create an inadvertent
"DoS" if you peg the whole connection.

If you are using rsync, and especially if memory and/or CPU is a
bottleneck, breaking it up into smaller jobs could help by reducing the
size of the "list" that rsync has to build and transfer.  Having a lot
of files will increase the size of the list.

What I have done in some cases is to just "mirror" what I want to
back up across the link to a "separate drive" at the remote end and
then use BackupPC at the remote end to make a "backup" of the mirror.
The "mirror" is just a copy of whatever is on the original, but
BackupPC provides the "history" of changes over time so you have more
than just last night's backup.

That also helps for initial set up since I can use e.g. an external
USB drive at the local site, do the initial mirror via USB rather than
over the WAN link, and then physically take the USB drive to the
remote site (or maybe in your case ship the drive) and plug it into
the server there.  Optionally, substitute eSATA, SCSI, iSCSI, IEEE
1394, or your connection type of choice for USB.  Backing up many GB
over a T1 will take many hours, especially if you limit bandwidth so
that it does not just peg your T1 and put everything else out of
commission until it finishes.

For example:

20 GiB * 1024^3 B/GiB * 8 bits/B ~= 171,800 Mbit / 1.544 Mbit/s ~=
111,270 s / 3600 s/hour ~= 31 hours

(Note that a T1 is 1.544 megabits per second, not megabytes, so the
transfer takes about 8x longer than a bytes-based calculation would
suggest.)  That would be for an optimal point-to-point T1 ignoring
bandwidth lost to framing bits; in practice it will most likely take
quite a bit longer than that, especially if it goes through a VPN
across the internet with possible congestion at routers along the way.
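As a sanity check on the arithmetic, here is a short sketch (the helper name is mine, and it assumes the full 1.544 Mbit/s T1 line rate with no protocol or framing overhead):

```python
def t1_transfer_hours(gib, line_rate_mbit_s=1.544):
    """Hours to move `gib` GiB over a T1, assuming the full
    1.544 Mbit/s line rate and no protocol overhead."""
    bits = gib * 1024**3 * 8               # GiB -> bits
    return bits / (line_rate_mbit_s * 1e6) / 3600

print(round(t1_transfer_hours(20), 1))     # 30.9 hours for 20 GiB
print(round(t1_transfer_hours(1), 1))      # 1.5 hours for 1 GiB
```

The 1 GiB figure matches the "about 1.5 hours" estimate earlier in the thread, which is a good sign both calculations are using bits, not bytes.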

>
>
> would 10 jobs with 100,000 files each complete
> faster than 1 job with 1,000,000 files .... if they
> were backing up the same information
>
> thanks
>
>
> -------------------------------------------------------------------------
> This SF.net email is sponsored by the 2008 JavaOne(SM) Conference
> Don't miss this year's exciting event. There's still time to save
> $100.
> Use priority code J8TL2D2.
> http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
> _______________________________________________
> BackupPC-users mailing list
> BackupPC-users AT lists.sourceforge DOT net
> List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
> Wiki:    http://backuppc.wiki.sourceforge.net
> Project: http://backuppc.sourceforge.net/


