Re: [BackupPC-users] Backup over slow link: endless loop or just missing patience?

2014-09-29 19:56:20
From: Holger Parplies <wbppc AT parplies DOT de>
To: "Dr. Boris Neubert" <omega AT online DOT de>
Date: Tue, 30 Sep 2014 01:54:21 +0200
Hi,

Dr. Boris Neubert wrote on 2014-09-29 20:16:56 +0200 [[BackupPC-users] Backup 
over slow link: endless loop or just missing patience?]:
> [...]
> The uplink is a slow DSL line with a maximum of 640 kBit/s towards the
> BackupPC server. BackupPC server is connected to NAS via openVPN.
> Transfer method is rsyncd.
> [...]
> Next data set added was my home directory with 15 GB.

At the maximum theoretical throughput, that would take 54.6 hours. I wouldn't
expect more than half that throughput in real-world conditions, so it would
be more like 110 hours, and more still if the incrementals of your other
backups are competing for bandwidth.
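
To make that arithmetic reproducible, here is a small Python sketch. It
assumes binary units (15 GiB over 640 * 1024 bit/s), which is what gives the
54.6 hours above; the function name and the efficiency parameter are made up
for illustration.

```python
# Rough transfer-time estimate. Binary units (GiB, 1024 bit per kBit) are an
# assumption; they happen to reproduce the 54.6 hours quoted in the text.

def transfer_hours(size_gib, link_kibit_per_s, efficiency=1.0):
    """Hours needed to push size_gib over a link of link_kibit_per_s."""
    bits = size_gib * 1024**3 * 8
    bits_per_second = link_kibit_per_s * 1024 * efficiency
    return bits / bits_per_second / 3600


ideal = transfer_hours(15, 640)                       # full line rate
realistic = transfer_hours(15, 640, efficiency=0.5)   # half the line rate
print(f"ideal: {ideal:.1f} h, at half throughput: {realistic:.1f} h")
```

Plugging in a measured throughput instead of the guessed 50% should give a
more honest figure for your link.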

If you say your 1 GB backup took "a few days", I'd expect 15 GB to take 15
times "a few days" under similar circumstances.

I would advise against raising ClientTimeout to make it try to complete in one
run.

> [...]
> What bothers me are the fatal errors (signal= ALRM) in combination with
> the last lines of the "Last bad XferLOG" [...]
>
> Not saving this as a partial backup since it has fewer files than the prior 
> one (got 27503 and 27503 files versus 27628)

I would normally *expect* BackupPC to use the previous partial as a reference,
meaning it should check but not transfer the saved files and then continue
with the rest. That should give you gradual progress, until finally the 15 GB
complete. But that does not seem to be happening for you. An obvious cause
would be if there were one single large file that does not complete within the
timeout window, though then I'd expect the numbers to match ("27503 files
versus 27503"). Another would be a large amount of changes in the first part -
having saved the data yesterday does not help if it needs to be retransferred
today because it has changed. A combination of both might be what you are
seeing (similar but not equal numbers).

> Does this mean that I am actually stuck in an endless loop because there
> is a certain point in time or amount of data beyond which the backup
> does not go?

Apparently. It should be noted that partial backups are - as far as I
understand it - meant to speed up a second attempt at a full rsync(d) backup
if the first failed partway through (power failure, link failure, ...), *not*
for piecing together "huge" data sets. Note that there is a setting
$Conf{PartialAgeMax} which will limit the use of a partial.

The first thing to do is to make sure that your plan is feasible at all. If
the amount of daily changes exceeds the bandwidth available for backup, then
that's not the case. "A few days" for 1 GB doesn't sound promising!

You have probably done the maths and concluded that it should work.
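
If you want to re-check that, a sketch like the following compares what the
link can carry per day against the daily change volume. The 640 kBit/s figure
is from your mail; the 50% efficiency and any daily change figure you plug in
are assumptions that you would need to replace with measured values, and the
function names are hypothetical.

```python
# Feasibility check: can the link carry one day's worth of changes in a day?
# Binary units and the default 50% efficiency are assumptions.

def daily_capacity_gib(link_kibit_per_s, hours_per_day=24.0, efficiency=0.5):
    """GiB the link can move in the given number of hours per day."""
    bytes_per_second = link_kibit_per_s * 1024 * efficiency / 8
    return bytes_per_second * hours_per_day * 3600 / 1024**3


def is_feasible(daily_change_gib, link_kibit_per_s,
                hours_per_day=24.0, efficiency=0.5):
    """True if the daily changes fit into the daily link capacity."""
    capacity = daily_capacity_gib(link_kibit_per_s, hours_per_day, efficiency)
    return daily_change_gib < capacity


capacity = daily_capacity_gib(640)
print(f"about {capacity:.1f} GiB per day at half throughput")
```

If the link is only free for backups during part of the day, lower
hours_per_day accordingly; the margin shrinks quickly.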

I would try any of the following:

1.) Limit the size of the backup to something that will complete within the
    alarm timeout. I'd start with a *small* number - perhaps only 100 MB at
    first (or even less), to get a feeling for how long things really take.
    You do that with (additional) excludes. Once you have that backup, add
    more to your data set by gradually removing excludes. Do it in small
    enough steps that your backups will succeed (without generating partials!).
    Run only full backups (or alternating full and incremental). Two
    consecutive incrementals would let the second one re-transfer the data
    the first one has covered (unless you are using IncrLevels, but let's not
    complicate things). If your backup completes after an hour, you can always
    start the next one manually rather than waiting for the scheduler.

    That's probably a slow and cumbersome process, but you seem to be
    stretching the limits of what is possible with your internet connection.

    You *have* set up meaningful excludes, I suppose? There's a lot of
    temporary data in a home directory that doesn't warrant backing up
    (.thumbnails, browser cache directories, mailer IMAP caches, ...) and,
    to make things worse, that tends to change frequently. Finding and
    excluding all of this stuff is a somewhat surgical operation, but with
    your bandwidth constraints I can't see a way around it. A [somewhat
    outdated] excerpt from my server excludes:

        '/*/.thumbnails',
        '/*/.evolution/mail/imap/user@host*',
        '/*/.evolution/cache',
        '/*/evolution/mail/imap/user@host*',
        '/*/.galeon/mozilla/galeon/Cache',
        '/*/.mozilla/*/*/Cache',
        '/*/.gnome2/epiphany/mozilla/epiphany/Cache',
        '/*/.adobe/Acrobat/*/Cache',
        '/*/.nautilus/thumbnails',
        '/*/.Trash',
        '/*/.googleearth/Cache'

    (my share is /home, so you'd probably need to remove the leading '/*').
    You get the picture? There are a lot of caches around, partly hidden
    in unexpected places. Use a program that graphically displays your disk
    usage (something like 'filelight' perhaps) to find them.

2.) Create a copy of your data set on the VM with rsync manually. That gives
    you far more control over what is happening, and you might be able to
    use compression (-z option). Then import that backup with the ClientAlias
    trick occasionally discussed here (you'd temporarily need rsyncd on the
    BackupPC server). 

3.) Find a neighbour with a faster internet connection who is willing to let
    you use it for some time (presuming you can establish a WLAN or cable
    connection to his network). VDSL can easily have 10 MBit/s uplink. A
    speed increase by a factor of 16 would really help you :-).
    You might even think of upgrading your own DSL connection, though that
    could take time and is not possible in all circumstances.

4.) Take your NAS (or a copy of your data and your OpenVPN key) to a place
    with more bandwidth for the initial backup.
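
For the exclude hunt in 1.), if no graphical tool is at hand, a few lines of
Python can list cache-like directories by size. This is only a sketch: the
directory names are taken from my exclude excerpt and are certainly
incomplete, and the function names are made up for illustration.

```python
import os

# Directory names drawn from the exclude excerpt; an incomplete guess,
# extend it with whatever turns up on your system.
CACHE_NAMES = {"Cache", "cache", ".thumbnails", "thumbnails", ".Trash"}


def dir_size(path):
    """Total size in bytes of all files below path (symlinks not followed)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def find_cache_dirs(top):
    """Return (size, path) pairs for cache-like directories, largest first."""
    hits = []
    for root, dirs, _files in os.walk(top):
        for name in list(dirs):
            if name in CACHE_NAMES:
                full = os.path.join(root, name)
                hits.append((dir_size(full), full))
                dirs.remove(name)  # already counted, do not descend again
    return sorted(hits, reverse=True)


def report(top):
    """Print the candidates as a simple size/path listing."""
    for size, path in find_cache_dirs(top):
        print(f"{size / 2**20:8.1f} MiB  {path}")
```

Calling report(os.path.expanduser("~")) prints the candidates for your home
directory; anything large in that list is worth a look before it goes into
the excludes.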

That's about all I can think of right now.

Hope some of that helps.

Regards,
Holger

_______________________________________________
BackupPC-users mailing list
BackupPC-users AT lists.sourceforge DOT net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/