Subject: Re: [BackupPC-users] Correct rsync parameters for doing incremental transfers of large image-files
From: Adam Goryachev <mailinglists AT websitemanagers.com DOT au>
To: backuppc-users AT lists.sourceforge DOT net
Date: Tue, 15 May 2012 10:27:16 +1000
On 15/05/12 09:34, Andreas Piening wrote:
> There are already two USB disks that are swapped every few days, and I
> copy nightly images onto the currently connected drive. Some weeks ago,
> the office had a water-pipe break. The water was stopped early enough
> that there was no damage, but my customer tasked me with creating an
> external backup solution. He wants me to be able to completely restore
> the system, including the virtual machines, if someone breaks into the
> office and steals the hardware or something like that. If this
> happens, I need at least one day to buy new hardware. But the point is
> that I need to be able to restore the system afterwards to a working
> state, just like it was one day before the disaster happened.
>
> As I understand the rsync functionality, the algorithm is able to do
> in-file incremental updates. My problem is just that I can't figure
> out what prevents it from working in my case...
>

Currently, I handle the situation as follows:
1) File-level backup with BackupPC. This covers the non-disaster (and
disaster) scenario where somebody inadvertently trashes your database,
or deletes an "important" file, etc., and you need to restore to
yesterday, or last week, but only a small number of files, not all of
them.
Some applications require pre-processing to be done (e.g. MS SQL), so I
use custom scripts that export those files using whatever "backup
procedure" is required by the proprietary application. Then, depending
on the output (a rough sketch of the whole pre-processing step follows
this list):
a) If it is compressed, especially a compressed "ASCII format" file,
then I will de-compress it (and continue with (b))
b) If it is a large file (over 100MB) then I will split the file into
smaller chunks of about 20MB to 50MB each (the exact size just depends
on how many chunks that produces)
c) I will keep multiple folders (1 for each work day (5), or 1 for each
day of the week (7)), and let BackupPC back up all these folders. This
helps ensure that every version of the daily backup makes it into
BackupPC, and also allows BackupPC to run during the local
backup/split process, because it will always get a complete backup copy
from within the past 24 hours
d) I then use Xymon monitoring to ensure that these local backup files
have been modified within the past 26 hours (to alert me if the local
pre-processing script has failed, etc.). There is no point in backups if
the files are never updated.
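
A minimal sketch of what that pre-processing step looks like (the paths,
the 50MB chunk size and the weekday-folder rotation are only examples,
not my actual script):

  #!/bin/sh
  # Example only: adjust SRC/DEST and the chunk size for your setup.
  SRC=/var/backups/app/export.bak        # whatever the application's own backup produces
  DEST=/var/backups/daily/$(date +%a)    # one folder per day of the week (7 folders)

  mkdir -p "$DEST"
  rm -f "$DEST"/*

  # (a) if the export were compressed, store it uncompressed instead, e.g.:
  #     gunzip -c export.bak.gz > "$SRC"

  # (b) split anything large into ~50MB chunks so unchanged chunks pool nicely
  split -b 50M "$SRC" "$DEST/export.part."

  # (d) freshness stamp for the Xymon check (modified within 26 hours)
  touch "$DEST/.last-run"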


2) Disk-level backups (a rough sketch follows this list):
a) shut down the virtual machine (for VMs)
b) split the VM disk into chunks in another folder (local)
c) copy the chunks to a remote machine with rsync
d) the remote machine examines the chunks to ensure it has a full,
up-to-date copy, and then re-assembles the chunks into a disk image,
preserving the previous disk image as .old
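
Roughly like this (the virsh commands, paths and chunk size are
placeholders for whatever your setup uses, not my exact scripts):

  # sending side
  virsh shutdown myvm             # stop the VM cleanly (and wait until it has stopped)
  split -b 100M /var/lib/vms/myvm.img /var/backups/myvm.chunks/part.
  virsh start myvm                # bring the VM back as soon as the split is done
  rsync -av /var/backups/myvm.chunks/ backup@remote:/srv/chunks/myvm/

  # receiving side, once the rsync has completed successfully
  mv /srv/images/myvm.img /srv/images/myvm.img.old
  cat /srv/chunks/myvm/part.* > /srv/images/myvm.img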

BTW, the reasons to split large files into smaller chunks are:
1) if a chunk has no changes, BackupPC gets to take advantage of pooling
2) BackupPC doesn't handle large files with small changes very well
(performance-wise, in my experience); it seems to handle a totally new
file, or small files with changes, perfectly well
3) if the rsync fails, it can be re-started, and the completed chunks are
compared and skipped much more quickly (you could use --partial instead)
4) you don't need to wait for the remote rsync to make a temp copy of
the large file before it starts to compare/transfer the file (you could
use --inplace instead)

So rsync does have options for this (--inplace and --partial, shown
below), but I feel the chunk method is still more efficient from a time,
bandwidth and disk I/O point of view.
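
For comparison, transferring the whole image as a single file would look
something like this (paths are made up):

  # --partial keeps a partially transferred file so a restart can resume it;
  # --inplace updates the destination file directly instead of building a
  # temporary copy first (so you lose the old image unless you copy it
  # aside beforehand)
  rsync -av --partial --inplace /var/lib/vms/myvm.img backup@remote:/srv/images/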

I originally used LVM2 snapshots to back up the live VMs (roughly the
approach sketched below), but had two issues with this:
1) Potentially the image I've backed up is corrupted, since it is "point
in time" and therefore not "cleanly unmounted"
2) I had dreadful disk I/O performance (the VM, the snapshot space, and
the destination temp copy were ALL stored on the same RAID10 array; this
might have been better if a different disk was used for the
destination). Either way, I found the customer was happy to shut down
the machines for a backup window each night.
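
For reference, the snapshot approach was along these lines (the volume
group, names and sizes are examples only):

  # take a point-in-time snapshot of the running VM's logical volume
  lvcreate -s -L 10G -n myvm_snap /dev/vg0/myvm
  # copy the snapshot out as an image, then drop the snapshot
  dd if=/dev/vg0/myvm_snap of=/var/backups/myvm.img bs=4M
  lvremove -f /dev/vg0/myvm_snap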

I hope some of the above helps...

Regards,
Adam

-- 
Adam Goryachev
Website Managers
www.websitemanagers.com.au

