BackupPC-users

Re: [BackupPC-users] What file system do you use?

2013-12-17 18:28:41
From: Adam Goryachev <mailinglists AT websitemanagers.com DOT au>
To: backuppc-users AT lists.sourceforge DOT net
Date: Wed, 18 Dec 2013 10:26:51 +1100
On 18/12/13 03:35, Timothy J Massey wrote:
Russell R Poyner <rpoyner AT engr.wisc DOT edu> wrote on 12/17/2013 11:12:07 AM:

> This is a poor comparison since we have different data sets, but it
> would appear that BackupPC's internal dedupe and compression is
> comparable to, or only slightly worse than what zfs achieves. This in
> spite of the expectation that zfs block level dedupe might find more
> duplication than BackupPC's file level dedupe.

It all depends on the type of files you're backing up.

For my database and Exchange servers, I'd do bodily harm for block-level de-dupe.  Exchange is the *worst*:  I end up with huge (tens or hundreds of GB) monolithic files that are 99.9% identical to the previous day's backup.  BackupPC won't do me a bit of good on those files, but block-level dedupe would.

However, with "normal" files, file-level dedupe (like BackupPC's) gives you a very high percentage of what block-level dedupe would achieve.

I'm sure I've said this on-list before, but here it is again....

Whenever I need to back up large files (e.g. disk images or database export files), I create a small script which:
1) Takes the large file as input.
2) Uncompresses the file if it was exported in a compressed format.
3) Decides which "day" directory to use: sometimes day of week (0 - 6), sometimes a number that cycles (between 1 and 2, or from 1 to 10), or in the worst case just a single folder.
4) Splits the large file into small chunks (around 20MB; depending on the overall file size, I might use 100MB chunks).
5) Confirms the total size of the chunks equals the size of the input file.
6) Removes the input file (depending on its size, available space, and whether the filename is always the same or changes daily, etc).
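The steps above can be sketched roughly as follows. This is only an illustration, not the actual script: the function name, paths, and chunk sizes are hypothetical examples.

```shell
#!/bin/sh
# chunk_file FILE DEST_DIR [CHUNK_SIZE] - hypothetical sketch of steps 1-6
chunk_file() {
    input=$1                 # 1) the large file to back up
    dest=$2                  # 3) caller picks the "day" directory
    size=${3:-20M}           # 4) 20MB chunks; maybe 100M for huge files

    # 2) uncompress first if the export is gzipped
    #    (compressed output changes wholesale each day, defeating dedupe)
    case "$input" in
        *.gz) gunzip "$input"; input=${input%.gz} ;;
    esac

    mkdir -p "$dest"
    rm -f "$dest"/chunk.*

    # 4) split into fixed-size chunks named chunk.aa, chunk.ab, ...
    split -b "$size" "$input" "$dest/chunk."

    # 5) confirm the chunks add up to the input file before deleting it
    orig=$(wc -c < "$input")
    sum=$(cat "$dest"/chunk.* | wc -c)
    if [ "$orig" -eq "$sum" ]; then
        rm -f "$input"       # 6) safe to remove the original
    else
        echo "chunk size mismatch, keeping $input" >&2
        return 1
    fi
}

# Step 3, e.g. using day of week (1-7) as the directory name:
# chunk_file /var/exports/dump.sql "/var/backups/chunks/$(date +%u)"
```

BackupPC then backs up the chunk directory like any other tree, and unchanged chunks pool against the previous day's copies.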

Especially with disk images, but also with SQL DB exports (both MySQL and MS SQL), most of the "chunks" are identical from day to day, allowing BackupPC's de-dupe to work on most of them. In addition, BackupPC seems to handle many small files with small changes significantly faster than one large file with small changes.

Of course, it is simple to re-constitute the file on the remote server. If you need to restore "last night's" backup, you don't even need to go through BackupPC: the chunks are still on the remote server and just need to be joined back together.
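Re-joining is just concatenation. split(1) names chunks with lexically sorted suffixes, so a plain glob restores the original order; the paths below are hypothetical examples:

```shell
#!/bin/sh
# Hypothetical example: re-assemble last night's dump from its chunks.
# split(1) suffixes sort alphabetically (chunk.aa, chunk.ab, ...),
# so the shell glob concatenates them in the original order.
cat /var/backups/chunks/1/chunk.* > /var/restore/dump.sql
```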

Regards,
Adam

--
Adam Goryachev Website Managers www.websitemanagers.com.au
_______________________________________________
BackupPC-users mailing list
BackupPC-users AT lists.sourceforge DOT net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/