Bacula-users

[Bacula-users] efficient disk backups

2009-07-13 07:08:37
Subject: [Bacula-users] efficient disk backups
From: Gavin McCullagh <gavin.mccullagh AT gcd DOT ie>
To: bacula-users AT lists.sourceforge DOT net
Date: Mon, 13 Jul 2009 12:04:45 +0100
Hi,

Up until now, we've tended to keep backups in a fairly ad hoc manner.
People looking after a particular system have worked out their own way, be
it a proprietary backup tool, or a script of some sort.  We've started
setting up Bacula and I hope we'll be in a position to back up nearly every
system with it, which will have substantial advantages.

For several of the larger systems, the script used is a standard enough
combination of rsync and hard links.  It's based on ideas used here:

        http://www.mikerubel.org/computers/rsync_snapshots/

As there is no single "full backup", you don't need to keep several full
backups; you simply delete the trees of old backups you no longer need.
A number of our servers tend to gradually accumulate files, most of which
then go unchanged (e.g. maildirs, video libraries, ...), so this backup
method tends to be very space efficient.

One server has about 300GB of data. We keep 31 consecutive days and the 1st
of each month prior to that.  This costs us about 450GB of disk space.  
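
A minimal sketch of that retention rule, for illustration only. It is not
the actual script; the function name and the `daily_window` parameter are
hypothetical, and it assumes one snapshot per calendar day:

```python
from datetime import date, timedelta

def snapshots_to_keep(snapshot_dates, today, daily_window=31):
    """Keep the last `daily_window` days, plus any older snapshot
    taken on the 1st of a month (assumed rule, as described above)."""
    # Snapshots strictly newer than the cutoff fall inside the daily window.
    cutoff = today - timedelta(days=daily_window)
    keep = []
    for d in sorted(snapshot_dates):
        if d > cutoff or d.day == 1:
            keep.append(d)
    return keep

# Example: daily snapshots from 1 Jan 2009 up to the date of this mail.
all_days = [date(2009, 1, 1) + timedelta(days=i) for i in range(194)]
kept = snapshots_to_keep(all_days, today=date(2009, 7, 13))
```

Everything not returned by the function is a snapshot tree that could be
deleted.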

I'm now looking at setting up Bacula for this backup -- initially using
disk storage.  As a starting point, looking at chapter 25 of the manual, it
would cost about (300GB*6)*(compression ratio) just for the full backups
which is a little rough and probably involves a very large amount of
redundancy.  While SATA disks are pretty cheap, caddies for our Dell MD1000
disk array aren't :-(

To try and reduce the space requirements, I'm considering more spread out
schemes such as:

 - full backups on the first Sunday of each quarter to fullvol-[123]
   -> recycled after 6 months
 - differential backups on the first Sunday of the other months to
   diffvol-[1234]
   -> recycled after 3 months
 - incremental backups on all other days to incvol-1
   -> recycled at the end of each month

which I think should cost us more like (300GB*3+diffs+incs)*(comp_ratio).
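
The two estimates can be compared with a rough calculation such as the
one below. This is only a sketch: the function names are hypothetical,
and the compression ratio and the differential/incremental sizes are
placeholder guesses, not measured values.

```python
def full_only_estimate_gb(data_gb, full_copies, comp_ratio):
    """Space for full backups alone: (data * copies) * compression ratio."""
    return data_gb * full_copies * comp_ratio

def spread_estimate_gb(data_gb, full_copies, diffs_gb, incs_gb, comp_ratio):
    """Space for the spread-out scheme: (fulls + diffs + incs) * ratio."""
    return (data_gb * full_copies + diffs_gb + incs_gb) * comp_ratio

DATA_GB = 300       # from the server described above
COMP_RATIO = 0.7    # assumption: compressed size / original size
DIFFS_GB = 120      # assumption: total size of retained differentials
INCS_GB = 60        # assumption: total size of retained incrementals

six_fulls = full_only_estimate_gb(DATA_GB, 6, COMP_RATIO)
spread = spread_estimate_gb(DATA_GB, 3, DIFFS_GB, INCS_GB, COMP_RATIO)
print(f"six fulls: {six_fulls:.0f} GB, spread-out scheme: {spread:.0f} GB")
```

With real figures for the daily change rate and compression ratio plugged
in, the same two functions show how much the quarterly scheme would save.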

Are there pitfalls in spreading things out this far?  We may move to tape
at some point (either spooling or migration), but I don't have a budget to
buy LTO4 tape drives at the minute.  Is there some other technique I'm
missing that would more efficiently store these larger data stores?

Many thanks in advance,
Gavin

