Bacula-users

Re: [Bacula-users] trying to recreate a retrospect (I know...) backup strategy

From: shouldbe q931 <shouldbeq931 AT googlemail DOT com>
To: Arno Lehmann <al AT its-lehmann DOT de>
Date: Fri, 19 Feb 2010 23:35:44 +0000
As always, you're very helpful :-)


On Fri, Feb 19, 2010 at 10:09 PM, Arno Lehmann <al AT its-lehmann DOT de> wrote:
> Hello,
>
> 19.02.2010 22:09, shouldbe q931 wrote:
>> Hi All,
>>
>> The Retrospect backup had two sets of "incremental archive" tapes,
>> that ran for a year, set A in the 1st week, set B in the 2nd then set
>> A in the 3rd week etc. Independently to this, at weekends there were a
>> different set of tapes for doing full backups that rotated each week.
>
> So far quite straightforward. I still don't see what advantage you
> have to change tape sets on a weekly basis, but if you like to, why not...

The full backups each week are to provide a "quick restore" of the
current state of the server, which would be at worst 1 week old. But
as we now have a daily mirror to disk with 2 weeks of incrementals,
we could forget about them if we were able to move the mirror offsite
(not likely to happen for a few months).

The two sets of "archive incrementals" are to allow one set to be
stored offsite, and the other to be used for retrievals.

If you think of it as _very_ basic HSM: if a set of files is deleted,
they can always be recovered, provided they were on the server for at
least 24 hours or 7 days, depending on which set. We looked at using
a new tape each week, but it would get quite expensive in LTO3 tapes,
and in storage space for them...

Having two independent sets allows one set to be lost without losing
all of the data that was on tape.

>
>> I'm struggling with how to recreate this in Bacula.
>
> I suggest you should look at it from a higher level point of view:
> What do you want to achieve, and how can you most easily achieve it?
> One of the most important things in a backup setup, in my opinion, is
> to keep it as simple as possible.
>
>> My sticking points
>> are how to not "invalidate" the incremental sets with each other, and
>> how if a restore is needed from the incremental archive, it would only
>> search through the incremental archive, and not point to one of the
>> full backups,
>
> The latter idea can't work - an incremental only backs up changes
> against the previous backups, and you'll always end up with the latest
> full backup in the chain. Otherwise they wouldn't be incrementals.
>
Retrospect allowed this to happen by having a catalog for each set.
Each Monday incremental contained all of the changes from the previous
week.

>>  would I be able to purge or prune each tape before it
>> was re-used for the full backup, or would I also have to "purge files
>> from job" of the previous job they were used for ?
>
> Why would you want to intentionally make data inaccessible?

The idea of pruning/purging those records is twofold: to prevent
Bacula from attempting to restore from tapes that have been re-used
when recovering a file that should come from the "incremental
archive", and to reduce the size of the database (they want a year's
worth of files), with a new database started the next year. As an
aside, the Retrospect catalog couldn't quite make it to 12 months
before it hit a 2 GB hard limit, so they moved to 6-month cycles. The
file server has ~3 TB of data (quite a lot of big Photoshop files)
across ~350,000 files.

>
>> I was thinking that I needed three catalogs, but as the catalog is per
>> client, that wouldn't work. Would having three differently named, but
>> identically configured file sets work ?
>
> I really don't know why you'd need more than one catalog for this.
>
>> Or am I going to need to run three instances of the file daemon to get
>> three catalogs, and if so, I'd really appreciate somebody pointing me
>> in the direction of  how to do this.
>
> God no... stop! ;-)
>
> I'm just guessing what you really want, but this is my idea:
>
> Create pools EvenFull, OddFull, EvenIncr, OddIncr. Set retention times
> etc. as you want them. Use time should be six days.
>
> Schedule jobs like this:
>
> run level=full pool=EvenFull w01, w03, w05, w07, ..., w53 Sun at 22:00
> run level=incremental pool=EvenIncr w00, w02, w04, ..., w52 Mon-Fri at
> 22:00
> run level=full pool=OddFull w00, w02, w04, ..., w52 Sun at 22:00
> run level=incremental pool=OddIncr w01, w03, w05, ..., w53 Mon-Fri at
> 22:00
>
> If you can explain why I use the EvenFull pool on Sundays of the odd
> weeks you know what I'm aiming at :-)
>
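If I've read the manual right, those four Run lines would go into a
single Schedule resource, something like this (an untested sketch;
the pool names are Arno's, and I'm assuming the wNN week tokens can
be comma-listed, with each week written out explicitly since I don't
see an even/odd shorthand):

```conf
# Untested sketch of the alternating-set schedule.
# Each wNN week must be listed in full; the lists are abbreviated here.
Schedule {
  Name = "AlternatingSets"
  Run = Level=Full Pool=EvenFull w01,w03,w05 sun at 22:00            # ... through w53
  Run = Level=Incremental Pool=EvenIncr w00,w02,w04 mon-fri at 22:00 # ... through w52
  Run = Level=Full Pool=OddFull w00,w02,w04 sun at 22:00             # ... through w52
  Run = Level=Incremental Pool=OddIncr w01,w03,w05 mon-fri at 22:00  # ... through w53
}
```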
I think I follow your schedule and use of pools

Mon (Einc) Tue (Einc) Wed (Einc) Thur (Einc) Fri (Einc) Sun (Ofull)
Mon (Oinc) Tue (Oinc) Wed (Oinc) Thur (Oinc) Fri (Oinc) Sun (Efull)
etc

Einc and Oinc would have a retention period of $infinite, Efull and
Ofull would have a retention period of 2 weeks
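In pool terms, my reading of the manual (not tested) is that the
"infinite" archive sets and the 2-week full rotation would look
something like this; the directive names are from the Pool resource
documentation, the values are my guesses:

```conf
# Sketch of the incremental-archive and full-rotation pools.
# EvenIncr/EvenFull shown; OddIncr/OddFull would be identical.
Pool {
  Name = EvenIncr
  Pool Type = Backup
  Volume Use Duration = 6 days    # force a fresh volume each cycle
  Volume Retention = 100 years    # effectively infinite
  AutoPrune = no
  Recycle = no                    # archive set: never overwrite
}
Pool {
  Name = EvenFull
  Pool Type = Backup
  Volume Use Duration = 6 days
  Volume Retention = 2 weeks
  AutoPrune = yes
  Recycle = yes                   # fulls rotate every two weeks
}
```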

However, as I understand it, this would effectively be a single tape
pool that between the four sets covered the whole year, but any piece
of data would only ever be on one tape. There would be no
duplication, and it's the duplication that they are after. That
desire comes from previously using DDS, AIT and SAIT, which had
considerably lower reliability than DLT or LTO. The new library is
LTO3, and I would hope they won't suffer anything like the same
quantity of failed tapes, but once bitten, twice shy...

As an alternative, could I run/spool incrementals to disk, and then
copy/despool/merge them to two tape pools? That way they would get
their data on two sets of tapes.
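My (untested) understanding of Bacula 3.x Copy jobs is that the
destination comes from the source pool's Next Pool, so the first tape
copy might look like this (pool and job names are made up; the
directives are from the Copy-job documentation):

```conf
# Untested sketch: incrementals land in a disk pool, then a Copy job
# duplicates them to a tape pool.
Pool {
  Name = DiskIncr
  Pool Type = Backup
  Next Pool = TapeIncrA           # destination for the Copy job below
}
Job {
  Name = "CopyIncrToTapeA"
  Type = Copy
  Selection Type = PoolUncopiedJobs
  Pool = DiskIncr                 # source pool; destination = its Next Pool
  # plus the usual required Job directives (Client, FileSet,
  # Storage, Messages), omitted here
}
```

As far as I can tell, PoolUncopiedJobs marks a job as copied once, so
a second Copy job to a TapeIncrB pool would need a different
Selection Type (e.g. SQLQuery) - which may be the catch in this
approach.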


> So, more abstract answer:
> Don't play with catalogs, multiple FD instances, forced pruning or
> purging - use pools and schedules. Fine-tuning the pool settings can
> be challenging if your jobs tend to run very long, but that's only a
> technical problem...
>
> Cheers,
>
> Arno
>
>
>> Many thanks
>>
>> Arne
>>
>> ------------------------------------------------------------------------------
>> Download Intel® Parallel Studio Eval
>> Try the new software tools for yourself. Speed compiling, find bugs
>> proactively, and fine-tune applications for parallel performance.
>> See why Intel Parallel Studio got high marks during beta.
>> http://p.sf.net/sfu/intel-sw-dev
>> _______________________________________________
>> Bacula-users mailing list
>> Bacula-users AT lists.sourceforge DOT net
>> https://lists.sourceforge.net/lists/listinfo/bacula-users
>>
>
> --
> Arno Lehmann
> IT-Service Lehmann
> Sandstr. 6, 49080 Osnabrück
> www.its-lehmann.de
>
>
