Re: [Bacula-users] Starting again with my bacula config...

2014-06-09 04:42:47
From: Kern Sibbald <kern AT sibbald DOT com>
To: Steven Haigh <netwiz AT crc.id DOT au>, "Bacula-users AT lists.sourceforge DOT net" <bacula-users AT lists.sourceforge DOT net>
Date: Mon, 09 Jun 2014 10:38:41 +0200

Hello,

On 06/08/2014 12:09 PM, Steven Haigh wrote:
> I do believe this is one of the biggest shortcomings of Bacula... The
> fact it is job based vs file based removes a lot of flexibility.
This difference from TSM is in fact one of the great accidental
inventions of Bacula. In one extreme example, a large European site
runs 12 instances of Tivoli.  These 12 instances are needed because
Tivoli requires so many resources to do the backup (a *very* big data
volume, but not a huge number of clients).  Bacula can do the same
thing with one instance of the Bacula Director, because Bacula is
*far* more efficient in dealing with individual files than TSM is.

As a result, I don't see this design difference as a shortcoming but
rather a major advantage.

>
> If I understand things properly, a VirtualFull:
> 1) requires all volumes as stated below; and
> 2) requires enough space to write the entire backup out again; and
> 3) is unable to keep a copy of a file forever if it is never changed.
For point 1), note that it requires only the Volumes where some file is
uniquely stored.  That is, if Incremental JobId 10 is one of the
Incremental jobs and every file that JobId 10 wrote has been modified
since then, that JobId will not be re-read; provided there are no other
jobs on its Volume, that Volume will probably not be needed.  Put
another way, Bacula only needs the Volumes associated with the last
time any given file was changed.
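
To illustrate the point about Volumes, here is a minimal Python sketch.
It is not Bacula code: the `Job` record, the Volume names, and the file
lists are hypothetical, and it assumes one job per Volume and ignores
deleted files. It only models the rule that the Volumes needed are those
holding the most recent copy of at least one file.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    job_id: int        # hypothetical JobId; higher means more recent
    volume: str        # hypothetical Volume name the job was written to
    files: frozenset   # paths backed up by this job


def volumes_needed(jobs):
    """Return the Volumes that hold the newest copy of at least one file."""
    latest = {}  # path -> Job holding its newest copy
    for job in sorted(jobs, key=lambda j: j.job_id):
        for path in job.files:
            latest[path] = job  # a later job supersedes earlier copies
    return {job.volume for job in latest.values()}


# Made-up job chain: one Full followed by two Incrementals.
jobs = [
    Job(1, "Vol-Full", frozenset({"/etc/hosts", "/home/a.txt", "/home/b.txt"})),
    Job(10, "Vol-Inc-10", frozenset({"/home/a.txt"})),
    Job(11, "Vol-Inc-11", frozenset({"/home/a.txt", "/home/b.txt"})),
]
print(sorted(volumes_needed(jobs)))  # ['Vol-Full', 'Vol-Inc-11']
```

In this made-up data, everything JobId 10 wrote was rewritten by JobId 11,
so `Vol-Inc-10` does not appear in the result.
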
>
>
> Instead, after the purge date, the file is deleted and retransferred -
> unless it is done by a VirtualFull - which still has the problems of #1
> and #2 above.
I guess I could consider #1 a problem. However, #2 is certainly not a
disadvantage. It is *exactly* what is desired because we want to be able
to free up the old fragmented backup storage and thus the old volumes.
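
To put rough numbers on #2, a small Python sketch follows. All sizes are
hypothetical, and it assumes the old Full and Incremental volumes are
purged or recycled once the VirtualFull has completed, which depends on
your retention settings.

```python
def virtualfull_space(full_gb, incrementals_gb, consolidated_gb):
    """Return (before, peak, after) disk usage in GB for one consolidation."""
    before = full_gb + sum(incrementals_gb)  # old Full plus its Incrementals
    peak = before + consolidated_gb          # old and new data coexist briefly
    after = consolidated_gb                  # assumes the old volumes are freed
    return before, peak, after


# Hypothetical sizes: a 20 GB Full, three Incrementals, a 21 GB VirtualFull.
before, peak, after = virtualfull_space(20, [1, 1, 2], 21)
print(before, peak, after)  # 24 45 21
```

The extra space is only needed at the peak; once the old volumes are
released, usage drops below where it started.
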
>
>
> As such, I'm not sure that I can easily achieve my goals with Bacula.
> I'm still not exactly sure as to what my other alternatives are as yet.
>
> I currently have an rsync going between hosts and create a copy of
> backups with hard links to minimise space used but still get a
> consistent view of each host (rotating daily). Maybe coupling this with
> a filesystem that supports compression would assist in making more space
> available for backups...
>
> As I've been used to TSM for so long (many years now!), I got used to
> how it works - and I'm having trouble moving on! :)

In general, Bacula can do everything that TSM can (the overall
architecture is very similar).  However as with every other backup
product, each has its own way of accomplishing certain tasks.  To
succeed you have to learn the Bacula way of doing things and not try to
force it to fit into any preconceived way of doing things.

>
>
>
> On 08/06/14 19:32, Kern Sibbald wrote:
>>
>> Hello,
>>
>> To do a VirtualFull you do need to have all backups since the last Full
>> or VirtualFull available.
>>
>> I recommend against production use of SQLite, unless you have less than
>> 10 machines.
>>
>> Normally there is no reason why an instance of MySQL/PostgreSQL cannot
>> be put in a VM that is running Bacula -- I do that all the time.
>>
>> Best regards,
>> Kern
>>
>> On 06/08/2014 11:04 AM, Steven Haigh wrote:
>>> Hi all,
>>
>>> The one thing I can see tripping me up is that from what I understand,
>>> for a VirtualFull I will need access to ALL jobs since the last
>>> VirtualFull. In the case of a removable eSATA drive that won't be online
>>> all the time, I can't guarantee that access will be available to that
>>> drive.
>>
>>> At this stage, I have been using SQLite - simply to keep the entire
>>> system contained. I do have a MySQL server available - but the idea is
>>> to keep the backup system contained to a single VM.
>>
>>> The ponderings of which direction to go is difficult :)
>>
>>> On 08/06/14 18:31, Kern Sibbald wrote:
>>>>
>>>> Hello,
>>>>
>>>> I cannot help you with your overall design because I am more effective
>>>> writing new code than helping with implementations (very important).
>>>> However a couple of points, which are my personal opinions:
>>>>
>>>> 1. Your setup is medium size and would work fine with MySQL, but if you
>>>> can accept a short-term learning curve and would like long-term peace of
>>>> mind, I would use PostgreSQL to avoid performance problems later. It is
>>>> harder to set up correctly and tune in the beginning (performance is
>>>> pretty bad with the out-of-the-box PostgreSQL configuration), but
>>>> long term it performs *much* better than MySQL in big installations.
>>>>
>>>> 2. You need to set up incrementals forever carefully, but Bacula has
>>>> supported that feature from the beginning, and if you take the time to
>>>> understand and use Virtual Full jobs and accurate backups (at least once
>>>> a week), incrementals forever can be much more efficient than
>>>> normal backups.  If you don't use accurate mode (at least occasionally)
>>>> and VirtualFulls, stay away from incrementals forever.
>>>>
>>>> 3. I also recommend using the Bacula "virtual" autochanger for disk
>>>> based systems. It is very robust and simple, but there is not a lot of
>>>> documentation on it.
>>>>
>>>> Best regards,
>>>> Kern
>>>>
>>>> On 06/08/2014 05:00 AM, Steven Haigh wrote:
>>>>> Hi guys,
>>>>
>>>>> So I'm starting from scratch again with my bacula config. I thought I'd
>>>>> try to get some pointers before I dive in head first again.
>>>>
>>>>> My setup consists of multiple virtual machines. Some over GigE, some
>>>>> over an ADSL connection (6000/800kbit). My aim is to transfer as little
>>>>> as possible over the ADSL connection - but enough to be able to restore
>>>>> if required.
>>>>
>>>>> I would like to use some local disk storage (say 40Gb), and have the
>>>>> rest go to a removable external eSATA drive. I'm thinking this could
>>>>> mainly be done via job migration when the internal storage starts to get
>>>>> full.
>>>>
>>>>> As some insight, my current setup has ~167 daily incremental backups and
>>>>> has used under 11Gb of space on the 'on disk' volumes. The amount of
>>>>> data changed per day isn't really huge.
>>>>
>>>>> Some more specific questions:
>>>>> 1) I want to try and avoid vchanger and use something that can use the
>>>>> eSATA drive properly - grow the number of volumes automatically to fill
>>>>> the entire eSATA drive. Bonus points for being able to just plug in a
>>>>> new eSATA drive and expand further.
>>>>
>>>>> 2) From my previous posts, I heard that daily incrementals forever may
>>>>> be a bad idea - the whole job based backup vs the file based backup that
>>>>> I'm used to with TSM. What would be the suggested route for backups
>>>>> being done? I obviously don't want to do Full backups over the ADSL
>>>>> connection every week / month.
>>>>
>>>>> 3) I'm starting from scratch with Bacula v7 and all systems are a
>>>>> mixture of RHEL6 and Fedora 20. Are there any gotchas I should be aware
>>>>> of straight up?
>>>>
>>>>> 4) Any general comments? :)
>>>>
>>
>>
>>
>


