Subject: Re: [Bacula-users] seeking advice re. splitting up large backups -- dynamic filesets to prevent duplicate jobs and reduce backup time
From: mark.bergman AT uphs.upenn DOT edu
To: "James Harper" <james.harper AT bendigoit.com DOT au>
Date: Wed, 12 Oct 2011 21:58:28 -0400
In the message dated: Thu, 13 Oct 2011 11:54:47 +1100,
The pithy ruminations from "James Harper" on 
<RE: [Bacula-users] seeking advice re. splitting up large backups -- dynamic
filesets to prevent duplicate jobs and reduce backup time> were:
=> > 
=> > In an effort to work around the fact that bacula kills long-running
=> > jobs, I'm about to partition my backups into smaller sets. For
=> > example, instead of backing up:
=> > 

        [SNIP!]

=> 
=> Does Bacula really kill long running jobs? Or are you seeing the effect

Yes. Bacula kills long-running jobs. See the recent thread entitled:

        Full backup fails after a few days with "Fatal error: Network error
        with FD during Backup: ERR=Interrupted system call"

or see:
        http://www.mail-archive.com/bacula-users@lists.sourceforge.net/msg20159.html
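
(For anyone hitting the same error: the knobs that usually get mentioned in
that context are the director-side run time limit and the client-side
heartbeat. A rough sketch, with placeholder names and values -- I'm not
claiming either is a fix, just noting where I'd look first:

        # bacula-dir.conf -- Job resource (name and limit are placeholders)
        Job {
          Name = "fileserver-home"
          ...
          # let a full run for up to a week before the director gives up on it
          Max Run Time = 7 days
        }

        # bacula-fd.conf -- FileDaemon resource on the client
        FileDaemon {
          Name = fileserver-fd
          ...
          # send periodic keepalives so idle connections aren't silently dropped
          Heartbeat Interval = 60
        }

Whether either of those helps obviously depends on what is actually killing
the connection.)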

=> of something at layer 3 or below (eg TCP connections timing out in
=> firewalls)?
=> 
=> I think your dynamic fileset idea would break Bacula's 'Accurate Backup'
=> code. If you are not using Accurate then it might work but it still
=> seems like a lot of trouble to go to to solve this problem.
=> 

Yeah, it'll also break incremental and differential backups.

I'm not going ahead with this plan, at least not in the current form.

I really, really wish there were a way to prevent multiple bacula jobs (with
different names & filesets) from running concurrently against the same client.
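
(The closest approximation I know of is the per-client job cap in the
director's Client resource -- a sketch, with placeholder names, and I haven't
verified that it behaves sanely when the jobs have different names and
filesets:

        # bacula-dir.conf -- Client resource (names/password are placeholders)
        Client {
          Name = fileserver-fd
          Address = fileserver.example.com
          Password = "notarealpassword"
          Catalog = MyCatalog
          # allow only one job at a time against this client
          Maximum Concurrent Jobs = 1
        }

If that cap really is per-client it might be enough, but I haven't tested it.)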

I've logically split the multi-TB filesets into several bacula jobs. However,
this means that they will run "in parallel" and place a significant load on the
file & backup servers. I've created staggered schedules for full backups (i.e.,
subset 1 is backed up on the 1st Wed of the month, subset 2 on the 2nd Wed,
etc.), but this won't help for the first cycle, as they are 'new' jobs and
bacula will promote the initial incrementals to full backups.
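
For the record, the staggering looks roughly like this (names and times are
placeholders), with one schedule per subset and the fulls on different weeks:

        # bacula-dir.conf -- staggered schedules for the split jobs
        Schedule {
          Name = "Subset1-Cycle"
          Run = Full 1st wed at 23:05
          Run = Incremental 2nd-5th wed at 23:05
          Run = Incremental sun-tue at 23:05
          Run = Incremental thu-sat at 23:05
        }
        Schedule {
          Name = "Subset2-Cycle"
          Run = Full 2nd wed at 23:05
          Run = Incremental 1st wed at 23:05
          Run = Incremental 3rd-5th wed at 23:05
          Run = Incremental sun-tue at 23:05
          Run = Incremental thu-sat at 23:05
        }

        # each split job then points at its own schedule, e.g.
        Job {
          Name = "fileserver-subset1"
          Client = fileserver-fd
          FileSet = "Subset1"
          Schedule = "Subset1-Cycle"
          ...
        }

It only spreads the load once each job has its first full in the catalog,
which is exactly the problem described above.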

=> If you limited the maximum jobs on the FD it would only run one at once,

That doesn't work, as we back up ~20 small machines in addition to the large
(4 to 8TB) filesystems.

=> but if the link was broken it might fail all the jobs.
=> 
=> Another option would be a "Run After" to start the next job. Only the
=> first job would be scheduled, and it would run the next job in turn.
=> Then they would all just run in series. You could even take it a step
=> further and have the "Run After" script to retry the same job if it
=> failed due to a connection problem, and to give up after so many
=> retries. Maybe it could even start pinging the FD to see if it was
=> reachable (if backing up over an unreliable link is the problem you are
=> trying to solve).

An unreliable link isn't the problem. In fact, depending on the state of our
HA cluster, the 'bacula' server may also be the 'file server' client.
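
That said, for the archives, my reading of the "Run After" chaining James
describes would be roughly this (untested; the helper script, paths and job
names are made up):

        # bacula-dir.conf -- each job kicks off the next one when it finishes
        Job {
          Name = "fileserver-subset1"
          ...
          # runs on the director host after the job terminates successfully
          Run After Job = "/etc/bacula/run-next.sh fileserver-subset2"
        }

        #!/bin/sh
        # /etc/bacula/run-next.sh (placeholder) -- this is where the retry /
        # ping-the-FD logic James mentions would live
        echo "run job=$1 yes" | bconsole

Run After Failed Job would presumably be the hook for the retry-on-failure
case.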

Thanks,

Mark

=> 
=> James
=> 

