In the message dated: Thu, 13 Oct 2011 11:54:47 +1100,
The pithy ruminations from "James Harper" on
<RE: [Bacula-users] seeking advice re. splitting up large backups -- dynamic filesets to prevent duplicate jobs and reduce backup time> were:
=> >
=> > In an effort to work around the fact that bacula kills long-running
=> jobs, I'm
=> > about to partition my backups into smaller sets. For example, instead
=> of
=> > backing up:
=> >
[SNIP!]
=>
=> Does Bacula really kill long running jobs? Or are you seeing the effect
Yes. Bacula kills long running jobs. See the recent thread entitled:
Full backup fails after a few days with "Fatal error: Network error
with FD during Backup: ERR=Interrupted system call"
or see:
http://www.mail-archive.com/bacula-users AT lists.sourceforge DOT net/msg20159.html
=> of something at layer 3 or below (eg TCP connections timing out in
=> firewalls)?
=>
=> I think your dynamic fileset idea would break Bacula's 'Accurate Backup'
=> code. If you are not using Accurate then it might work but it still
=> seems like a lot of trouble to go to to solve this problem.
=>
Yeah, it'll also break incremental and differential backups.
I'm not going ahead with this plan, at least not in its current form.
I really, really wish there were a way to prohibit multiple bacula jobs (of
different names & filesets) from accessing the same client concurrently.
I've logically split the multi-TB filesets into several bacula jobs. However,
this means that they will run "in parallel" and place a significant load on the
file & backup servers. I've created staggered schedules for full backups (i.e.,
subset 1 is backed up on the 1st Wed of the month, subset 2 on the 2nd Wed,
etc.), but this won't help at first, as they are 'new' jobs, and bacula will
promote the initial incrementals to full backups.
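For the curious, the staggered schedules are along these lines (a sketch; the
resource names and times are made up, only the "Nth wed" syntax is stock
Bacula):

```
# Sketch of staggered Schedule resources -- names and times are
# illustrative. Note the catch described above: the first run of each
# *new* job is promoted to Full regardless of the level requested here.
Schedule {
  Name = "Subset1-Cycle"
  Run = Level=Full 1st wed at 23:05
  Run = Level=Incremental mon-sat at 23:05
}
Schedule {
  Name = "Subset2-Cycle"
  Run = Level=Full 2nd wed at 23:05
  Run = Level=Incremental mon-sat at 23:05
}
```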
=> If you limited the maximum jobs on the FD it would only run one at once,
That doesn't work, as we back up ~20 small machines in addition to the large (4
to 8TB) filesystems.
=> but if the link was broken it might fail all the jobs.
=>
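(For reference, the FD-side limit mentioned above is a single directive in the
client's bacula-fd.conf; the daemon name below is illustrative:)

```
# bacula-fd.conf on the client -- serialize all jobs that reach this
# FD. The Name value here is made up for the example.
FileDaemon {
  Name = fileserver-fd
  Maximum Concurrent Jobs = 1
}
```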
=> Another option would be a "Run After" to start the next job. Only the
=> first job would be scheduled, and it would run the next job in turn.
=> Then they would all just run in series. You could even take it a step
=> further and have the "Run After" script to retry the same job if it
=> failed due to a connection problem, and to give up after so many
=> retries. Maybe it could even start pinging the FD to see if it was
=> reachable (if backing up over an unreliable link is the problem you are
=> trying to solve).
Not the problem. In fact, depending on the state of our HA cluster, the
'bacula' server may also be the 'file' server client.
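For anyone who does want to try the chaining James describes, a rough sketch
(job/fileset/schedule names are hypothetical; "Run After Job" and bconsole are
standard Bacula, the rest is illustrative and assumes bconsole is available
where the script runs):

```
# Only the first subset job is on a Schedule; each job queues the next
# one when it finishes. All names here are hypothetical.
Job {
  Name = "backup-subset1"
  JobDefs = "DefaultJob"
  FileSet = "Subset1"
  Schedule = "MonthlyFull"
  # On completion, ask the Director to queue the next subset.
  Run After Job = "/bin/sh -c 'echo run job=backup-subset2 yes | bconsole'"
}
```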
Thanks,
Mark
=>
=> James
=>
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users