Subject: [Bacula-users] Some Operational Questions: Backing up lots of stuff
From: "K. M. Peterson" <kmp.lists+bacula-users AT gmail DOT com>
To: "bacula-users AT lists.sourceforge DOT net" <bacula-users AT lists.sourceforge DOT net>
Date: Fri, 14 Aug 2009 17:16:29 -0400

Hi everyone,

I have some strategy questions.

We've been using Bacula for about 18 months, backing up ~3.5TB/week to a DLT-S4 drive in a Quantum SuperLoader.  We are still on 2.2.8, but will be upgrading to 3.x this fall.

Thanks to everyone on this list, and to the Bacula team, for an excellent product.

We have a Network Appliance filer (previously sold under their "StoreVault" division), with both CIFS and NFS exports.  Backing up the NFS mounts from our backup server is slow: 3-5MB/sec.  Since we don't have spare hardware to mount and back up the CIFS shares directly, I found that we can get ~20MB/sec by constructing a pipe on the server using rsh and the NetApp dump command.  Of course, all Bacula sees is a single dump file, so we need a similar rsh arrangement with the restore command to get anything back, but that's fine.  I want to start backing up the NFS-native trees this way too.
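For the curious, the pipe is the standard FIFO recipe.  Here's a sketch - the FIFO path, the "filer" hostname, and the volume name are all placeholders, and the Job obviously needs the usual Client/Storage/Pool directives as well:

  # bacula-dir.conf: the FD reads the dump stream from a FIFO
  # (created beforehand with mkfifo)
  FileSet {
    Name = "NetApp-vol0-dump"
    Include {
      Options {
        signature = MD5
        readfifo = yes
      }
      File = /var/bacula/netapp.fifo
    }
  }

  Job {
    Name = "NetApp-vol0"
    FileSet = "NetApp-vol0-dump"
    # the FD blocks reading the FIFO until a writer appears, so the
    # RunBeforeJob script must start the dump in the background
    RunBeforeJob = "/etc/bacula/scripts/start-dump.sh vol0"
  }

where start-dump.sh is essentially:

  #!/bin/sh
  # push a level-0 dump of /vol/$1 into the FIFO over rsh, and
  # return immediately so the backup job can proceed
  rsh filer dump 0f - "/vol/$1" > /var/bacula/netapp.fifo &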

However, backing up the whole thing as one backup job is problematic.  It takes a long time, it's opaque, and it's the 600 lb (272kg) gorilla in the backup workflow.  And a restore is going to be even more painful from a single backup job of the root of the device.

I should point out that I currently have scripts that run through a list of CIFS shares, set up the rsh jobs and pipes, and generate a report of what got backed up, when, and how.  It's still one job, though, even though each share is a separate "file" in Bacula.  That's a problem because these jobs create snapshots when they are submitted, so snapshots sit around for the entire duration of the job, and I'm never sure whether they'll be cleaned up properly if the job gets canceled.  And if it does get canceled, I have to re-run everything.  Painful.
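The cleanup I'm experimenting with is a RunScript stanza inside the Job resource, along these lines - though I haven't verified that a canceled (as opposed to failed) job actually fires it, which is exactly my worry:

  RunScript {
    RunsWhen = After
    RunsOnFailure = yes    # run the cleanup even when the job fails
    # cleanup-snapshots.sh is my own wrapper around the filer's
    # snapshot commands; %n expands to the job name
    Command = "/etc/bacula/scripts/cleanup-snapshots.sh %n"
  }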

This isn't the real question, though I'd love it if someone has something I haven't thought of.  The real question is more general: I need a way to dynamically create jobs.  I really want one job per filesystem - but what's the consensus on the best way to do this?  Should I just write Job and FileSet definitions to a file that's @included in my Director's config, then issue a reload?  Is there an API I've missed?  Is there something in 3.x that will make this better?  I want something as transparent as possible, set up so that when a new share/export gets created on the filer, the backups get updated automatically.  I can run something from cron or a RunBeforeJob, but it just seems wrong.  (By the way, it would be cool to have a plugin that would take the output of 'tar' or 'dump' and feed it to Bacula as if it were coming from an FD, so Bacula would store the files and metadata... but I digress.)
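To make the first option concrete, here's roughly what I'm picturing - all names hypothetical ("filer-list-shares" stands in for however you enumerate the shares, and "NetAppDefaults" is a JobDefs carrying the usual Client/Storage/Pool):

  #!/bin/sh
  # regenerate one Job + FileSet per share, then tell the Director
  # to re-read its config; bacula-dir.conf pulls this file in with:
  #   @/etc/bacula/netapp-jobs.conf
  OUT=/etc/bacula/netapp-jobs.conf
  : > "$OUT"
  for share in $(filer-list-shares); do
    cat >> "$OUT" <<EOF
  Job {
    Name = "NetApp-$share"
    JobDefs = "NetAppDefaults"
    FileSet = "NetApp-$share"
  }
  FileSet {
    Name = "NetApp-$share"
    Include {
      Options {
        signature = MD5
        readfifo = yes
      }
      File = /var/bacula/fifos/$share.fifo
    }
  }
  EOF
  done
  echo reload | bconsole

It would work, but it feels like plumbing around Bacula rather than with it.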

I know I can dynamically define a FileSet.  But, again, what I need is a more granular way to break down a large job.  I can figure out how to kludge it - and I've shown the current NetApp backup setup to a few people who've suggested I should get some therapy - but I'm at the point where I think I need to stop and ask for directions.
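By "dynamically define a FileSet" I mean the program form of File=, something like this (script name hypothetical):

  FileSet {
    Name = "NetApp-Dynamic"
    Include {
      Options {
        signature = MD5
      }
      # the | prefix makes the Director run this program when the
      # job starts; its stdout - one path per line - is the file list
      File = "|/etc/bacula/scripts/list-backup-paths.sh"
    }
  }

- which gets me dynamic contents, but still all inside one job.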

We also have a few Windows servers in a different hemisphere, and I have the same kind of problem: I'd love to just back up "C:", but we can't keep the session up long enough to get through it.  I know 3.x brings Accurate backups, which I presume might allow us to restart a job, but over a long Internet link this is still going to be problematic.

So the question here is: is there a better way to plan for the likely inability to back up a large-ish filesystem in one job, without resorting to enumerating the first n levels of directories and breaking the task into multiple jobs?  I started writing a script to scan a filesystem and emit the directories needed to break the backup into a certain number of pieces, but of course we only control the top level, and users are going to want to add things that we'll need to back up incrementally.
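What I have so far amounts to something like this - GNU du assumed, and it falls over on paths with spaces, so call it a sketch:

  #!/bin/sh
  # split-tree.sh ROOT N: list the top-level directories of ROOT,
  # biggest first, and deal them round-robin into N list files
  ROOT=$1; N=$2
  rm -f /tmp/piece.*
  du -s -- "$ROOT"/*/ | sort -rn | \
    awk -v n="$N" '{ print $2 > ("/tmp/piece." (NR-1) % n) }'

Each /tmp/piece.* file could then feed its own FileSet via File = "</tmp/piece.0", and so on.  But that only balances what exists today.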

Or, again, is there something I'm missing?

I'm happy to discuss things off-list if that would be easier.  Many thanks!

_KMP