Re: [Bacula-users] Large backup to tape?

2012-03-08 15:34:49
From: Steve Ellis <ellis AT brouhaha DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Thu, 08 Mar 2012 12:19:38 -0800
On 3/8/12 9:38 AM, Erich Weiler wrote:
> Thanks for the suggestions!
>
> We have a couple more questions that I hope have easy answers.  So, it's
> been strongly suggested by several folks now that we back up our 200TB
> of data in smaller chunks.  This is our structure:
>
> We have our 200TB in one directory.  From there we have about 10,000
> subdirectories that each have two files in them, ranging in size between
> 50GB and 300GB (an estimate).  All of those 10,000 directories add up
> to about 200TB.  It will grow to 3 or so petabytes in size over the next
> few years.
>
> Does anyone have an idea of how to break that up logically within
> bacula, such that we could just do a bunch of smaller "Full" backups of
> smaller chunks of the data?  The data will never change, and will just
> be added to.  As in, we will be adding more subdirectories with 2 files
> in them to the main directory, but will never delete or change any of
> the old data.
>
> Is there a way to tell bacula to "back up all this, but do it in small
> 6TB chunks" or something?  So we would avoid the massive 200TB single
> backup job + hundreds of (eventual) small incrementals?  Or some other idea?
>
> Thanks again for all the feedback!  Please "reply-all" to this email
> when replying.
>
> -erich
Assuming the subdirectory names are spread reasonably evenly across the
alphabet, could you do something like:
FileSet {
     Name = "A"
     Include {
         File = /pathname/to/backup
         Options {
             Wild="[Aa]*"
         }
     }
}
...
FileSet {
     Name = "Z"
     Include {
         File = /pathname/to/backup
         Options {
             Wild="[Zz]*"
         }
     }
}

Then define a separate Job for each FileSet.  To break things up further,
you might need to split on the second or later characters rather than
just the first, and you'd also need FileSets for any directories whose
names start with non-alphabetic characters.  Making sure every directory
is covered could be somewhat annoying, especially if the namespace is
populated very lopsidedly, but I believe it would work.  Note that I have
not tried this approach, but it does seem feasible.
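
A slightly fuller sketch of one such FileSet follows, and it is equally
untested.  It rests on two assumptions about how Bacula matches: that
wild-card patterns are compared against the full path (so the pattern
carries the parent directory), and that anything not matched by an
Options block is still included by default (so a final Options block
with Exclude = yes is needed to drop the other subdirectories).  The
name "Backup-A" and the path /pathname/to/backup are placeholders.

```conf
FileSet {
    Name = "Backup-A"              # placeholder name, one FileSet per letter
    Include {
        # Select only subdirectories whose names start with A or a.
        # Assumption: the pattern must include the full parent path.
        Options {
            WildDir = "/pathname/to/backup/[Aa]*"
        }
        # Assumption: unmatched directories are included by default,
        # so exclude every directory not selected above.
        Options {
            RegexDir = ".*"
            Exclude = yes
        }
        File = /pathname/to/backup
    }
}
```

Before trusting a layout like this, it would be worth running the
console's "estimate" command with the "listing" keyword against each Job
to see which directories it actually selects, and checking that the
lists taken together cover the whole tree.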

I hope you are using a filesystem that behaves well with so many 
subdirectories from one parent (for example, ext3 without dir_index 
would likely do somewhat poorly).

-se



