Re: [Bacula-users] Large backup to tape?

2012-03-08 12:40:34
Subject: Re: [Bacula-users] Large backup to tape?
From: Erich Weiler <weiler AT soe.ucsc DOT edu>
To: bacula-users AT lists.sourceforge DOT net
Date: Thu, 08 Mar 2012 09:38:33 -0800
Thanks for the suggestions!

We have a couple more questions that I hope have easy answers.  So, it's 
been strongly suggested by several folks now that we back up our 200TB 
of data in smaller chunks.  This is our structure:

We have our 200TB in one directory.  Under it we have about 10,000
subdirectories, each containing two files that range in size between
50GB and 300GB (an estimate).  Those 10,000 directories add up to
about 200TB.  The total will grow to 3 or so petabytes over the next
few years.

Does anyone have an idea of how to break that up logically within 
bacula, such that we could just do a bunch of smaller "Full" backups of 
smaller chunks of the data?  The data will never change, and will just 
be added to.  As in, we will be adding more subdirectories with 2 files 
in them to the main directory, but will never delete or change any of 
the old data.

Is there a way to tell bacula to "back up all this, but do it in small 
6TB chunks" or something?  So we would avoid the massive 200TB single 
backup job + hundreds of (eventual) small incrementals?  Or some other idea?
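
To make the "6TB chunks" idea concrete, here is a minimal sketch of one possible approach, done outside Bacula: group the subdirectories into batches of roughly 6TB and write one file list per batch. As far as I know a FileSet can read its include list from a file with File = "</path/to/list", so each list could feed its own FileSet and Job. The function names, the chunk-NNNN.list naming and the 6TB limit are all hypothetical, not anything Bacula provides.

```python
import os
from pathlib import Path

TB = 10**12  # decimal terabyte, in bytes


def dir_size(path):
    """Total size in bytes of the regular files directly inside path."""
    return sum(e.stat().st_size for e in os.scandir(path) if e.is_file())


def chunk_dirs(sizes, limit):
    """Group (name, size) pairs, in order, into batches whose total does
    not exceed limit. A single entry larger than limit gets its own batch."""
    batches, current, total = [], [], 0
    for name, size in sizes:
        if current and total + size > limit:
            batches.append(current)
            current, total = [], 0
        current.append(name)
        total += size
    if current:
        batches.append(current)
    return batches


def write_lists(root, out_dir, limit=6 * TB):
    """Write one file list per batch (hypothetical naming) and return the paths."""
    subdirs = sorted(p for p in Path(root).iterdir() if p.is_dir())
    sizes = [(str(p), dir_size(p)) for p in subdirs]
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, batch in enumerate(chunk_dirs(sizes, limit), start=1):
        list_path = out / f"chunk-{i:04d}.list"
        list_path.write_text("\n".join(batch) + "\n")
        paths.append(list_path)
    return paths
```

Batches are filled in sorted name order, so lists that were already written stay unchanged only if new subdirectories sort after the existing ones; sorting by creation time instead would be another option.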

Thanks again for all the feedback!  Please "reply-all" to this email 
when replying.

-erich

On 3/1/12 10:18 AM, mark.bergman AT uphs.upenn DOT edu wrote:
> In the message dated: Wed, 29 Feb 2012 20:23:14 PST,
> The pithy ruminations from Erich Weiler on
> <[Bacula-users] Large backup to tape?>  were:
> =>  Hey Y'all,
> =>
> =>  So I have a Dell ML6010 tape library that holds 41 LTO-5 tapes, all
>
> I've got a Dell ML6010, so I can offer some specific suggestions.
>
>       [SNIP!]
>
> =>
> =>  The fileset I'm backing up is about 200TB large total (each file is
> =>  about 300GB big).  So, not only will it use every tape in the tape
> =>  library (41 tapes), but we'll have to refill the tape library about 6
> =>  times to get the whole thing backed up.  After that I want to just do
>
> I agree with the other suggestions to break up the dataset into smaller
> chunks.
>
>
>       [SNIP!]
>
> =>
> =>  So, I guess I have a couple of basic questions.  When it uses all the tapes
> =>  in the library in a single job (200TB! 41 tapes only hold 60TB), will it
>
> It'll depend a lot on the compressibility of your data.
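
As a rough illustration of how much compressibility matters, the sketch below estimates tape counts and library loads for 200TB. It assumes the LTO-5 native capacity of 1.5TB per cartridge, the 41 slots mentioned above, and that every tape is filled completely, which real jobs rarely manage. The compression ratios are made-up sample values, not measurements of this data.

```python
import math

LTO5_NATIVE_TB = 1.5  # native (uncompressed) capacity of one LTO-5 cartridge
SLOTS = 41            # tapes the library holds, per the original message


def tapes_needed(data_tb, compression_ratio=1.0):
    """Tapes required if each tape holds native capacity times the ratio."""
    return math.ceil(data_tb / (LTO5_NATIVE_TB * compression_ratio))


def library_loads(data_tb, compression_ratio=1.0, slots=SLOTS):
    """Number of times the library must be filled with fresh tapes."""
    return math.ceil(tapes_needed(data_tb, compression_ratio) / slots)


# Sample ratios only: 1.0 means the data does not compress at all.
for ratio in (1.0, 1.5, 2.0):
    print(ratio, tapes_needed(200, ratio), library_loads(200, ratio))
```

With no compression this works out to 134 tapes, or 4 full loads of the library; at 2:1 it drops to 67 tapes and 2 loads.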
>
> =>  simply pause, send me an email saying it's waiting for new media, then I
> =>  load 41 new tapes?  Then tell it to resume, and it uses the next 41, ad
> =>  nauseam?
>
> Yes, sort of.
>
> You'll get lots of mail from bacula about needing to change tapes.
>
> In my experience, changing tapes in the library while a backup is running must
> be done very carefully. I suggest that you not use the native ML6010 tools
> (touch pad on the library or web interface) to move tapes to-and-from the
> mailbox slots. Our procedure is:
>
>       use mtx to transfer full tapes from library slots to the mailbox slots
>
>       remove the full tapes from the mailbox slots
>
>       add new tapes to the mailbox slots
>
>       allow the library to scan the new tapes, then choose to add them
>       to "partition 1" (or whatever you have named your non-system partition
>       within the library)
>
>       use mtx to transfer the new tapes from the mailbox slots to available
>       slots in the library
>
>       when complete, run "update slots" from within the Bacula 'bconsole'
>
>       if the tapes have never been used within Bacula before, run "label
>       barcodes" from within 'bconsole'
>       
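
The tape-swap procedure above could be scripted roughly as follows. This is only a sketch that builds the commands without running them: the changer device path and all slot numbers are hypothetical and must come from your own "mtx status" output, and the manual steps in the middle (pulling tapes from the mailbox, letting the library scan and assign the new ones) still happen by hand.

```python
def transfer_commands(changer, sources, destinations):
    """Build 'mtx -f <changer> transfer <src> <dst>' commands, one per tape."""
    if len(sources) > len(destinations):
        raise ValueError("not enough destination slots for this batch")
    return [
        ["mtx", "-f", changer, "transfer", str(src), str(dst)]
        for src, dst in zip(sources, destinations)
    ]


def bconsole_commands(never_labeled):
    """Commands to type into bconsole once the new tapes are in place."""
    commands = ["update slots"]
    if never_labeled:
        # Only for tapes Bacula has never written a label to.
        commands.append("label barcodes")
    return commands


if __name__ == "__main__":
    changer = "/dev/sg3"    # hypothetical changer device
    mailbox = [42, 43, 44]  # hypothetical mailbox (import/export) slots
    # Phase 1: move full tapes out to the mailbox slots.
    for cmd in transfer_commands(changer, [1, 2, 3], mailbox):
        print(" ".join(cmd))
    # ... swap tapes by hand, let the library scan and assign them ...
    # Phase 2: move the new tapes from the mailbox into free slots.
    for cmd in transfer_commands(changer, mailbox, [1, 2, 3]):
        print(" ".join(cmd))
    for line in bconsole_commands(never_labeled=True):
        print("bconsole>", line)
```

Printing the commands first, rather than executing them, leaves room to check the slot numbers against the library before anything moves.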
> =>
> =>  And, if I want to make 2 copies of the tapes, can I simply configure 2
> =>  differently named jobs that each backup the same fileset?
> =>
> =>  Also, do I need to manually "label" the tapes (electronically) as I load
> =>  them, or will the fact that the autoloader automatically reads the new
> =>  barcodes be enough?
>
> You will need to logically label the tapes (writing a Bacula header to each
> tape). This can be done automatically with "label barcodes".
> =>
> =>  Thanks for any hints.  And, if you know any "gotchas" I should watch for
> =>  during this process, please let me know!  I don't want bacula expiring
> =>  the tapes ever, or re-using them, as the data will never change and we
> =>  need to keep it forever.
>
> Set the file/volume/job retention times to something really long. For us, "10
> years" =~ "infinite", under the theory that after 10 years we'll have moved to
> different tape hardware and the old data will need to be transferred to the
> new media somehow.
>
> Make a backup of the Bacula database as soon as the backup is complete. Save
> that to both a backup tape and to some other media (external hard drive?
> multiple Blu-ray discs? punch cards?) so that you can recover data if there's
> ever a problem with the database--you do NOT want to be in a position of
> needing to "bscan" ~100x LTO5 tapes in order to rebuild the database.
>
> Mark
>
> =>
> =>  Many thanks,
> =>  erich
> =>

_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users