Subject: Re: [Bacula-users] Performance settings for large file LTO-6 backup
From: Andrew Noonan <anoonan AT gmail DOT com>
To: "Bacula-users AT lists.sourceforge DOT net" <bacula-users AT lists.sourceforge DOT net>
Date: Sat, 27 Jun 2015 00:37:27 -0500
On Fri, Jun 26, 2015 at 2:17 PM, Ana Emília M. Arruda
<emiliaarruda AT gmail DOT com> wrote:
> Hello Andrew,
>
> On Fri, Jun 19, 2015 at 5:10 PM, Andrew Noonan <anoonan AT gmail DOT com> 
> wrote:
>>
>> Hi all,
>>
>>      After wrestling with a Dell TL4000 in the thread marked "Dell
>> TL4000 labeling timeout", it looks like the autochanger is going to be
>> fine thanks to the efforts of several people, especially Ana, on this
>> list.
>
>
> Thank you :)
>
>>
>>      Moving forward, I'm about to start running jobs to at first
>> backfill a large dataset (about 250TB), and then do daily backups of
>> the dataset.  The dataset itself is tens of millions of small,
>> compressed files, so I don't particularly want to back the raw files
>> up in bacula, as the database would likely become quite unhappy with
>> me, so instead I've got a staging directory where I tar up a
>> time-sequence of the files, and then I'll use bacula to back up that
>> file, which is named with the time sequence contained inside.  These
>> tapes are to be archived offsite indefinitely.
>>
>
> Are you going to generate a .tar of about 250TB every day? What will be the
> nature of your restores? Will you always need to restore the whole data
> set, or will you only occasionally need to restore a small set of
> files?

After the backfill, I expect the daily backups to be in the range of
hundreds of gigabytes a day.  During the backfill, I'll want to maximize
write throughput to catch up as soon as I can.  In general, we already
have a copy of the data, but given how it syncs, this doesn't protect
against an accidental "rm" getting synced downstream, so I'd imagine
that, other than for testing purposes, restores will be for accidental
deletes or for some sort of massive disaster situation.  This is why I
figure the backfill tars should be close to the size of a tape, though
I'm not 100% sure.
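
As a rough sketch of that sizing idea (not something Bacula does for
you): the snippet below greedily packs time-ordered batches into tars
that stay just under one tape.  It assumes LTO-6's native capacity of
2.5TB, since the files are already compressed and drive compression is
unlikely to add anything.  The helper name plan_chunks, the 5% headroom
for labels and block overhead, and the batch names and sizes are all
hypothetical.

```python
from typing import List, Tuple

# LTO-6 native (uncompressed) capacity in decimal bytes.
TAPE_CAPACITY_BYTES = 2_500_000_000_000


def plan_chunks(
    batches: List[Tuple[str, int]],
    capacity: int = TAPE_CAPACITY_BYTES,
    headroom: float = 0.05,  # assumed safety margin, not a Bacula setting
) -> List[List[str]]:
    """Group time-ordered (name, size_in_bytes) batches into tar-sized chunks.

    A new chunk is started whenever adding the next batch would exceed the
    usable capacity. A single batch larger than the limit still gets its
    own chunk (Bacula would span it across tapes).
    """
    limit = int(capacity * (1 - headroom))
    chunks: List[List[str]] = []
    current: List[str] = []
    used = 0
    for name, size in batches:
        if current and used + size > limit:
            chunks.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    # Made-up example: three 1TB time-sequence batches.
    example = [("seq-001", 10**12), ("seq-002", 10**12), ("seq-003", 10**12)]
    for i, chunk in enumerate(plan_chunks(example), start=1):
        print(f"tar {i}: {', '.join(chunk)}")
```

Keeping the batches in time order means each tar still maps to one
contiguous time sequence, which matches the naming scheme described
above.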
>
>> My questions are this:
>>
>> 1)  For the backfill, should I shoot for creating single files about
>> 2.5TB in size to completely fill the tapes?
>
>
> If you occasionally need to restore a small set of files, then this is a
> good idea: it avoids having a giant .tar spanned across lots of tapes. With
> this approach you will need to restore the whole .tar first and then
> extract the files.
>
>>
>> 2)  If I make a tar larger than a tape's storage capacity (LTO-6),
>> will bacula automatically span tapes?
>
>
> Yes.
>
>>
>> 3)  Given the size of the tars, the serial nature of the backup, and
>> the dedicated nature of the autochanger (this is its only purpose),
>> are there any tuning parameters I can use to speed up the tape writes,
>> given that the workload is a few giant files?
>
>
> Yes, there are a few parameters that can be configured to speed up tape
> writes and backup jobs. You can take a look at the btape utility
> (http://www.bacula.org/7.0.x-manuals/en/utility/utility.pdf) to test
> your autochanger configuration.
>
>
I looked through the document.  Are you talking about the btape "speed"
command?  Is that a 7.0-only command?  I'm running 5.2 right now.  As
an additional bit of information, I backed up the first chunk of data
(about 1.165TB), which took a little over 5 hours.  Bacula reported a
rate of 61MB/s, which seems a little slow.  IOWait on the system
never got above about 3-4%, and while I did see some spikes in system
time, I suspect those are from apps on the system.
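
To put that rate in context, here is a back-of-the-envelope sketch of
the arithmetic.  It assumes decimal units (1MB = 10^6 bytes), a single
drive writing continuously with no tape-change or job overhead, and
LTO-6's native (uncompressed) drive rate of 160MB/s as the ceiling.  The
function names are just for illustration.

```python
# Measured rate reported by Bacula for the first chunk, and the
# LTO-6 native (uncompressed) drive rate, both in MB/s (decimal).
MEASURED_MB_PER_S = 61
LTO6_NATIVE_MB_PER_S = 160

FIRST_CHUNK_BYTES = 1.165e12  # about 1.165TB
BACKFILL_BYTES = 250e12       # about 250TB


def hours_to_write(size_bytes: float, rate_mb_per_s: float) -> float:
    """Hours needed to write size_bytes at a sustained rate in MB/s."""
    return size_bytes / (rate_mb_per_s * 1_000_000) / 3600


def days_to_write(size_bytes: float, rate_mb_per_s: float) -> float:
    """Days needed to write size_bytes at a sustained rate in MB/s."""
    return hours_to_write(size_bytes, rate_mb_per_s) / 24


if __name__ == "__main__":
    print(f"first chunk at 61MB/s: "
          f"{hours_to_write(FIRST_CHUNK_BYTES, MEASURED_MB_PER_S):.1f} hours")
    print(f"backfill at 61MB/s:    "
          f"{days_to_write(BACKFILL_BYTES, MEASURED_MB_PER_S):.0f} days")
    print(f"backfill at 160MB/s:   "
          f"{days_to_write(BACKFILL_BYTES, LTO6_NATIVE_MB_PER_S):.0f} days")
```

The first figure comes out at about 5.3 hours, which matches the job
above.  At the same rate the 250TB backfill would take roughly 47 days
of continuous writing, versus roughly 18 days if the drive could be
kept at its native rate.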

>>
>> 4)  Is there anything about this that seems like a terrible, terrible
>> idea?
>>
>> Thanks,
>> Andrew
>
>
> You're welcome. Regards,
> Ana
>
>
>>
>>
>>
>> ------------------------------------------------------------------------------
>> _______________________________________________
>> Bacula-users mailing list
>> Bacula-users AT lists.sourceforge DOT net
>> https://lists.sourceforge.net/lists/listinfo/bacula-users
>
>
