Re: [Bacula-users] Performance

2011-07-26 09:34:42
Subject: Re: [Bacula-users] Performance
From: Steve Ellis <ellis AT brouhaha DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Tue, 26 Jul 2011 06:18:25 -0700
On 7/26/2011 5:04 AM, Konstantin Khomoutov wrote:
> On Tue, 26 Jul 2011 00:18:05 -0700
> Steve Ellis <ellis AT brouhaha DOT com> wrote:
>
> [...]
>> Another point, even with your current config, if you
>> aren't doing data spooling you are probably slowing things down
>> further, as well as wearing out both the tapes and heads on the drive
>> with lots of shoeshining.
> (I'm asking as a person having almost zero prior experience with tape
> drives for backup purposes.)
>
> Among other things, I'm doing full backups of a set of machines to a
> single tape--yes, full backup each time, no incremental/differential
> which means I supposedly have just straightforward data flows from
> FDs to the SD.  At present time I have max concurrent jobs set to 1
> on my tape drive resource and no data spooling turned on.
> Would I benefit from enabling data spooling in this scenario?
>
> To present some numbers, each machine's data is about 50-80G and I can
> use about 200G for the spool directory which means I could do spooling
> for 3-4 jobs in parallel (as described in [1]).
> Would that improve tape usage pattern?
>
> 1. http://www.bacula.org/en/dev-manual/main/main/Data_Spooling.html
>
>
OK, perhaps I'm not the best person to ask, but here's what I do know:

Even with only one job at a time, if you can't deliver data to the
drive at its minimum streaming rate (for LTO4, probably at least
40MB/sec, though this may vary by manufacturer), the tape mechanism
has to stop, back up a bit, wait for more data, and then start again.
All of that takes time and increases wear on the tapes and the drive
heads.

If you enable data spooling when you can't keep up with the drive
anyway, then even with a fairly modest spool size of 10-20G per job I
believe your backups will at least be no slower, and may well run
faster, despite the overhead of spooling. This assumes your spool
disk(s) can feed the drive at close to the maximum rate it can accept.

If you are using concurrent jobs, there is a further benefit: the data
for the different jobs won't be completely shuffled together on the
tape. If I recall correctly, data spooling in Bacula implicitly turns
on attribute spooling, which I believe can also help if your backup
contains lots of small files.
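
To make the streaming threshold concrete, here is a rough sketch in
Python. The 40MB/sec floor is the guess given above for LTO4; the
function names and the 25 and 80 MB/sec figures are made up for
illustration and are not measurements or anything Bacula provides.

```python
def tape_feed_rate(client_mb_s, spool_read_mb_s=None):
    """Rate the tape drive is fed at (MB/sec).

    Without spooling the drive sees whatever the client delivers.
    With spooling it sees the spool disk's read rate while despooling.
    """
    return client_mb_s if spool_read_mb_s is None else spool_read_mb_s


def drive_streams(feed_mb_s, min_stream_mb_s=40.0):
    """True if the feed rate meets the drive's minimum streaming rate.

    The 40.0 default is the rough LTO4 guess from the post; check the
    figure for your own drive.
    """
    return feed_mb_s >= min_stream_mb_s


# Hypothetical numbers: a client delivering 25 MB/sec, and a spool
# disk that can be read at 80 MB/sec.
direct = drive_streams(tape_feed_rate(25.0))
spooled = drive_streams(tape_feed_rate(25.0, spool_read_mb_s=80.0))
print(direct, spooled)
```

With these made-up figures the direct feed falls below the floor (so
the drive would stop and reposition), while the despool feed stays
above it.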

You don't have to spool an entire job in order to take advantage of
spooling. With multiple concurrent jobs, one job can be despooling
while the others are spooling, though you have to watch whether your
spool area can keep up with all the simultaneous writes and reads.
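
As a back-of-the-envelope sketch of the two points above (partial
spooling, and the load on the spool area), the arithmetic might look
like this in Python. The job sizes and the 10-20G per-job spool size
come from this thread; the function names and the MB/sec rates are
hypothetical, and the load model (despool reads plus concurrent spool
writes simply added together) is a simplifying assumption.

```python
import math


def spool_cycles(job_gb, job_spool_gb):
    """Number of spool/despool rounds needed when the per-job spool
    size is smaller than the job itself."""
    return math.ceil(job_gb / job_spool_gb)


def spool_disk_load(despool_mb_s, incoming_mb_s):
    """Rough combined load on the spool disk (MB/sec) while one job
    despools to tape and the other jobs keep spooling.

    Assumes reads and writes simply add up, which ignores seek
    overhead on a single spinning disk.
    """
    return despool_mb_s + sum(incoming_mb_s)


# An 80G job with a 20G per-job spool size.
print(spool_cycles(80, 20))

# Hypothetical rates: despooling at 80 MB/sec while two other jobs
# each write 25 MB/sec into the spool area.
print(spool_disk_load(80.0, [25.0, 25.0]))
```

If the combined figure is more than the spool disk can sustain, the
despool rate drops and the tape drive may no longer be kept streaming.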

I'm still on LTO3, but I believe some people advocate RAID0 for the
spool disks with LTO4.  I'm using a single, otherwise completely idle
disk for spooling 3 concurrent jobs, and as far as I've noticed I'm
able to stream data to the tape drive at a rate it is happy with
(again, this is LTO3).

I hope this helps,

-se

