Bacula-users

Re: [Bacula-users] Device

2011-06-29 17:08:48
Subject: Re: [Bacula-users] Device
From: Josh Fisher <jfisher AT pvct DOT com>
To: Mike Hobbs <mhobbs AT mtl.mit DOT edu>
Date: Wed, 29 Jun 2011 17:05:22 -0400
On 6/29/2011 11:38 AM, Mike Hobbs wrote:
> On 06/28/2011 04:38 PM, Josh Fisher wrote:
>> It isn't necessary. jbod1-drive-1, jbod1-drive-2, etc. are virtual
>> drives. The ArchiveDevice path in each virtual drive contains a symlink.
>> The symlink is created by vchanger to point to the folder containing the
>> volume file that is to be used. Any of the virtual drives may be
>> "loaded" with a volume from any of the "magazines" (ie physical drives).
>
> Ah, ok... so all these months I have been looking at this the wrong 
> way!  I didn't realize bacula was telling me what "virtual drive" it 
> was using, I thought it was telling me what drive bacula "thinks" it 
> was writing to!  Thank you for clearing that up for me!  So, if I 
> understand what you are saying, I  should leave the "Autochanger" 
> config in my bacula-sd file, and I can remove the 16 "Device" entries 
> in my SD file?

Well, not all of them. You need to define at least one, and you probably 
want several. How many depends on the level of concurrency you use, which 
in turn depends on the maximum write throughput of the JBOD device, the 
maximum throughput of the network connecting the clients, etc. In other 
words, the question is how many jobs you can run concurrently before 
hitting the throughput bottleneck, whatever that bottleneck may be. It 
will require some testing to figure that out.
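
As a rough sketch of what a trimmed-down bacula-sd.conf could look like 
with three virtual drives instead of 16: the changer name "jbod1", the 
vchanger path, the vchanger config file and the ArchiveDevice paths are 
all assumptions made up for illustration, so substitute whatever your 
existing config already uses.

```
# bacula-sd.conf (fragment): one autochanger, three virtual drives.
# All names and paths below are illustrative assumptions.
Autochanger {
  Name = jbod1
  Device = jbod1-drive-1, jbod1-drive-2, jbod1-drive-3
  Changer Command = "/usr/local/bin/vchanger %c %o %S %a %d"
  Changer Device = /etc/bacula/jbod1.conf    # vchanger config file (assumed path)
}

Device {
  Name = jbod1-drive-1
  Drive Index = 0
  Autochanger = yes
  Device Type = File
  Media Type = File
  Archive Device = /var/lib/bacula/jbod1/drive0   # vchanger places its symlink here (assumed path)
  Removable Media = no
  Random Access = yes
}

# jbod1-drive-2 and jbod1-drive-3 are defined the same way, with
# Drive Index = 1 and 2 and their own Archive Device paths.
```

The number of Device resources is the upper limit on how many volumes 
can be written at the same time, so start small and add drives as your 
testing shows there is throughput to spare.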

>> It is possible for all 16 of your virtual drives to be simultaneously
>> writing to different volume files on the same physical drive.
>
>> What this is telling you is that you are not running jobs
>> concurrently, so they are all using SD device jbod1-drive-1.
>
> I most definitely want to run jobs concurrently!  Once in production I 
> will be backing up a 200-300 machines!  I thought I had configured 
> bacula correctly for this as I am pretty sure I saw multiple jobs 
> running at the same time.. Maybe they were, but they were using the 
> same "virtual disk"?  How do I get bacula to write data using more 
> than one virtual drive?  I have all my config files set to 20 for 
> concurrent jobs, but evidently the jobs are all using the same VD and 
> not multiple VD's.  What am I missing?

If two concurrent jobs both select the same volume, they will both write 
to the same volume simultaneously, interleaving their data records. 
Since a volume can only be loaded in one drive, they will both 
necessarily use the same virtual drive. Interleaved volumes cause slower 
restores, since Bacula has to skip around to find the records for the 
job being restored. Though that is less of a problem on disk than on 
tape, I still prefer non-interleaved volumes. To get them, set 
MaximumConcurrentJobs=1 in each of the "Device" definitions in 
bacula-sd.conf, so that only one job at a time can write to the volume 
loaded in that drive. The catch is that two jobs that select the same 
volume can then no longer run concurrently.
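
In a Device definition that would look something like the following. 
The device name comes from your job output; the ArchiveDevice path is 
only a placeholder I made up, and the other directives stay as you 
already have them.

```
# bacula-sd.conf (fragment): limit this virtual drive to one job at a
# time, so the volume loaded in it is never interleaved.
Device {
  Name = jbod1-drive-1
  Drive Index = 0
  Autochanger = yes
  Device Type = File
  Media Type = File
  Archive Device = /var/lib/bacula/jbod1/drive0   # assumed path
  Maximum Concurrent Jobs = 1
}
```

The same directive needs to be repeated in every Device definition that 
belongs to the autochanger.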

By default, Bacula will select a volume that is already in a drive in 
preference to a volume not in a drive. For concurrent jobs writing to 
the same pool, this means they will always select the same volume. Thus 
if you set MaximumConcurrentJobs=1 in the SD Device, then it will not be 
possible to run concurrent jobs that write to the same pool, because the 
concurrent jobs will select the same volume, which can only be written 
to by one job at a time, forcing them to be serialized. To get around 
the default behavior, set PreferMountedVolumes=no in the Job definition 
of the jobs that will both run concurrently AND write to the same pool. 
This reverses the preference: Bacula will prefer a volume that is NOT 
already loaded in a drive, which in practice means a volume that no 
other job is writing to, and will load it into another drive if 
necessary. This way, jobs writing to the same pool can run concurrently, 
each writing to a different volume, so that volume data is not 
interleaved.
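
On the Director side, a sketch of such a Job definition might be as 
follows. The job, client, pool and storage names are hypothetical, and 
the remaining directives (Type, FileSet, Schedule, Messages and so on) 
are left out.

```
# bacula-dir.conf (fragment): a job that runs concurrently with other
# jobs writing to the same pool. All names here are hypothetical.
Job {
  Name = "backup-host1"
  Client = host1-fd
  Pool = jbod1-pool
  Storage = jbod1
  Prefer Mounted Volumes = no   # pick a volume that is not already in a drive
  # Type, FileSet, Schedule, Messages, etc. omitted
}
```

Each job that shares the pool and is meant to run concurrently needs 
the same setting, either directly or through a JobDefs resource.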


>
> Thank you for your help and a great explanation of what's going on!
>
> mike
