Subject: Re: [Bacula-users] Concurrent Backups with a Virtual Autochanger
From: "Brady, Mike" <mike.brady AT devnull.net DOT nz>
To: bacula-users AT lists.sourceforge DOT net
Date: Sat, 15 Nov 2014 22:19:05 +1300
On 2014-11-15 20:45, Kern Sibbald wrote:
> Unfortunately, I no longer have the time to help people debug usage
> problems, which I actually enjoy doing, but the other users on this
> Bacula email list should be able to help you resolve this.  I do have a
> few remarks:
> 
Thanks for taking the time.

> 1. There is no reason to set the VolumePollInterval to anything other
> than the default, especially for disk volumes.  Doing so could cause the
> SD to consume a lot of unnecessary CPU time if there is an outstanding
> mount message.
> 
Understood.  It was in one of the white papers, so I was trying it to see 
if it made any difference.
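For reference, the line I had added to each Device resource (shown in the 
config below) is:

    VolumePollInterval = 5s

I will drop it and go back to the default.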

> 2. Forcing Bacula to use only one concurrent job per drive is not 
> making
> best use of your hardware.  In general, unless you have special needs
> (security, tight restore SLAs,...), it is *far* better from operational
> and performance standpoints to let Bacula write multiple simultaneous
> jobs on a single drive (perhaps 5-10).  This will permit the OS to
> reduce disk head movement.
> 
Understood.  This was for testing.  In production I was intending to start 
out at 4-5 and go from there.
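If I have the syntax right, that should just mean changing

    Maximum Concurrent Jobs = 1

to something like

    Maximum Concurrent Jobs = 5

in each of the FileChgr1-Dev* Device resources below.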

> Unfortunately neither of the above points will resolve the problem you
> are seeing.
But good to know nonetheless.

> 
> While you have provided some important input, what is missing is the job
> log output, including the mount message, and a listing of the volumes
> known to the catalog DB (list volumes).
> 
OK.  I didn't think of sending those.  I still have the job log messages, 
but the catalog is no longer in the same state.
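When I rerun the tests I will capture both at the time, i.e. (assuming I 
have the command syntax right) something like this from bconsole:

    *list volumes pool=IncPool
    *list joblog jobid=<id of the waiting job>

and post the output together with the job log.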

> To reassure you, although I have not explicitly tried or tested writing
> to multiple disk drives using some of the techniques recently discussed
> on this list, I am 100% sure that the Virtual Autochanger code does 
> work
> exceptionally well with a single disk drive as is your case, providing
> you really have labeled volumes that are available.
> 
OK, that is good to know.  I was wondering if I was attempting to do 
something strange; it wouldn't be the first time.  But it sounds like it is 
worth persevering with.

Does this mean that automatic volume creation/labelling doesn't work in 
this type of setup, or does that come under "volumes that are available", 
provided Maximum Volumes has not been reached?
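For context, my assumption was based on these two directives in the pool 
below:

    Label Format = "IncPool-"
    Maximum Volumes = 50

i.e. that the SD can create and label a new volume on demand until that 
limit is hit.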

> Note, by turning on a debug level of probably 100 to 150 in the
> Director, and possibly also in the SD, you should be able to "see" more
> clearly why the Director/SD cannot find any volumes that are ready to 
> use.
> 
I hadn't come across the debug stuff before.  I will rerun my tests 
tomorrow with debug on.
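I am assuming I can either set it at runtime from bconsole, e.g.

    *setdebug level=150 dir
    *setdebug level=150 storage=FileStorage01

or start the daemons with something like "-d 150", and then watch the 
output when the second job asks for a volume.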

Thanks again.

Mike

> Best regards,
> Kern
> 
> 
> On 11/15/2014 12:17 AM, Brady, Mike wrote:
>> First of all thanks to Kern and Bacula Systems for making the "Best
>> Practices for Disk Based Backup" and "Disk Backup Design" documents
>> available.
>> 
>> I have been playing around with the best way for doing concurrent
>> backups for a while and these documents have helped my understanding
>> considerably.  Using a Virtual Autochanger in particular seems an
>> elegant way of doing what I would like to do.
>> 
>> However, I am seeing some behaviour in my testing that I did not 
>> expect
>> and I need some input.
>> 
>> At a high level what I am trying to do is use a Virtual Autochanger to
>> write to multiple volumes in the same pool concurrently.
>> 
>> At the moment I have two devices limited to one concurrent job each,
>> which, if I have understood things correctly, means that I should have
>> two jobs running concurrently, writing to separate volumes.  The
>> schedule below kicks off eight jobs simultaneously, with the number of
>> devices limiting concurrency.
>> 
>> The issue that I am having is that the first job gets FileChgr1-Dev1
>> and a volume, as expected.
>> 
>> The second job gets device FileChgr1-Dev2 as expected, but always says
>> "Cannot find any appendable volumes." and issues a mount request.  There
>> are multiple purged volumes with the recycle flag set available in the
>> IncPool pool.  Even if there weren't, the pool has Auto Labelling
>> configured and has not reached the Maximum Volumes limit, so there
>> should "always" be a volume available.
>> 
>> Other jobs continue to use FileChgr1-Dev1 as it becomes available
>> while FileChgr1-Dev2 is waiting for a volume.
>> 
>> The second job eventually retries on FileChgr1-Dev2, gets an available
>> volume and successfully completes without any operator intervention.
>> 
>> After this the remaining jobs utilise both FileChgr1-Dev1 and
>> FileChgr1-Dev2 as they become available as I expected.
>> 
>> Is this behaviour expected (I am assuming some sort of race condition 
>> at
>> the start of the schedule with multiple jobs trying to get a volume at
>> the same time) or am I trying to do something fundamentally wrong 
>> here?
>> 
>> My configuration is:
>> 
>> Pool {
>>    Name = IncPool
>>    Pool Type = Backup
>>    Volume Use Duration = 23 hours
>>    Recycle = yes
>>    Action On Purge = Truncate
>>    Auto Prune = yes
>>    Maximum Volumes = 50
>>    Volume Retention = 2 weeks
>>    Storage = FileStorage01
>>    Next Pool = "IncPoolCopy"
>>    Label Format = "IncPool-"
>> }
>> 
>> Storage {
>>    Name = FileStorage01
>>    Address = 192.168.42.45
>>    SDPort = 9103
>>    Password = ***************************
>>    Device = FileChgr1
>>    Media Type = File01
>>    Maximum Concurrent Jobs = 10
>>    Autochanger = yes
>> }
>> 
>> Autochanger {
>>    Name = FileChgr1
>>    Device = FileChgr1-Dev1, FileChgr1-Dev2
>>    Changer Command = /dev/null # For 7.0.0 and newer releases.
>>    # Changer Command = "" # For 5.2 and older releases.
>>    Changer Device = /dev/null
>> }
>> 
>> Device {
>>    Name = FileChgr1-Dev1
>>    Drive Index = 0
>>    Media Type = File01
>>    Archive Device = /bacula_storage/FileDevice
>>    LabelMedia = yes;
>>    Random Access = Yes;
>>    AutomaticMount = yes;
>>    RemovableMedia = no;
>>    AlwaysOpen = no;
>>    Maximum Concurrent Jobs = 1
>>    VolumePollInterval = 5s
>>    Autochanger = yes
>> }
>> 
>> Device {
>>    Name = FileChgr1-Dev2
>>    Drive Index = 1
>>    Media Type = File01
>>    Archive Device = /bacula_storage/FileDevice
>>    LabelMedia = yes;
>>    Random Access = Yes;
>>    AutomaticMount = yes;
>>    RemovableMedia = no;
>>    AlwaysOpen = no;
>>    Maximum Concurrent Jobs = 1
>>    VolumePollInterval = 5s
>>    Autochanger = yes
>> }
>> 
>> Schedule {
>>    Name = "DefaultBackupCycle"
>>    Run = Level=Full 1st sun at 00:10
>>    Run = Level=Differential 2nd-5th sun at 00:10
>>    Run = Level=Incremental mon-sat at 00:10
>> }
>> 
>> Thanks
>> 
>> Mike
>> 
