Re: [Bacula-users] Disk based backup using vchanger, volumes being marked as Error

2014-08-06 09:27:47
Subject: Re: [Bacula-users] Disk based backup using vchanger, volumes being marked as Error
From: Kern Sibbald <kern AT sibbald DOT com>
To: "Clark, Patricia A." <clarkpa AT ornl DOT gov>, "bacula-users AT lists.sourceforge DOT net" <bacula-users AT lists.sourceforge DOT net>
Date: Wed, 06 Aug 2014 15:22:32 +0200
Hello Patti,

Are you sure this is happening on 7.0.x?  I worked hard on this problem
between 5.2.x and 7.0.x, and I thought I had fixed the problem -- unless
there is some code path I missed.   It is possible that this is
happening only when there is no other drive that the 3rd job can use,
but if there are other drives available, Job 3 should select a different
volume.

Best regards,
Kern

On 08/06/2014 03:16 PM, Clark, Patricia A. wrote:
> Kern,
>
> These are exactly the problems that others and I have been reporting with
> autochanger and tape volumes.  Thank you, Josh, for the very descriptive
> details.
>
> One additional issue that these race conditions can create: if the first two
> jobs fill the volume that the third job is waiting for, the third job will fail
> when it finds that it has mounted a read-only volume.
>
> Patti Clark
> Linux System Administrator
> R&D Systems Support Oak Ridge National Laboratory
>
> From: Kern Sibbald <kern AT sibbald DOT com>
> Date: Wednesday, August 6, 2014 at 1:52 AM
> To: Josh Fisher <jfisher AT pvct DOT com>, "bacula-users AT lists.sourceforge DOT net" <bacula-users AT lists.sourceforge DOT net>
> Subject: Re: [Bacula-users] Disk based backup using vchanger, volumes being marked as Error
>
> On 08/04/2014 06:43 PM, Josh Fisher wrote:
>
>  ...
>
> Have you set PreferMountedVolumes=no in the Job resource in bacula-dir.conf? 
> If 3 jobs start and want to write to volumes in the same pool, then all three 
> can be assigned the same volume. In fact, if PreferMountedVolumes=yes (the
> default), then all three WILL be assigned the same volume unless the pool
> restricts the max number of jobs that the volume may contain. However, your 
> device (drive) restricts the max concurrent jobs to 2. Therefore one of those 
> three jobs will not be able to select the drive where the volume is mounted 
> and will be forced to select another unused drive. That third job will 
> nevertheless select the same volume as the other two and attempt to move the 
> volume from the drive it is in into the drive that it has been assigned to. 
> The configuration has a built-in race condition.
>
> This is the first time that I have heard this explained so clearly.  I am
> going to try to duplicate this problem now that you have so clearly explained 
> it.  By the way, I am not really sure I would classify this as a race 
> condition, because theoretically the SD is not blocked, the third job just 
> waits until the Volume is free (at least that is what I programmed).  
> However, this is clearly very inefficient.
>
> I would like to fix this, but one must keep in mind one important difficulty 
> with Bacula.  The SD knows what is going on with Volumes, but the Dir does 
> not, and it is the Dir that proposes Volumes to the SD.  Currently there is 
> no good atomic way to pass the information in the SD to the Dir so that it 
> can make better decisions.
>
> So, with the (current) constraint that the solution must involve changing only
> the SD algorithm, how could one prevent this from happening?  I have some
> ideas, but wonder what you think.
>
>
> Setting PreferMountedVolumes=no causes the three jobs to select a drive that 
> is NOT already mounted with a volume from the pool. This allows jobs writing 
> to the same pool to select different volumes from the pool, rather than all 
> selecting the same next available volume. This has its own caveats. It 
> doesn't necessarily prevent two jobs from selecting the same volume in some 
> cases, meaning that they will want to swap the volume back and forth between 
> drives, which is another type of race condition. I have used this method 
> successfully for a pool containing full backups only by setting 
> PreferMountedVolumes=no in the job resource and setting MaximumVolumeJobs=1 
> in the pool resource. Since Bacula selects the volume for a job in an atomic 
> manner, this forces an exclusive set of volumes for each job, thus preventing 
> the race condition. This means that concurrency is limited only by the number 
> of drives, but at the "expense" of creating a greater number of smaller 
> volume files. I quote "expense" because on a disk vchanger it isn't usually a 
> big issue to have more volume files. Doing this with a tape autochanger would 
> use a lot more tapes and be truly more expensive. Of course unlimited 
> concurrency is theoretical, since the hardware limits the USEFUL concurrency.
>
> I really do not like the PreferMountedVolumes = No option (I have probably 
> said this many times), but I find your use of it very well explained and very 
> interesting.
>
> Best regards,
> Kern
>
> ...
>
>
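
A minimal sketch of the workaround Josh describes in the quoted message (PreferMountedVolumes=no in the Job resource together with MaximumVolumeJobs=1 in the Pool resource), written as a fragment of bacula-dir.conf. The resource names ExampleFullBackup and ExampleFullPool are hypothetical, and the other directives each resource needs (Type, Client, FileSet, Storage, Messages and so on) are left out, so treat this as an illustration under those assumptions and not as a tested configuration. Bacula ignores case and spaces in directive keywords, so the spaced spellings below refer to the same directives that the thread writes without spaces.

```
# bacula-dir.conf (fragment; names are hypothetical, required directives omitted)

Job {
  Name = "ExampleFullBackup"        # hypothetical job name
  Pool = ExampleFullPool            # pool that holds full backups only
  Prefer Mounted Volumes = no       # select a drive that does not already have a pool volume mounted
}

Pool {
  Name = ExampleFullPool            # hypothetical pool name
  Pool Type = Backup
  Maximum Volume Jobs = 1           # each volume holds one job, so concurrent jobs get separate volumes
}
```

As noted in the thread, this trades a larger number of smaller volume files for concurrency that is limited only by the number of drives, which is usually acceptable on a disk vchanger but would consume many more tapes on a tape autochanger.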


