Bacula-users

Re: [Bacula-users] Disk based backup using vchanger, volumes being marked as Error

2014-08-06 09:19:52
From: "Clark, Patricia A." <clarkpa AT ornl DOT gov>
To: Kern Sibbald <kern AT sibbald DOT com>, "bacula-users AT lists.sourceforge DOT net" <bacula-users AT lists.sourceforge DOT net>
Date: Wed, 6 Aug 2014 13:16:44 +0000

Kern,

These are exactly the problems that others and I have been reporting with 
autochanger and tape volumes.  Thank you, Josh, for the very descriptive details.

These race conditions can create one additional issue: if the first two jobs 
fill the volume that the third job is waiting for, the third job will fail 
when it finds that it has mounted a read-only volume.

Patti Clark
Linux System Administrator
R&D Systems Support Oak Ridge National Laboratory

From: Kern Sibbald <kern AT sibbald DOT com>
Date: Wednesday, August 6, 2014 at 1:52 AM
To: Josh Fisher <jfisher AT pvct DOT com>, "bacula-users AT lists.sourceforge DOT net" <bacula-users AT lists.sourceforge DOT net>
Subject: Re: [Bacula-users] Disk based backup using vchanger, volumes being marked as Error

On 08/04/2014 06:43 PM, Josh Fisher wrote:

 ...

Have you set PreferMountedVolumes=no in the Job resource in bacula-dir.conf? If 
3 jobs start and want to write to volumes in the same pool, then all three can 
be assigned the same volume. In fact, if PreferMountedVolumes=yes, (the 
default), then all three WILL be assigned the same volume unless the pool 
restricts the max number of jobs that the volume may contain. However, your 
device (drive) restricts the max concurrent jobs to 2. Therefore one of those 
three jobs will not be able to select the drive where the volume is mounted and 
will be forced to select another unused drive. That third job will nevertheless 
select the same volume as the other two and attempt to move the volume from the 
drive it is in into the drive that it has been assigned to. The configuration 
has a built-in race condition.

This is the first time that I have heard this explained so clearly.  I am going 
to try to duplicate this problem now that you have so clearly explained it.  By 
the way, I am not really sure I would classify this as a race condition, 
because theoretically the SD is not blocked; the third job just waits until the 
Volume is free (at least that is what I programmed).  However, this is clearly 
very inefficient.
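
For readers following along, the scenario described above corresponds roughly 
to the configuration sketched below. This is only an illustration: the resource 
names are hypothetical, the original poster's actual files are not shown in 
this thread, and every directive not relevant to the discussion is omitted.

```
# bacula-sd.conf (fragment; hypothetical drive name, other directives omitted)
Device {
  Name = vchanger-drive-0
  Maximum Concurrent Jobs = 2     # at most two jobs may write to this drive at once
}

# bacula-dir.conf (fragment; hypothetical job and pool names)
Job {
  Name = "example-job-1"          # assume two more jobs writing to the same pool
  Pool = "ExamplePool"
  Prefer Mounted Volumes = yes    # the default, written out here for clarity
}
```

With three such jobs starting together, two can share the drive where the 
volume is mounted, while the third is given another drive but still wants that 
same volume.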

I would like to fix this, but one must keep in mind one important difficulty 
with Bacula.  The SD knows what is going on with Volumes, but the Dir does not, 
and it is the Dir that proposes Volumes to the SD.  Currently there is no good 
atomic way to pass the information in the SD to the Dir so that it can make 
better decisions.

So, with the (current) constraint that the solution must involve changing only 
the SD algorithm, how could one prevent this from happening?  I have some 
ideas, but wonder what you think.


Setting PreferMountedVolumes=no causes the three jobs to select a drive that is 
NOT already mounted with a volume from the pool. This allows jobs writing to 
the same pool to select different volumes from the pool, rather than all 
selecting the same next available volume. This has its own caveats. It doesn't 
necessarily prevent two jobs from selecting the same volume in some cases, 
meaning that they will want to swap the volume back and forth between drives, 
which is another type of race condition. I have used this method successfully 
for a pool containing full backups only by setting PreferMountedVolumes=no in 
the job resource and setting MaximumVolumeJobs=1 in the pool resource. Since 
Bacula selects the volume for a job in an atomic manner, this forces an 
exclusive set of volumes for each job, thus preventing the race condition. This 
means that concurrency is limited only by the number of drives, but at the 
"expense" of creating a greater number of smaller volume files. I quote 
"expense" because on a disk vchanger it isn't usually a big issue to have more 
volume files. Doing this with a tape autochanger would use a lot more tapes and 
be truly more expensive. Of course unlimited concurrency is theoretical, since 
the hardware limits the USEFUL concurrency.
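
The workaround described above can be sketched as the following bacula-dir.conf 
fragment. Treat it as a minimal sketch rather than a complete configuration: 
the job and pool names are hypothetical, and all other required directives 
(Client, FileSet, Storage, Schedule and so on) are left out.

```
# bacula-dir.conf (fragment; names are hypothetical)
Job {
  Name = "example-full-backup"
  Level = Full
  Pool = "ExampleFullPool"
  Prefer Mounted Volumes = no     # pick a drive that has no volume from this pool mounted
}

Pool {
  Name = "ExampleFullPool"
  Pool Type = Backup
  Maximum Volume Jobs = 1         # each volume holds exactly one job
}
```

The trade-off is the one already noted: one volume file per job, which is cheap 
on a disk vchanger but would consume many more tapes on a tape autochanger.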

I really do not like the PreferMountedVolumes = No option (I have probably said 
this many times), but I find your use of it very well explained and very 
interesting.

Best regards,
Kern

...


