Re: [Bacula-users] Device is BLOCKED - renamed Bugged or not?

2015-04-27 09:21:35
From: Josh Fisher <jfisher AT pvct DOT com>
To: Kern Sibbald <kern AT sibbald DOT com>, "bacula-users AT lists.sourceforge DOT net" <bacula-users AT lists.sourceforge DOT net>
Date: Mon, 27 Apr 2015 09:18:06 -0400
On 4/25/2015 1:50 AM, Kern Sibbald wrote:
> In my last email, I did forget to mention that as you point out, the
> problem can also result from a design issue.  And the resolution of
> those problems from design issues falls into my point 2.  If we have a
> good test case that shows the problem, even if it results from a design
> decision, most of the time we can find a solution -- in some cases, we
> have added new directives, but in most cases, a bit more
> programming/logic can fix the problem.
>
> One of the biggest issues that I have with the current SD algorithm is
> that during the drive(s) reservation process (prior to starting the SD
> job) once a write drive is assigned, it cannot be changed.  Changing a
> drive when multiple simultaneous jobs are writing is a non-trivial
> problem.  There are solutions, but they require rather profound changes
> to the SD, which I have been planning for at least 5 years -- all the
> underlying code and algorithms now exist so it is a matter of time.

Thank you Kern. That is good news!

Have you considered assigning a device-volume pair in one step, rather
than making a device assignment and a separate volume assignment? I have
found that the easiest way to avoid thread-related issues is to minimize
the number of things that must be serialized. Since a job, at any given
instant, always requires both a device and a volume, it might make sense
to assign both at the same time as a single atomic operation. The pair
assignment code can be serialized by a single mutex, and I believe that
would greatly simplify the device and volume assignment code, as well as
allow a job's device to be changed safely.

Any time a job requires a volume to write on, whether at job startup or
at the end of the previous volume, it requests a device-volume pair to
continue writing on. Since only one job at a time can enter the
assignment code, both device and volume state are guaranteed not to
change while the code checks device and volume criteria, selects a
pair, and unloads / loads the device as needed. In turn, a successful
request guarantees that the pair returned is valid for the job, and an
unsuccessful request means that the job must wait for an appendable
volume. With every pairing decision made under that one mutex, there
should be no window for a race condition in the assignment itself.
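
To make the idea concrete, here is a minimal sketch of a device-volume
pair allocator. It is in Python for brevity and is not taken from the SD
source. Every name in it (Device, Volume, PairAllocator, request_pair,
release_pair) is hypothetical. It also assumes one job per pair, which
is simpler than what the SD has to handle with several jobs writing at
once.

```python
import threading
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Volume:              # hypothetical stand-in for a volume record
    name: str
    pool: str
    appendable: bool = True
    in_use: bool = False


@dataclass
class Device:              # hypothetical stand-in for a drive
    name: str
    busy: bool = False
    loaded: Optional[Volume] = None


class PairAllocator:
    """Hands out (device, volume) pairs as one atomic operation."""

    def __init__(self, devices: List[Device], volumes: List[Volume]):
        self._lock = threading.Lock()   # the single mutex for all pairing
        self._devices = list(devices)
        self._volumes = list(volumes)

    def request_pair(self, pool: str) -> Optional[Tuple[Device, Volume]]:
        """Return a valid pair for the job, or None if it must wait."""
        with self._lock:
            # Device and volume state cannot change while the lock is held.
            for volume in self._volumes:
                if volume.pool != pool or not volume.appendable or volume.in_use:
                    continue
                # Prefer the drive that already holds this volume, if any.
                device = next(
                    (d for d in self._devices if d.loaded is volume), None)
                if device is None:
                    device = next(
                        (d for d in self._devices if not d.busy), None)
                if device is None:
                    return None          # no free drive: job waits
                device.loaded = volume   # stands in for unload / load
                device.busy = True
                volume.in_use = True
                return device, volume
            return None                  # no appendable volume: job waits

    def release_pair(self, device: Device, volume: Volume,
                     volume_full: bool = False) -> None:
        """Give the pair back at job end or at end of volume."""
        with self._lock:
            device.busy = False
            volume.in_use = False
            if volume_full:
                volume.appendable = False
```

A job would call request_pair() at startup and again whenever it reaches
the end of its volume, after first calling release_pair() with
volume_full=True. Because selection and loading happen under the same
lock, a job may come back with a different drive than it had before
without any other job seeing a half-updated state.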


