On 4/25/2015 1:50 AM, Kern Sibbald wrote:
> In my last email, I did forget to mention that as you point out, the
> problem can also result from a design issue. And the resolution of
> those problems from design issues fall into my point 2. If we have a
> good test case that shows the problem, even if it results from a design
> decision, most of the time we can find a solution -- in some cases, we
> have added new directives, but in most cases, a bit more
> programming/logic can fix the problem.
>
> One of the biggest issues that I have with the current SD algorithm is
> that during the drive(s) reservation process (prior to starting the SD
> job) once a write drive is assigned, it cannot be changed. Changing a
> drive when multiple simultaneous jobs are writing is a non-trivial
> problem. There are solutions, but they require rather profound changes
> to the SD, which I have been planning for at least 5 years -- all the
> underlying code and algorithms now exist so it is a matter of time.
Thank you Kern. That is good news!

Have you considered using a single device-volume pair assignment, rather
than both a device assignment and a separate volume assignment? I have
found that the easiest way to avoid thread-related issues is to minimize
the number of things that must be serialized. Since a job, at any given
instant, always requires both a device and a volume, it might make
sense to assign both at the same time as a single atomic operation.

The device-volume pair assignment code can be serialized by a single
mutex. I believe that would greatly simplify the device and volume
assignment code, and would also allow a job's device to be changed
safely. Any time a job requires a volume to write on, whether at job
startup or at the end of the previous volume, it requests a
device-volume pair to continue writing on.

Since only one job at a time can enter the assignment code, both device
and volume state are guaranteed to be static while the code checks
device and volume criteria, selects a device-volume pair, and unloads or
loads the device as needed. In turn, a successful request guarantees
that the device-volume pair returned is valid for the job, and an
unsuccessful request tells the job that it needs to wait for an
appendable volume.

Treating device and volume as a single unit, protected by a single
mutex, should eliminate race conditions between device assignment and
volume assignment.
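
To make the idea concrete, here is a rough sketch of the kind of
allocator I have in mind. It is written in Python only for brevity and
is not Bacula code: the names PairAllocator, Device, Volume, acquire and
release are all hypothetical, and the selection policy (reuse a drive
that already holds an appendable volume, otherwise load a free volume
into an idle drive) is just an assumption for illustration.

```python
import threading
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Volume:
    name: str
    appendable: bool = True
    in_use_by: Optional[str] = None  # name of the device holding this volume


@dataclass
class Device:
    name: str
    loaded: Optional[Volume] = None
    writers: int = 0  # number of jobs currently writing to this device


class PairAllocator:
    """Hands out (device, volume) pairs under one mutex (hypothetical)."""

    def __init__(self, devices, volumes):
        self._lock = threading.Lock()
        self._devices = list(devices)
        self._volumes = list(volumes)

    def acquire(self) -> Optional[Tuple[Device, Volume]]:
        """Return a valid pair, or None if the job must wait."""
        with self._lock:
            # Device and volume state cannot change while we hold the lock.
            # 1. Prefer a device that already holds an appendable volume.
            for dev in self._devices:
                if dev.loaded is not None and dev.loaded.appendable:
                    dev.writers += 1
                    return dev, dev.loaded
            # 2. Otherwise pair an idle device with a free appendable volume.
            vol = next((v for v in self._volumes
                        if v.appendable and v.in_use_by is None), None)
            dev = next((d for d in self._devices if d.writers == 0), None)
            if vol is None or dev is None:
                return None
            if dev.loaded is not None:
                dev.loaded.in_use_by = None  # "unload" the old volume
            dev.loaded = vol                 # "load" the new volume
            vol.in_use_by = dev.name
            dev.writers = 1
            return dev, vol

    def release(self, dev: Device, volume_full: bool = False) -> None:
        """Give the pair back; mark the volume full if the job hit its end."""
        with self._lock:
            dev.writers -= 1
            if volume_full and dev.loaded is not None:
                dev.loaded.appendable = False
```

A job that reaches the end of its volume would call release(dev,
volume_full=True) and then acquire() again. It may come back with a
different drive, which is the "changing a job's device" case. A return
of None means the job has to wait for an appendable volume. Real code
would also need to handle pools, media types and the blocking wait,
which I have left out.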