Bacula-users

Re: [Bacula-users] Tape changer with simultaneous jobs, using (but not actually using) a tape

2010-03-31 15:34:34
From: Craig Miskell <craig.miskell AT opus.co DOT nz>
To: bacula-users <bacula-users AT lists.sourceforge DOT net>
Date: Thu, 01 Apr 2010 08:31:21 +1300
Martin Simmons wrote:
>>>> I'm happy for it to interleave, but I'm curious as to why it touches that 
>>>> first tape (7) but then decides not to use it. 
>>>>    It's effectively recycling/nuking a backup that doesn't need to be, 
>>>> requiring two tapes where one will do.   Anyone 
>>>> got any idea what's going on?  Happy to file a bug if it is one, but don't 
>>>> want to false report.
>>> I suggest trying it in bacula 5.0.1 first.
>> Is that because you have some concrete knowledge of a fix that is in 5.0.1, 
>> or just a blind "upgrade to the latest" 
>> suggestion?  It's a non-trivial amount of work for something that might just 
>> be speculation.
> 
> A bit of both.
> 
> There was some concurrency issue like this, but I can't remember which version
> it was.
Right, glad it's got some basis.  I'm generally of the opinion (based
loosely on experience) that if you call some vendor's tech support and
report that there's smoke coming from the server, they'll ask you to
upgrade to the latest firmware to see if that resolves the issue.  That
mindset unfortunately carries over to other scenarios :)

>> I'm going to try to work around this by setting Maximum Concurrent Jobs = 1 
>> on each tape drive in the changer, and see 
>> what works.  If I avoid the interleaving, I should avoid the unnecessary 
>> recycling.   It's also possible that I'd get 
>> some results by starting one of the jobs 15 minutes later than the other, by 
>> which time it'll already have a tape ready 
>> and writing for the first job, so it'll use that one.
>>
>> I'd still be keen on some sort of understanding of what's going on in the 
>> first place though; my lack of understanding 
>> disturbs me.
> 
> What "Max Volume jobs" is set for 000006L4?  It seems to think this has been
> exceeded immediately after it was recycled.
Set to 1 for both jobs, so it is exceeded as soon as the volume starts
getting written to.  That has always seemed correct behaviour to me; am
I missing something?
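
For anyone following along, a rough sketch of where that limit usually
lives, assuming it is set through the Pool resource in bacula-dir.conf.
The pool name and the other directives shown here are placeholders for
illustration, not my actual configuration:

```
# bacula-dir.conf (sketch; the pool name is hypothetical)
Pool {
  Name = ExamplePool          # placeholder name, not from this thread
  Pool Type = Backup
  Recycle = yes               # assumed; allows volumes to be recycled
  AutoPrune = yes             # assumed
  Maximum Volume Jobs = 1     # the limit discussed above: one job per volume
}
```

Pool directives like this act as defaults for volumes created in the
pool; a volume that already exists keeps its own stored value until it
is updated from the console.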

I must say this setting, combined with the observed behaviour, confused
me for a while too (i.e. why was it interleaving with that set), but I
found some comments on the mailing list from a few years back about race
conditions, and about not relying on "Maximum Volume Jobs" to avoid
interleaving, which made some sense.  I expect I've run into much the
same sort of race condition, with some extra mildly unfortunate
consequences.

Anyway, my original intent was one job per tape, and that's what I'm
back to now.  The solution in this case was to set
Maximum Concurrent Jobs = 1 in the configuration of each drive, as the
most reliable (to my mind) way to avoid interleaving.  With that set,
each job ends up using the volume it reserved when it started.  That
worked properly last night, so I'm pretty happy now.  Hope that helps
someone else someday.
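
A minimal sketch of that setting, assuming it goes in the Device
resource for each drive in bacula-sd.conf.  The device name, media type,
device path and drive index are made up for illustration; only the
Maximum Concurrent Jobs line is the change described above:

```
# bacula-sd.conf (sketch; names and paths are hypothetical)
Device {
  Name = Drive-0                  # placeholder name
  Media Type = LTO-4              # assumed from the volume name 000006L4
  Archive Device = /dev/nst0      # placeholder device path
  Autochanger = yes
  Drive Index = 0
  Maximum Concurrent Jobs = 1     # one job at a time on this drive, so no interleaving
}
```

The same line would be repeated in the Device resource of every other
drive in the changer.  The trade-off is that jobs queue up for a free
drive rather than sharing one.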

Craig
