Re: [Bacula-users] wanted on DEVICE-0, is in use by device DEVICE-1

2012-11-05 13:47:03
From: Stephen Thompson <stephen AT seismo.berkeley DOT edu>
To: bacula-users AT lists.sourceforge DOT net
Date: Mon, 05 Nov 2012 10:42:22 -0800
On 11/05/12 08:03, Stephen Thompson wrote:
>
>
> On 11/5/12 7:59 AM, John Drescher wrote:
>>> I've had the following problem for ages (meaning multiple major
>>> revisions of bacula) and I've seen this come up from time to time on the
>>> mailing list, but I've never actually seen a resolution (please point me
>>> to one if it's been found).
>>>
>>>
>>> background:
>>>
>>> I run monthly Fulls and nightly Incrementals.  I have a 2 drive
>>> autochanger dedicated to my Incrementals.  I launch something like ~150
>>> Incremental jobs each night.  I am configured for 8 concurrent jobs on
>>> the Storage Daemon.
>>>
>>>
>>> PROBLEM:
>>>
>>> The first job(s) grab one of the 2 devices available in the changer
>>> (which is set to AutoSelect) and either load a tape, or use a tape from
>>> the previous evening.  All tapes in the changer are in the same
>>> Incremental-Pool.
>>>
>>> The second job(s) grab the other of the 2 devices available in the
>>> changer, but want to use the same tape that's just been mounted (or put
>>> into use) by the jobs that got launched first.  They will often literally
>>> wait the entire evening while 100's of jobs run through on only one
>>> device, until that tape is freed up, at which point it is unmounted from
>>> the first device and moved to the second.
>>>
>>> Note, the behaviour seems to be to round-robin my concurrency limit of 8
>>> between the 2 available drives, which means 4 jobs will run, and 4 jobs
>>> will block waiting for the wanted Volume.  When the original 4 jobs
>>> are completed (not at the same time) additional jobs are launched that
>>> keep that wanted Volume in use.
>>>
>>>
>>> LOG:
>>>
>>> 03-Nov 22:00 DIRECTOR JobId 267433: Start Backup JobId 267433,
>>> Job=JOB.2012-11-03_22.00.00_04
>>> 03-Nov 22:00 DIRECTOR JobId 267433: Using Device "L100-Drive-0"
>>> 03-Nov 22:00 DIRECTOR JobId 267433: Sending Accurate information.
>>> 03-Nov 22:00 sd_L100_ JobId 267433: 3307 Issuing autochanger "unload
>>> slot 82, drive 0" command.
>>> 03-Nov 22:06 lawson-sd_L100_ JobId 267433: Warning: Volume "IM0108"
>>> wanted on "L100-Drive-0" (/dev/L100-Drive-0) is in use by device
>>> "L100-Drive-1" (/dev/L100-Drive-1)
>>> 03-Nov 22:09 sd_L100_ JobId 267433: Warning: Volume "IM0108" wanted on
>>> "L100-Drive-0" (/dev/L100-Drive-0) is in use by device "L100-Drive-1"
>>> (/dev/L100-Drive-1)
>>> 03-Nov 22:09 sd_L100_ JobId 267433: Warning: mount.c:217 Open device
>>> "L100-Drive-0" (/dev/L100-Drive-0) Volume "IM0108" failed: ERR=dev.c:513
>>> Unable to open device "L100-Drive-0" (/dev/L100-Drive-0): ERR=No medium
>>> found
>>> .
>>> .
>>> .
>>>
>>>
>>> CONFIGS (partial and seem pretty straight-forward):
>>>
>>> Schedule {
>>>      Name = "DefaultSchedule"
>>>      Run = Level=Incremental                               sat-thu at 22:00
>>>      Run = Level=Differential                              fri     at 22:00
>>> }
>>>
>>> JobDefs {
>>>      Name = "DefaultJob"
>>>      Type = Backup
>>>      Level = Full
>>>      Schedule = "DefaultSchedule"
>>>      Incremental Backup Pool = Incremental-Pool
>>>      Differential Backup Pool = Incremental-Pool
>>> }
>>>
>>> Pool {
>>>      Name = Incremental-Pool
>>>      Pool Type = Backup
>>>      Storage = L100-changer
>>> }
>>>
>>> Storage {
>>>      Name = L100-changer
>>>      Device = L100-changer
>>>      Media Type = LTO-3
>>>      Autochanger = yes
>>>      Maximum Concurrent Jobs = 8
>>> }
>>>
>>> Autochanger {
>>>      Name = L100-changer
>>>      Device = L100-Drive-0
>>>      Device = L100-Drive-1
>>>      Changer Device = /dev/L100-changer
>>> }
>>>
>>> Device {
>>>      Name = L100-Drive-0
>>>      Drive Index = 0
>>>      Media Type = LTO-3
>>>      Archive Device = /dev/L100-Drive-0
>>>      AutomaticMount = yes;
>>>      AlwaysOpen = yes;
>>>      RemovableMedia = yes;
>>>      RandomAccess = no;
>>>      AutoChanger = yes;
>>>      AutoSelect = yes;
>>> }
>>>
>>> Device {
>>>      Name = L100-Drive-1
>>>      Drive Index = 0
>>>      Media Type = LTO-3
>>>      Archive Device = /dev/L100-Drive-1
>>>      AutomaticMount = yes;
>>>      AlwaysOpen = yes;
>>>      RemovableMedia = yes;
>>>      RandomAccess = no;
>>>      AutoChanger = yes;
>>>      AutoSelect = yes;
>>> }
>>>
>>
>> I do not have a good solution but I know by default bacula does not
>> want to load the same pool into more than 1 storage device at the same
>> time.
>>
>> John
>>
>
> I think it's something in the automated logic.  Because if I launch jobs
> by hand (same pool across 2 tape devices in the same autochanger)
> everything works fine.  I think it has more to do with the Scheduler
> assigning the same Volume to all jobs and then not wanting to
> change that choice if that Volume is in use.
>

I also use Accurate backups, which can sometimes take a while before the
job gets back to volume/drive assignments, so it might be a race
condition: when the blocking jobs start, they still want the same
Volume as the running jobs, because the running jobs are still setting
up the Accurate backup and haven't been solidly assigned that Volume yet.
I don't know.  It's rather annoying, especially as we attempt to ramp up
our backup capacity.

Lastly, it doesn't ALWAYS happen, though it does seem to happen more 
often than not.
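
To make the arithmetic from my original description concrete, here is a
rough toy model in Python.  It is only a sketch of the behaviour I observe,
not Bacula's actual scheduling code: the function name, the round-robin
assignment, and the rule that the first job's drive ends up holding the
Volume are all assumptions made for illustration.

```python
def simulate_first_wave(num_jobs, drives, max_concurrent, volume):
    """Toy model of the first wave of concurrent jobs (hypothetical logic).

    Assumptions, not taken from Bacula's source:
      * jobs are handed to the drives round-robin,
      * every job wants the same Volume,
      * the Volume ends up on the drive of the first job.
    Jobs on any other drive block until the Volume is released.
    """
    mounted_on = None            # drive currently holding the wanted Volume
    running, blocked = [], []
    for job_id in range(min(num_jobs, max_concurrent)):
        drive = drives[job_id % len(drives)]   # round-robin assignment
        if mounted_on is None:
            mounted_on = drive                 # first job mounts the Volume
        if drive == mounted_on:
            running.append((job_id, drive))
        else:
            blocked.append((job_id, drive))    # "wanted on X, in use by Y"
    return mounted_on, running, blocked


drives = ["L100-Drive-0", "L100-Drive-1"]
mounted_on, running, blocked = simulate_first_wave(150, drives, 8, "IM0108")
print(f"Volume IM0108 mounted on {mounted_on}")
print(f"running: {len(running)}, blocked: {len(blocked)}")
```

With 8 concurrent jobs and 2 drives, this model leaves 4 jobs running on
the drive that holds the Volume and 4 blocked on the other drive, which
matches what I see most nights.  Which drive actually gets the Volume is
arbitrary here.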



> If I do a status on the Director for instance and see the jobs for the
> next day lined up in Scheduled jobs, they all have the same Volume listed.
>
> thanks,
> Stephen
>


-- 
Stephen Thompson               Berkeley Seismological Laboratory
stephen AT seismo.berkeley DOT edu    215 McCone Hall # 4760
404.538.7077 (phone)           University of California, Berkeley
510.643.5811 (fax)             Berkeley, CA 94720-4760
