[Bacula-users] duplicate job storage device bug?

2013-08-04 01:00:23
From: Stephen Thompson <stephen AT seismo.berkeley DOT edu>
To: bacula-users AT lists.sourceforge DOT net
Date: Sat, 03 Aug 2013 19:01:48 -0700

Hey all,

Figured I'd throw this out there before opening a ticket in case this is 
already known or I'm just confused.

We use duplicate job control for the following reason:  We run nightly 
Incrementals of _all_ jobs.  Then rather than running Fulls on a cyclic 
schedule, we run them back-to-back, injecting a few at a time via 
scripts.  Note, we also have two tape libraries (and two SDs), one for 
Incremental Pools and one for Full Pools.

Where duplicate job control comes in is that we want a running 
Incremental to be canceled if a Full of the same job is launched on any 
given night since the Full, in our case, should take precedence and be 
run immediately.  What we see is that the Full does indeed cancel the 
running Incremental and then runs itself, HOWEVER the Full job takes on 
the storage properties (storage device) of the canceled Incremental job 
rather than using its own settings.  The Full job then expects its Full 
Pool tape to be in the Incremental tape library, which it is not, and 
the job stalls for operator intervention.

Here are some config snippets:

   Maximum Concurrent Jobs = 2
   Allow Duplicate Jobs = no
   Cancel Lower Level Duplicates = yes
   Cancel Running Duplicates = no
   Cancel Queued Duplicates = no

Log snippets:

(incremental launches)
03-Aug 04:05 DIRECTOR JobId 316646: Start Backup JobId 316646, Job=CLIENT.2013-08-02_22.01.01_50
03-Aug 04:05 DIRECTOR JobId 316646: Using Device "L100-Drive-0" to write.

(full launches and cancels incremental)
03-Aug 06:20 DIRECTOR JobId 316677: Cancelling duplicate JobId=316646.
03-Aug 06:20 DIRECTOR JobId 316677: 2001 Job sutter_5.2013-08-02_22.01.01_50 marked to be canceled.
03-Aug 06:20 DIRECTOR JobId 316677: Cancelling duplicate JobId=316646.
03-Aug 06:20 DIRECTOR JobId 316677: 2901 Job sutter_5.2013-08-02_22.01.01_50 not found.
03-Aug 06:20 DIRECTOR JobId 316677: 3904 Job sutter_5.2013-08-02_22.01.01_50 not found.
03-Aug 08:20 DIRECTOR JobId 316677: Start Backup JobId 316677, Job=sutter_5.2013-08-03_06.20.02_04

(full complains that the volume it tried to load is an incremental tape 
instead of a full tape)
03-Aug 08:22 DIRECTOR JobId 316677: Using Device "L100-Drive-0" to write.
03-Aug 08:22 SD_L100_ JobId 316677: 3304 Issuing autochanger "load slot 72, drive 0" command.
03-Aug 08:23 SD_L100_ JobId 316677: 3305 Autochanger "load slot 72, drive 0", status is OK.
03-Aug 08:23 SD_L100_ JobId 316677: Warning: Director wanted Volume "FB0718".
     Current Volume "IM0097" not acceptable because:
     1998 Volume "IM0097" catalog status is Full, not in Pool.

NOTE: Full job launch command was "run job=sutter_5 level=Full 
storage=SL500-Drive-1 yes" and yet, apparently, due to the job duplicate 
cancellation, the Full job instead attempted to use "storage=L100-Drive-0".
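
For anyone wanting to spot this before a job stalls, here is a rough 
Python sketch that scans Director log text for "Using Device" lines and 
flags jobs that ended up on a device other than the one requested.  It 
is only an illustration: the function names and the `expected` mapping 
are hypothetical (something the script that injects the Fulls would have 
to maintain), and it assumes the storage name passed to "run" matches 
the device name printed in the log, as it appears to in the excerpts 
above.

```python
import re

# Matches Director log lines of the form shown above, e.g.
#   03-Aug 08:22 DIRECTOR JobId 316677: Using Device "L100-Drive-0" to write.
DEVICE_RE = re.compile(r'JobId (\d+): Using Device "([^"]+)" to write')


def devices_by_job(log_text):
    """Return {job_id: device_name} for every 'Using Device' line found."""
    return {int(m.group(1)): m.group(2) for m in DEVICE_RE.finditer(log_text)}


def mismatched_jobs(log_text, expected):
    """Return {job_id: (expected_device, actual_device)} for wrong-device jobs.

    `expected` is a hypothetical {job_id: device_name} mapping supplied by
    the caller; jobs that never logged a device are skipped.
    """
    actual = devices_by_job(log_text)
    return {
        job_id: (want, actual[job_id])
        for job_id, want in expected.items()
        if job_id in actual and actual[job_id] != want
    }


# Two lines copied from the log excerpts in this post.
SAMPLE_LOG = (
    '03-Aug 04:05 DIRECTOR JobId 316646: Using Device "L100-Drive-0" to write.\n'
    '03-Aug 08:22 DIRECTOR JobId 316677: Using Device "L100-Drive-0" to write.\n'
)

if __name__ == "__main__":
    # The Full (316677) was launched with storage=SL500-Drive-1.
    print(mismatched_jobs(SAMPLE_LOG, {316677: "SL500-Drive-1"}))
```

Run against the two sample lines, this reports JobId 316677 as requested 
on SL500-Drive-1 but actually writing to L100-Drive-0.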


thanks,
Stephen
-- 
Stephen Thompson               Berkeley Seismological Laboratory
stephen AT seismo.berkeley DOT edu    215 McCone Hall # 4760
510.214.6506 (phone)           University of California, Berkeley
510.643.5811 (fax)             Berkeley, CA 94720-4760
