Subject: Re: [Bacula-users] Tape Drive Contention [WORKAROUND]
From: "Kurzawa, Greg" <GKurzawa AT pamida DOT com>
To: <bacula-users AT lists.sourceforge DOT net>
Date: Wed, 27 Apr 2011 15:11:54 -0500

So the problem as described below is that if I’m trying to create two or more VirtualFulls to the same Storage Resource at the same time, the first Job grabs both the Devices in my Autochanger (one to read and one to write), leaving the second Job to die.  One way to get around this is to stagger my VirtualFull Jobs.  I don’t want to do that because 1) that makes my bacula-dir.conf larger and more complex, and 2) I want those VirtualFull Jobs running one after the other without any wasted time between and without having to guesstimate the duration of each Job.

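Just to make the staggering option concrete, it would mean giving each Client its own Schedule with a guessed-at VirtualFull start time, something along these lines (times and resource names invented for illustration):

Schedule {
  name = "VFull-client1"
  run = Level=VirtualFull mon at 08:00
}

Schedule {
  name = "VFull-client2"
  run = Level=VirtualFull mon at 12:00   # guessing that client1's VirtualFull is finished by now
}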
 

I worked around this issue by writing a simple Perl script which I schedule from cron.  The script keeps a list of all the Clients for which I want a VirtualFull.  It then creates the VirtualFull Job for one Client at a time, checking at one-minute intervals for running Bacula jobs.  If there are running jobs, it waits; if not, it kicks off a VirtualFull for the next Client.  The script is below in case anyone else is dealing with the same issue and happens to be interested in my little workaround.

 

Greg


#!/usr/bin/perl
#
# Kick off a VirtualFull for each Client in the list, one at a time,
# waiting until no other Bacula jobs are running before starting the next.

use strict;
use warnings;
use DBI;

###
### VARIABLES

my @CLIENTS = qw(
   client1
   client2
   client3
   client4
   client5
);

# Catalog database host is passed as the first argument.
my $srv = $ARGV[0];
my ( $user, $password ) = ( "bacula", "" );
my $dbh = DBI->connect( "DBI:mysql:database=bacula;host=$srv", $user, $password )
    or die "Cannot connect to catalog database on $srv: $DBI::errstr\n";

# Count the jobs currently in the running state.
my $query = "SELECT COUNT(j.JobId)
FROM Job j
WHERE j.JobStatus='R'";

my $sth = $dbh->prepare( $query );

### END VARIABLES
###


### sub to create a VirtualFull
### through the Bacula console
sub CreateVirtualFull {

    my $Client = $_[0];

    my @CONSOLE = `/opt/bacula/sbin/bconsole <<END
run job="DAILY:$Client" level=VirtualFull yes
END`;

}


### sub to query the SQL database
### for running jobs
sub CheckRunningJobs {

    $sth->execute();
    my @DATA = $sth->fetchrow_array();
    my $runningJobs = $DATA[0];

    return $runningJobs;

}


### go through the Client list and
### create a VirtualFull for each Client, one at a time
foreach my $Client ( @CLIENTS ) {

    ## if no jobs are running, make the VirtualFull;
    ## otherwise wait 1 minute before trying again
    my $runningJobs = 1;

    until ( $runningJobs == 0 ) {

        $runningJobs = CheckRunningJobs();
        unless ( $runningJobs == 0 ) {
            print "there are running jobs; sleeping ... \n";
            sleep 60;
        }

    }

    print "no running jobs; executing Job for $Client\n";
    CreateVirtualFull($Client);
    sleep 60;

}


$sth->finish();
$dbh->disconnect();

exit 0;

 

 

From: Kurzawa, Greg [mailto:GKurzawa AT pamida DOT com]
Sent: Monday, April 25, 2011 10:36 AM
To: bacula-users AT lists.sourceforge DOT net
Subject: [Bacula-users] Tape Drive Contention

 

Hi everyone,

 

I've got an LTO drive utilization/scheduling problem that I'm hoping someone can help with.

 

I have a Schedule shared by two different Clients.  This schedule runs a daily incremental at 20:00, and a Virtual Full on Mondays at 08:00.  The dailies work great; the part I'm having problems with is the Virtual Full.

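Roughly, the shared Schedule looks like this (the resource name is invented for this message, but the Run lines match the times above):

Schedule {
  name = "DailyIncrVFull"
  run = Level=Incremental daily at 20:00
  run = Level=VirtualFull mon at 08:00
}
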
 

Both Clients have their own disk Pools, but share a tape Pool.

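For reference, the Pool layout is basically this (names invented; retention and label details left out):

Pool {
  name = "client1-disk"     # each Client has its own disk Pool like this
  pool type = Backup
}

Pool {
  name = "LTO4-shared"      # tape Pool shared by both Clients
  pool type = Backup
}
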
 

I have an autochanger with two LTO4 drives:

 

Autochanger {
  name = "TS3310"
  device = ULT3580-TD4_0, ULT3580-TD4_1
  changer device = /dev/changer
  changer command = "/opt/bacula/sbin/mtx-changer %c %o %S %a %d"
}

Device {
  name = "ULT3580-TD4_0"
  media type = LTO4
  changer device = /dev/changer
  archive device = /dev/nst0
  drive index = 0
  autochanger = yes
}

Device {
  name = "ULT3580-TD4_1"
  media type = LTO4
  changer device = /dev/changer
  archive device = /dev/nst1
  drive index = 1
  autochanger = yes
}

 

When the Virtual Full Jobs start, the first Client grabs both Devices, one to read from and one to write to.

The second Client promptly fails when it tries to grab a Device for reading:

 

Fatal error: acquire.c:166 No suitable device found to read Volume

 

What I was expecting was that the second job would see that both LTO4 Devices are in use and wait a certain amount of time before failing.  Does anyone know how I can make this happen short of staggering my schedules?

 

Greg
