Subject: [Bacula-users] Multi client, multi storage, multi job configuration
From: Anton Nikiforov <anton AT nikiforov DOT ru>
To: bacula-users AT lists.sourceforge DOT net
Date: Sat, 24 Mar 2012 12:08:36 +0400
Dear ALL!

I'm new to this list, but I have been using Bacula for years and have been happy with it.

Now, however, I have run into a problem with my Bacula configuration:

I have the following:

Director:
Director {
        Name = dir
        QueryFile = "/etc/bacula/scripts/query.sql"
        WorkingDirectory = "/var/lib/bacula"
        PidDirectory = "/var/run/bacula"
        Maximum Concurrent Jobs = 50
        Password = "XXXXX"
        Messages = Daemon
        DirAddresses = {
                ip =  {
                        addr = 172.16.248.1
                        port = 9101
                }
        }
        FD Connect Timeout = 120 seconds
        SD Connect Timeout = 120 seconds
        Statistics Retention = 90 days
}

4 storages:
Storage {
        Name = storage1
        WorkingDirectory = "/var/lib/bacula"
        Pid Directory = "/var/run/bacula"
        Maximum Concurrent Jobs = 50
        Heartbeat Interval = 30 seconds
        Client Connect Wait = 1 minute
        SDAddresses = {
                ip = {
                        addr = 172.16.248.1
                        port = 9103
                }
        }
}
(All four storages have different addresses and run on different machines.)
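
On the Director side, each of these SDs is referenced by a Storage resource. It looks roughly like this (the address, password and concurrency value here are placeholders, not my real settings):

Storage {
        Name = storage1
        Address = 172.16.248.1          # address of the machine running storage1's SD
        SDPort = 9103
        Password = "XXXXX"              # must match what storage1's SD expects from this Director
        Device = storage1-device1       # the Device resource defined on that SD
        Media Type = File
        Maximum Concurrent Jobs = 4     # example value: how many jobs the Director may send to this SD at once
}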

With one Device on each storage:
Device {
        Name = storage1-device1
        Media Type = File
        Archive Device = /bacula/archive01/
        LabelMedia = yes;
        Random Access = Yes;
        AutomaticMount = yes;
        RemovableMedia = no;
        AlwaysOpen = no;
        Maximum Volume Size = 4699Mb
        Maximum Concurrent Jobs = 1
        Maximum File Size = 4699Mb
        Maximum Network Buffer Size = 262144
}

And 6 clients:
FileDaemon {
  Name = client1
  FDport = 9102
  WorkingDirectory = /var/lib/bacula
  Pid Directory = /var/run/bacula
  Maximum Concurrent Jobs = 10
  FDAddress = 172.16.248.1
}
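
On the Director side, each of these FDs has a matching Client resource, roughly like this (the catalog name, address and password are placeholders):

Client {
        Name = client1
        Address = 172.16.248.1          # address of the machine running client1's FD
        FDPort = 9102
        Catalog = MyCatalog             # placeholder catalog name
        Password = "XXXXX"              # must match what client1's FD expects from this Director
        Maximum Concurrent Jobs = 10    # the per-client limit I refer to below
}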

Each client should be backed up on each storage, so I have one job per
client/storage pair (6 clients x 4 storages = 24 jobs), each like this:
Job {
        Name = "daily-db-client1-backup-on-storage1"
        Type = Backup
        Client = client1
        FileSet="db-fileset"
        Storage = storage1
        Pool = Default
        Full Backup Pool = Full-Pool
        Incremental Backup Pool = Inc-Pool
        Differential Backup Pool = Diff-Pool
        RunScript {
                Command = "/usr/local/bin/database-backup.sh"
                RunsOnClient = Yes
                RunsWhen = Before
        }
        Messages = Daemon
        Schedule = "Schedule1"
        Maximum Concurrent Jobs = 2
}
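
The pools and the schedule referenced by the job are ordinary resources; I sketch them here only so the job definition above is readable (the label format, retention and run times are illustrative, not my real values, and Default, Inc-Pool and Diff-Pool are defined the same way as Full-Pool):

Pool {
        Name = Full-Pool
        Pool Type = Backup
        Label Format = "Full-"
        Volume Retention = 3 months
        AutoPrune = yes
        Recycle = yes
}

Schedule {
        Name = "Schedule1"
        Run = Full 1st sun at 01:05
        Run = Differential 2nd-5th sun at 01:05
        Run = Incremental mon-sat at 01:05
}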

The main problem is this: with the configuration shown above I can back up
all clients on all storages concurrently, but it also lets the Director
start several jobs for the same client at the same time. When that happens
(the database backup script starts twice or even four times in parallel),
the database dump becomes unusable, and the servers hang dumping the
databases for a very long time.

When I lower the client's Maximum Concurrent Jobs to 1 or 2, I end up in a
situation where all jobs sit in "is waiting on max Client jobs" and some of
them in "is waiting on Storage storage1" (or storage2, storage3, storage4),
and the server waits forever for the jobs to finish.

When I raise the client limit back to 10 and lower the Director's concurrent
jobs to 1, I get a non-concurrent configuration that works correctly, but it
takes far too long to finish all the jobs. Longer than I can allow.

Could you please help me resolve this? I need to run 4 jobs concurrently
(so that all 4 storages are busy at the same time), but they must be for
4 different clients. Two storages backing up the same client at the same
time must never happen; it makes the backups completely unusable.

Checking for a concurrent run inside the backup script itself is not a
solution either; it amounts to the same thing as having no concurrency.

Sorting the jobs is not a solution either. Depending on job levels, server
load and so on, I cannot rely on such a configuration, and during tests I
still ended up with two jobs for one client starting at the same time.

Best regards,
Anton Nikiforov

_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users