Any reason for not updating to Bacula v7? It contains a number of fixes as
well as new features. The version you are running is nearly 2 years old;
there were a few bug fixes along the way, but no updates since
April 2014.
Patti Clark
Linux System Administrator
R&D Systems Support Oak Ridge National Laboratory
From: Robert Heinzmann <r.heinzmann AT freelancer.traviangames DOT com>
Date: Tuesday, March 3, 2015 at 3:37 AM
To: bacula-users AT lists.sourceforge DOT net
Subject: [Bacula-users] Bacula SD 5.2.13 crash - Mutex lock failure. ERR=Invalid argument
Hello,
We are using Bacula 5.2.13-18 on CentOS6, and from time to time bacula-sd
crashes with the error below, causing all backups to fail until bacula-sd is
started again:
Mar 3 06:59:00 XXXX bacula-sd: XXXX:storage:default: ABORTING due to ERROR in lockmgr.c:100#012Mutex lock failure. ERR=Invalid argument
Mar 3 06:59:00 XXXX bacula-sd: Bacula interrupted by signal 6: IOT trap
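A note on reading the log line above: `#012` is how rsyslog escapes an embedded newline (octal 012), so the abort message is really two lines. The following is a minimal Python sketch, assuming that escaping convention, which unescapes the message and pulls out the source file, line number and error text. The function names and the regular expression are illustrative only and are not part of Bacula.

```python
import re

# Message text copied from the syslog line above (timestamp and host omitted).
RAW = ("XXXX:storage:default: ABORTING due to ERROR in "
       "lockmgr.c:100#012Mutex lock failure. ERR=Invalid argument")

def decode_syslog_escapes(text: str) -> str:
    """Replace rsyslog-style #ooo octal escapes with the character they encode."""
    return re.sub(r"#([0-7]{3})", lambda m: chr(int(m.group(1), 8)), text)

def parse_abort(text: str):
    """Return (source_file, line_number, error_text), or None if no abort is found."""
    decoded = decode_syslog_escapes(text)
    m = re.search(r"ABORTING due to ERROR in (\S+?):(\d+)\n(.*)", decoded)
    if m is None:
        return None
    return m.group(1), int(m.group(2)), m.group(3).strip()

if __name__ == "__main__":
    print(parse_abort(RAW))
```

Run over a whole syslog file, something like this could help correlate each abort with the jobs that were active at that time.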
Setup:
3 Servers:
1 Bacula Director (extra machine)
1 Bacula Catalog Server (extra machine)
1 Bacula Storage Daemon (extra machine)
We have ~573 jobs (some TB, all full backups) to back up each day. Jobs are
distributed across the day depending on the minimum load of the server, and
evenly otherwise:
Time Jobs
0:00-1:00 35
1:00-2:00 121
2:00-3:00 93
3:00-4:00 60
4:00-5:00 46
5:00-6:00 71
6:00-7:00 60
7:00-8:00 43
8:00-9:00 32
9:00-10:00 12
10:00-11:00 7
11:00-12:00 3
12:00-13:00 5
13:00-14:00 2
14:00-15:00 7
15:00-16:00 8
16:00-17:00 7
17:00-18:00 3
18:00-19:00 2
19:00-20:00 3
20:00-21:00 11
21:00-22:00 14
22:00-23:00 28
23:00-24:00 25
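To put the table in relation to the storage daemon's concurrency, here is a small Python sketch that finds the busiest hour and the average number of jobs each slot would have to finish in that hour. The slot count of 20 is an assumption taken from the "Maximum Concurrent Jobs = 20" setting in the Director configuration; the function names are made up for illustration.

```python
# Jobs started per hour, copied from the table above (index = starting hour).
JOBS_PER_HOUR = [35, 121, 93, 60, 46, 71, 60, 43, 32, 12, 7, 3,
                 5, 2, 7, 8, 7, 3, 2, 3, 11, 14, 28, 25]

# Assumption: taken from "Maximum Concurrent Jobs = 20" in the Storage resource.
CONCURRENT_SLOTS = 20

def busiest_hour(jobs):
    """Return (hour, count) for the hour with the most scheduled jobs."""
    hour = max(range(len(jobs)), key=lambda h: jobs[h])
    return hour, jobs[hour]

def jobs_per_slot(count, slots=CONCURRENT_SLOTS):
    """Average number of jobs each slot must complete within one hour."""
    return count / slots

if __name__ == "__main__":
    hour, count = busiest_hour(JOBS_PER_HOUR)
    print(f"table total: {sum(JOBS_PER_HOUR)} jobs")
    print(f"busiest hour: {hour}:00-{hour + 1}:00 with {count} jobs")
    print(f"average jobs per slot in that hour: {jobs_per_slot(count):.2f}")
```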
Our SD is configured with 20 virtual drives in a backup2disk setup allowing 20
concurrent backups to disk. Each Backup Job is an individual file in the
backend (so full backups can be accessed and restored through bls/bextract). We
have an external “scripted” job, which cleans up unused / purged volumes from
disk.
Bacula Director Configuration:
------------------------------
Storage {
Name = "XXXX:storage:default"
Address = HOSTNAME_OF_THE_SD_MACHINE
Password = "SECRET"
Device = "FileStorage"
Maximum Concurrent Jobs = 20
Media Type = File
Heartbeat Interval = 15
TLS Enable = no
}
Pool {
Name = " HOSTNAME_OF_THE_SD_MACHINE:pool:default"
Storage = "XXXX:storage:default"
# All Volumes will have the format standard.date.time to ensure they
# are kept unique throughout the operation and also aid quick analysis
# We won't use a counter format for this at the moment.
Label Format = "BACULA-${Job}.${Year}${Month:p/2/0/r}${Day:p/2/0/r}.${Hour:p/2/0/r}${Minute:p/2/0/r}.${JobId}"
Pool Type = Backup
# Clean up any we don't need, and keep them for a maximum of a month (in
# theory the same time period for weekly backups from the clients)
# Note the files for the old volumes will still remain on the disk but will
# be truncated to a zero size.
Recycle = No
Auto Prune = Yes
Action On Purge = Truncate
Volume Retention = 30 days
# Don't allow re-use of volumes; one volume per job only
Maximum Volume Jobs = 1
}
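To illustrate the volume names this Label Format produces, here is a rough Python approximation. It assumes that `p/2/0/r` pads a field to two characters with leading zeros; the job name and job id used in the example are hypothetical, and the function is not part of Bacula.

```python
from datetime import datetime

def volume_label(job: str, job_id: int, when: datetime) -> str:
    """Approximate the Label Format above: zero-padded date and time fields."""
    return (f"BACULA-{job}."
            f"{when.year}{when.month:02d}{when.day:02d}."
            f"{when.hour:02d}{when.minute:02d}."
            f"{job_id}")

if __name__ == "__main__":
    # "example-job" and 12345 are hypothetical values for illustration only.
    print(volume_label("example-job", 12345, datetime(2015, 3, 3, 6, 59)))
```

Because the job id is part of the name and Maximum Volume Jobs is 1, every job ends up in its own uniquely named volume file.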
Bacula SD Configuration:
------------------------------
Autochanger {
Name = "FileStorage"
Changer Device = /dev/null
Changer Command = ""
Device = FileStorage-sd-0
Device = FileStorage-sd-1
Device = FileStorage-sd-2
Device = FileStorage-sd-3
Device = FileStorage-sd-4
Device = FileStorage-sd-5
Device = FileStorage-sd-6
Device = FileStorage-sd-7
Device = FileStorage-sd-8
Device = FileStorage-sd-9
Device = FileStorage-sd-10
Device = FileStorage-sd-11
Device = FileStorage-sd-12
Device = FileStorage-sd-13
Device = FileStorage-sd-14
Device = FileStorage-sd-15
Device = FileStorage-sd-16
Device = FileStorage-sd-17
Device = FileStorage-sd-18
Device = FileStorage-sd-19
Device = FileStorage-sd-20
}
Autochanger {
Name = "FileStorage-restore"
Changer Device = /dev/null
Changer Command = ""
Device = FileStorage-sd-restore-0
Device = FileStorage-sd-restore-1
Device = FileStorage-sd-restore-2
Device = FileStorage-sd-restore-3
Device = FileStorage-sd-restore-4
Device = FileStorage-sd-restore-5
Device = FileStorage-sd-restore-6
Device = FileStorage-sd-restore-7
Device = FileStorage-sd-restore-8
Device = FileStorage-sd-restore-9
Device = FileStorage-sd-restore-10
Device = FileStorage-sd-restore-11
Device = FileStorage-sd-restore-12
Device = FileStorage-sd-restore-13
Device = FileStorage-sd-restore-14
Device = FileStorage-sd-restore-15
Device = FileStorage-sd-restore-16
Device = FileStorage-sd-restore-17
Device = FileStorage-sd-restore-18
Device = FileStorage-sd-restore-19
Device = FileStorage-sd-restore-20
}
Backup Drives like this:
Device {
Name = FileStorage-sd-0 # Add a hyphen to SD/autochanger name & match with drive index
Device Type = File
Media Type = File # unique to each archive device path, different path, different mediatype
Archive Device = /bacula/data01
AutomaticMount = yes
AlwaysOpen = yes
RemovableMedia = yes
Autochanger = yes
Drive Index = 0
Maximum Concurrent Jobs = 1
Volume Poll Interval = 5
LabelMedia = yes
Spool Directory = /bacula/spool01
Autoselect = yes
Maximum Network Buffer Size = 65536
}
… 18 more…
Device {
Name = FileStorage-sd-20 # Add a hyphen to SD/autochanger name & match with drive index
Device Type = File
Media Type = File # unique to each archive device path, different path, different mediatype
Archive Device = /bacula/data01
AutomaticMount = yes
AlwaysOpen = yes
RemovableMedia = yes
Autochanger = yes
Drive Index = 20
Maximum Concurrent Jobs = 1
Volume Poll Interval = 5
LabelMedia = yes
Spool Directory = /bacula/spool01
Autoselect = yes
Maximum Network Buffer Size = 65536
}
Restore Drives like this:
Device {
Name = FileStorage-sd-restore-0 # Add a hyphen to SD/autochanger name & match with drive index
Device Type = File
Media Type = File # unique to each archive device path, different path, different mediatype
Archive Device = /bacula/data01
AutomaticMount = yes
AlwaysOpen = yes
RemovableMedia = yes
Autochanger = yes
Drive Index = 0
Maximum Concurrent Jobs = 1
Volume Poll Interval = 5
LabelMedia = yes
Spool Directory = /bacula/spool01
Autoselect = no
Maximum Network Buffer Size = 65536
}
Any idea what’s causing the bacula-sd crash? How can we debug this further?
Regards,
Robert