Bumping this thread because I have another issue I am unable to resolve.
I am running:
bacula 7.0.5 compiled from source
3 pools.
Multiple jobs per pool.
10 tape drives
1 autochanger for those 10 drives.
1600 slots
Running concurrent jobs works.
I have 41 volumes (tapes) in the virtual changer with 10 drives.
Bacula very quickly loses track of which volume is in which slot: what the Bacula database shows (via the select query)
no longer matches the output of the mtx-changer --list all command. See below.
As these volumes get more and more out of sync, my jobs hang waiting to mount a volume, which cascades into exceeding max jobs
as the running jobs start to queue. I'm wondering if this is a bug due to the number of slots.
The fix is to release all 10 drives and run update slots. But I am having to do this daily now, which is unacceptable.
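A minimal sketch of that workaround, written as a small Python helper that only builds the bconsole command strings rather than running them. The storage name `MyAutochanger` is hypothetical, and the zero-based drive numbering and the `drive=` argument on `release` are assumptions about how the drives are indexed in this setup; adjust them to match your own Storage resource.

```python
def resync_commands(storage, drive_count):
    """Build the bconsole commands for the manual resync described above.

    storage     -- name of the Storage resource (hypothetical in this example)
    drive_count -- number of drives; assumed to be indexed from 0
    """
    # Release every drive so that all tapes go back to their slots.
    cmds = [f"release storage={storage} drive={n}" for n in range(drive_count)]
    # Then have Bacula re-read the changer contents into the catalog.
    cmds.append(f"update slots storage={storage} drive=0")
    return cmds


# "MyAutochanger" is a placeholder name, not taken from the configuration in this thread.
print("\n".join(resync_commands("MyAutochanger", 10)))
```

The printed lines could be pasted into bconsole or piped to it from a cron job, although that only automates the workaround and does not address why the slots drift.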
When I was setting this up I started with only 10 virtual tapes (volumes) and things ran smoothly.
I have since added 31 more tapes so that jobs do not fail if all the tapes in a pool become full, and the problem has been exacerbated by
the increased number of volumes.
It seems Bacula is unable to handle this many slots: when it moves a volume back to the slot it came from in order to pull another for a different job,
it puts the tape (volume) back into a different slot, gets confused, and the Slot in the database is now wrong for that volume.
Does anyone have any experience with this? And is there a possible configuration fix?
My max concurrent jobs is 20 for the SD and max concurrent jobs per tape drive is 1. This is to keep jobs from interleaving data when they write concurrently.
I will be happy to submit configuration files if that is helpful.
Thanks
Volumes in red are the volumes Bacula is now out of sync with (the rows where the two volume names differ). Every one of these volumes is currently in a tape drive. I have edited the output below to make it easier to compare the mtx-changer command against the MySQL select query.
Select VolumeName,Slot from Media order by slot; | slot | output from mtx-changer --list all
AAAC236886 | 1 | AAAC236886
AAAACC5F69 | 2 | AAAACC5F69
AAAACE5F6B | 3 | AAAACE5F6B
AAAAD55F70 | 4 | AAAAD55F70
AAAB1763B2 | 5 | AAAB1763B2
AAAACA5F6F | 6 | AAAACA5F6F
AAAACD5F68 | 7 | AAAACD5F68
AAAB0B63AE | 8 | AAAB0B63AE
AAAB1463B1 | 9 | AAAB1463B1
AAAACF5F6A | 10 | AAAACF5F6A
AAAAC95F6C | 11 | AAAAC95F6C
AAAB0963AC | 12 | AAAB0963AC
AAAAC85F6D | 13 | AAAAC85F6D
AAAAD45F71 | 14 | AAAAD45F71
AAAC2D6888 | 15 | AAAC226887
AAACDA687F | 16 | AAAC216884
AAAB0863AD | 17 | AAAB0863AD
AAAC256880 | 18 | AAAC256880
AAAB0F63AA | 19 | AAAB0F63AA
AAAC226887 | 20 | AAADBB691E
AAAC206885 | 21 | AAAC206885
AAAB1563B0 | 22 | AAAB1563B0
AAAB1663B3 | 23 | AAAB1663B3
AAAB3B609E | 24 | AAACDA687F
AAAB0E63AB | 25 | AAAB0E63AB
AAADB06915 | 26 | AAADB06915
AAAB0A63AF | 27 | AAAB0A63AF
AAAC216884 | 28 | AAAB3B609E
AAAACB5F6E | 29 | AAAACB5F6E
AAAC246881 | 30 | AAAC246881
AAAC276882 | 31 | AAAC276882
AAAC266883 | 32 | AAAC2D6888
AAADB9691C | 33 | AAADB9691C
AAADBD6918 | 34 | AAADBD6918
AAADB8691D | 35 | AAADB8691D
AAADB36916 | 36 | AAADB36916
AAADBE691B | 37 | AAADBE691B
AAADBB691E | 38 | AAAC266883
AAADBF691A | 39 | AAADBF691A
AAADBC6919 | 40 | AAADBC6919
AAADB26917 | 41 | AAADB26917
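The comparison in the listing above can also be done mechanically. The following is a rough Python sketch, assuming the catalog query result and the changer listing have already been parsed into slot-to-volume dictionaries; the function name `find_out_of_sync` is made up for this example, and only four rows from the listing are included for brevity.

```python
def find_out_of_sync(catalog, changer):
    """Return (slot, catalog_volume, changer_volume) for every slot where they disagree.

    Both arguments map slot number -> volume name. A slot missing from one
    side shows up with None on that side.
    """
    mismatches = []
    for slot in sorted(set(catalog) | set(changer)):
        db_vol = catalog.get(slot)
        hw_vol = changer.get(slot)
        if db_vol != hw_vol:
            mismatches.append((slot, db_vol, hw_vol))
    return mismatches


# Sample rows copied from the listing above (slots 14, 15, 16 and 20).
catalog = {14: "AAAAD45F71", 15: "AAAC2D6888", 16: "AAACDA687F", 20: "AAAC226887"}
changer = {14: "AAAAD45F71", 15: "AAAC226887", 16: "AAAC216884", 20: "AAADBB691E"}

for slot, db_vol, hw_vol in find_out_of_sync(catalog, changer):
    print(f"slot {slot}: catalog has {db_vol}, changer reports {hw_vol}")
```

Run against the full listing, a check like this would flag slots 15, 16, 20, 24, 28, 32 and 38, the seven rows where the two columns differ.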
OK, I was able to get this working.
One of the major factors was upgrading my version of Bacula from 5.2.6 to 7.0.5.
It appears there is a bug in 5.2.6 in which Bacula gets confused about which tape (volume) is in which drive and/or slot.
With 5.2.6, even with my config files working, after backups ran for a few days the slots in Bacula no longer matched what mtx-changer listall
reported on the command line, and I was getting lots of errors and hanging backups.
After upgrading to 7.0.5 I no longer had this problem and my backups are running smoothly on 10 different drives using 3 different pools.
All is good for me here using a multi-drive configuration.
Thanks to Ana for extensive help to get this working.
Jared
Thanks for your reply.
My autochanger device definition in bacula-sd.conf is as follows. I removed the unrelated changer commands from the Drive resource directives as you suggested; maybe those directives were confusing Bacula. I will see what happens.
Autochanger {
Name = AYKAutochanger
Device = Drive-1
Device = Drive-2
Device = Drive-3
Device = Drive-4
Changer Command = "/usr/libexec/bacula/mtx-changer %c %o %S %a %d"
Changer Device = /dev/sg9
}
Best regards,
On 25-11-2014 18:25, Ana Emília M. Arruda wrote:
Hello,
Do you have an autochanger device definition in your bacula-sd.conf?
Autochanger {
Name = "AYKAutochanger"
Device = Drive-1, Drive-2, Drive-3, Drive-4
Changer Device = /dev/sg9
Changer Command = "/usr/libexec/bacula/mtx-changer %c %o %S %a %d"
}
In your Drive definitions, you don't need to have Changer Device or Changer Command defined. The lines below should be removed from your drive definitions:
Changer Command = "/usr/libexec/bacula/mtx-changer %c %o %S %a %d"
# Changer Device = /dev/changer
Changer Device = /dev/sg9
AutoChanger = yes
Best regards,
Ana