I am having an issue with 2.4.2 and improper tape reservation behavior.
The autochanger is a 50 slot, two LTO drive library.
I had a long running job (job id 1898) using tape SL1110. Jobs with
the same schedule that ran immediately prior to the job in question
used tapes SL1110 and SL1104. Job id 1899, among several other jobs,
wanted SL1110 even though the tape was in use by job id 1898. It
SHOULD have reserved SL1104, because it was appendable, in the correct
Pool, and it was in the library. I tried responding using a mount of
SL1104 on Drive0 with no luck. Nothing would mount.
Unfortunately, I did not, however, make a copy of :
a) the output of the result of the mount request or
b) status of the storage daemon. E.g.
bconsole> status storage=SL500
Further below is a section of output from of the storage daemon with
debug level 100 with trace (bconsole> setdebug level=100 trace=1
storage=SL500).
Note: I enabled debug on the SD /after/ I got the operator request for
loading SL1110.
The long running job (1898) that was using SL1110 at the time, filled
up SL1110 and went on to SL1104 and SL1106. After it was done with
SL1110, the next job, 1900, continued without operator intervention,
by using SL1106. All three tapes (SL1110, SL1104, SL1106) are in the
same media pool, CorpWeekly.
Most of the time, things work as I'd expect: Two jobs run in parallel,
each using a different tape and drive, using APPENDable media from the
appropriate Pool. However, every so often this issue comes up. I'd
estimate, every other week. I upgraded from 2.2.8 to 2.4.2 on Tuesday
because I suspected that the storage daemon improvements for
multi-drive libraries would help with this issue. However, it doesn't
seem to have made a difference - or this this issue outright.
I've looked through Bacula Mantis, without seeing anything indicating
there is an existing bug on
2.4.2.
Can someone provide some ideas/suggestions about what is taking place
here? It not perfectly repeatable, so I can't really file a useful
bug.
Many thanks,
Robb
STORAGE DAEMON DEBUG LEVEL 100 TRACE OUTPUT
badger-sd: dircmd.c:231-0 Do command: use storage=
badger-sd: reserve.c:695-0 jid=1899 <dird: use storage=SL500
media_type=lto2 pool_name=CorpWeekly pool_type=Backup append=1 copy=0
stripe=0
badger-sd: reserve.c:724-0 jid=1899 <dird device: use device=SL500
badger-sd: reserve.c:741-0 jid=1899 Storage=SL500 media_type=lto2
pool=CorpWeekly pool_type=Backup append=1
badger-sd: reserve.c:743-0 jid=1899 Device=SL500
badger-sd: reserve.c:919-0 jid=1899 PrefMnt=0 exact=0 suitable=0 chgronly=1
badger-sd: reserve.c:1084-0 jid=1899 search res for SL500
badger-sd: reserve.c:1087-0 jid=1899 Try match changer res=SL500
badger-sd: reserve.c:1098-0 jid=1899 Try changer device Drive0
badger-sd: reserve.c:1161-0 jid=1899 chk MediaType device=lto2 request=lto2
badger-sd: reserve.c:1184-0 try reserve Drive0
badger-sd: reserve.c:1198-0 jid=1899 have_vol=0 vol=
badger-sd: reserve.c:1381-0 jid=1899 reserve_append device is "Drive0"
(/dev/nst1)
badger-sd: reserve.c:1459-0 jid=1899 PrefMnt=0 exact=0 suitable=1
chgronly=1 any=0
badger-sd: reserve.c:1430-0 MaxJobs=0 Jobs=0 reserves=0 Status= Vol=
badger-sd: reserve.c:1542-0 jid=1899 OK Res Unused autochanger
"Drive0" (/dev/nst1).
badger-sd: reserve.c:481-0 Inc reserve=0 dev="Drive0" (/dev/nst1)
badger-sd: reserve.c:1208-0 jid=1899 Reserved=1 dev_name=SL500
mediatype=lto2 pool=CorpWeekly ok=1
badger-sd: askdir.c:259-0 dir_find_next_appendable_
volume: reserved=1 Vol=
badger-sd: askdir.c:276-0 >dird CatReq
Job=lrsea-www001.2008-09-20_06.05.06 FindMedia=1 pool_name=CorpWeekly
media_type=lto2
badger-sd: askdir.c:182-0 <dird 1000 OK VolName=SL1110 VolJobs=10
VolFiles=264 VolBlocks=4031870 VolBytes=260104061952 VolMounts=6
VolErrors=0 VolWrites=9098970 MaxVolBytes=0 VolCapacityBytes=0
VolStatus=Append Slot=33 MaxVolJobs=0 MaxVolFiles=0 InChanger=1
VolReadTime=0 VolWriteTime=3828589585 EndFile=263 EndBlock=15499
VolParts=0 LabelType=0 MediaId=17
badger-sd: askdir.c:205-0 do_reqest_vol_info return true slot=33 Volume=SL1110
badger-sd: reserve.c:472-0 jid=1899 find_vol=SL1110 found=1
badger-sd: reserve.c:189-0 jid=1899 List find_volume: SL1104 in_use=1
on device "Drive1" (/dev/nst0)
badger-sd: reserve.c:189-0 jid=1899 List find_volume: SL1110 in_use=1
on device "Drive0" (/dev/nst1)
badger-sd: reserve.c:641-0 jid=1899 Vol=SL1110 on same dev.
badger-sd: askdir.c:285-0 Call reserve_volume. Vol=SL1110
badger-sd: reserve.c:316-0 jid=1899 enter reserve_volume=SL1110
drive="Drive0" (/dev/nst1)
badger-sd: reserve.c:189-0 jid=1899 List begin reserve_volume: SL1104
in_use=1 on device "Drive1" (/dev/nst0)
badger-sd: reserve.c:189-0 jid=1899 List begin reserve_volume: SL1110
in_use=1 on device "Drive0" (/dev/nst1)
badger-sd: reserve.c:331-0 jid=1899 Vol attached=SL1110, newvol=SL1110
volinuse=1 on "Drive0" (/dev/nst1)
badger-sd: reserve.c:339-0 jid=1899 === set reserved vol=SL1110
dev="Drive0" (/dev/nst1)
badger-sd: reserve.c:418-0 jid=1899 === set in_use. vol=SL1110
dev="Drive0" (/dev/nst1)
badger-sd: reserve.c:189-0 jid=1899 List end new volume: SL1104
in_use=1 on device "Drive1" (/dev/nst0)
badger-sd: reserve.c:189-0 jid=1899 List end new volume: SL1110
in_use=1 on device "Drive0" (/dev/nst1)
badger-sd: askdir.c:292-0 dir_find_next_appendable_volume return true.
vol=SL1110
badger-sd: reserve.c:1220-0 jid=1899 looking for Volume=SL1110
badger-sd: reserve.c:1279-0 jid=1899 >dird changer: 3000 OK use device
device=Drive0
badger-sd: reserve.c:1107-0 jid=1899 Device Drive0 reserved=1 for append.
badger-sd: reserve.c:1053-0 jid=1899 available device found=SL500
badger-sd: reserve.c:1068-0 OK dev found. Vol=SL1110
badger-sd: dircmd.c:217-0 <dird: run
STORAGE DAEMON CONFIG
<snip>
Autochanger {
Name = SL500
Device = Drive0
Device = Drive1
Changer Command = "/etc/bacula/scripts/mtx-changer %c %o %S %a %d"
Changer Device = /dev/sg0
}
Device {
Name = Drive0
Drive Index = 0
Media Type = lto2
Archive Device = /dev/nst1
AutomaticMount = yes;
AlwaysOpen = yes;
RemovableMedia = yes;
RandomAccess = no;
AutoChanger = yes
Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'"
}
Device {
Name = Drive1
Drive Index = 1
Media Type = lto2
Archive Device = /dev/nst0
AutomaticMount = yes;
AlwaysOpen = yes;
RemovableMedia = yes;
RandomAccess = no;
AutoChanger = yes
Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'"
}
<snip>
DIRECTOR CONFIG
<snip>
Storage {
Name = SL500
Address =
backup.corp.uievolution.com
SDPort = 9103
Password = "xxx"
Device = SL500
Autochanger = yes
Media Type = lto2
Maximum Concurrent Jobs = 2
}
<snip>
OPERATOR EMAIL
20-Sep 10:51 badger-sd JobId 1896: Please mount Volume "SL1110" or
label a new one for:
- Hide quoted text -
Job: cwseapfs001.2008-09-20_06.05.
03
Storage: "Drive0" (/dev/nst1)
Pool: CorpWeekly
Media type: lto2