Bacula-users

Re: [Bacula-users] two drive autochanger and conflicting tape use

From: "Robb Wagoner" <robb AT dcexp DOT net>
To: bacula-users AT lists.sourceforge DOT net
Date: Fri, 3 Oct 2008 09:30:15 -0700
I haven't seen any responses to my email from last Friday. Is anyone else running 2.4.x with a multi-drive library seeing Bacula issue a mount request for a tape that is already in use in another drive, or for a tape that is already mounted in the correct drive?

Thank you,
Robb


On Fri, Sep 26, 2008 at 9:55 PM, Robb Wagoner <robb AT dcexp DOT net> wrote:
I am having an issue with 2.4.2: improper tape reservation behavior.

The autochanger is a 50-slot library with two LTO drives.

I had a long-running job (job id 1898) using tape SL1110. Jobs with
the same schedule that ran immediately prior to the job in question
used tapes SL1110 and SL1104. Job id 1899, among several other jobs,
wanted SL1110 even though the tape was in use by job id 1898. It
SHOULD have reserved SL1104, which was appendable, in the correct
Pool, and in the library. I tried responding with a mount of SL1104
on Drive0, with no luck. Nothing would mount.
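
For reference, the mount command I tried was of this form (a sketch
only; the slot number is illustrative, since I no longer recall which
slot SL1104 occupied):

   bconsole> mount storage=SL500 slot=27 drive=0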

Unfortunately, I did not make a copy of:

a) the output of the mount request, or

b) the status of the storage daemon, e.g.

   bconsole> status storage=SL500


Further below is a section of output from the storage daemon at
debug level 100 with trace enabled (bconsole> setdebug level=100
trace=1 storage=SL500).
Note: I enabled debug on the SD /after/ I got the operator request for
loading SL1110.
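
For completeness, the capture procedure was roughly the following (my
understanding is that with trace=1 the SD appends its debug output to
a trace file, e.g. badger-sd.trace, in its working directory):

   bconsole> setdebug level=100 trace=1 storage=SL500
   (reproduce the reservation / mount request)
   bconsole> setdebug level=0 trace=0 storage=SL500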

The long-running job (1898) that was using SL1110 at the time filled
up SL1110 and went on to SL1104 and SL1106. After it was done with
SL1110, the next job, 1900, continued without operator intervention
by using SL1106. All three tapes (SL1110, SL1104, SL1106) are in the
same media pool, CorpWeekly.

Most of the time, things work as I'd expect: two jobs run in parallel,
each using a different tape and drive, with APPENDable media from the
appropriate Pool. However, every so often this issue comes up; I'd
estimate every other week. I upgraded from 2.2.8 to 2.4.2 on Tuesday
because I suspected that the storage daemon improvements for
multi-drive libraries would help with this issue. However, it doesn't
seem to have made a difference - or to have fixed this issue outright.
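
One directive that looks related: if I'm reading the trace right, the
PrefMnt=0 field appears to correspond to the Job-level Prefer Mounted
Volumes setting, which controls whether the SD waits for an
already-mounted volume or grabs an unused drive. A sketch of where it
lives (names here are illustrative, not my actual Job resource):

   Job {
     Name     = "cwseapfs001"     # illustrative
     Type     = Backup
     Client   = cwseapfs001-fd    # hypothetical client name
     FileSet  = "Full Set"        # hypothetical
     Storage  = SL500
     Pool     = CorpWeekly
     Messages = Standard
     # 'no' tells the SD to prefer an unused drive over waiting for a
     # volume that is already mounted in another drive (default: yes)
     Prefer Mounted Volumes = no
   }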

I've looked through Bacula Mantis without finding an existing bug
report against 2.4.2.

Can someone provide some ideas/suggestions about what is taking place
here? It's not perfectly repeatable, so I can't really file a useful
bug.

Many thanks,
Robb



STORAGE DAEMON DEBUG LEVEL 100 TRACE OUTPUT

badger-sd: dircmd.c:231-0 Do command: use storage=
badger-sd: reserve.c:695-0 jid=1899 <dird: use storage=SL500
media_type=lto2 pool_name=CorpWeekly pool_type=Backup append=1 copy=0
stripe=0
badger-sd: reserve.c:724-0 jid=1899 <dird device: use device=SL500
badger-sd: reserve.c:741-0 jid=1899 Storage=SL500 media_type=lto2
pool=CorpWeekly pool_type=Backup append=1
badger-sd: reserve.c:743-0 jid=1899     Device=SL500
badger-sd: reserve.c:919-0 jid=1899 PrefMnt=0 exact=0 suitable=0 chgronly=1
badger-sd: reserve.c:1084-0 jid=1899 search res for SL500
badger-sd: reserve.c:1087-0 jid=1899 Try match changer res=SL500
badger-sd: reserve.c:1098-0 jid=1899 Try changer device Drive0
badger-sd: reserve.c:1161-0 jid=1899 chk MediaType device=lto2 request=lto2
badger-sd: reserve.c:1184-0 try reserve Drive0
badger-sd: reserve.c:1198-0 jid=1899 have_vol=0 vol=
badger-sd: reserve.c:1381-0 jid=1899 reserve_append device is "Drive0"
(/dev/nst1)
badger-sd: reserve.c:1459-0 jid=1899 PrefMnt=0 exact=0 suitable=1
chgronly=1 any=0
badger-sd: reserve.c:1430-0 MaxJobs=0 Jobs=0 reserves=0 Status= Vol=
badger-sd: reserve.c:1542-0 jid=1899 OK Res Unused autochanger
"Drive0" (/dev/nst1).
badger-sd: reserve.c:481-0 Inc reserve=0 dev="Drive0" (/dev/nst1)
badger-sd: reserve.c:1208-0 jid=1899 Reserved=1 dev_name=SL500
mediatype=lto2 pool=CorpWeekly ok=1
badger-sd: askdir.c:259-0 dir_find_next_appendable_volume: reserved=1 Vol=
badger-sd: askdir.c:276-0 >dird CatReq
Job=lrsea-www001.2008-09-20_06.05.06 FindMedia=1 pool_name=CorpWeekly
media_type=lto2
badger-sd: askdir.c:182-0 <dird 1000 OK VolName=SL1110 VolJobs=10
VolFiles=264 VolBlocks=4031870 VolBytes=260104061952 VolMounts=6
VolErrors=0 VolWrites=9098970 MaxVolBytes=0 VolCapacityBytes=0
VolStatus=Append Slot=33 MaxVolJobs=0 MaxVolFiles=0 InChanger=1
VolReadTime=0 VolWriteTime=3828589585 EndFile=263 EndBlock=15499
VolParts=0 LabelType=0 MediaId=17
badger-sd: askdir.c:205-0 do_reqest_vol_info return true slot=33 Volume=SL1110
badger-sd: reserve.c:472-0 jid=1899 find_vol=SL1110 found=1
badger-sd: reserve.c:189-0 jid=1899 List find_volume: SL1104 in_use=1
on device "Drive1" (/dev/nst0)
badger-sd: reserve.c:189-0 jid=1899 List find_volume: SL1110 in_use=1
on device "Drive0" (/dev/nst1)
badger-sd: reserve.c:641-0 jid=1899 Vol=SL1110 on same dev.
badger-sd: askdir.c:285-0 Call reserve_volume. Vol=SL1110
badger-sd: reserve.c:316-0 jid=1899 enter reserve_volume=SL1110
drive="Drive0" (/dev/nst1)
badger-sd: reserve.c:189-0 jid=1899 List begin reserve_volume: SL1104
in_use=1 on device "Drive1" (/dev/nst0)
badger-sd: reserve.c:189-0 jid=1899 List begin reserve_volume: SL1110
in_use=1 on device "Drive0" (/dev/nst1)
badger-sd: reserve.c:331-0 jid=1899 Vol attached=SL1110, newvol=SL1110
volinuse=1 on "Drive0" (/dev/nst1)
badger-sd: reserve.c:339-0 jid=1899 === set reserved vol=SL1110
dev="Drive0" (/dev/nst1)
badger-sd: reserve.c:418-0 jid=1899 === set in_use. vol=SL1110
dev="Drive0" (/dev/nst1)
badger-sd: reserve.c:189-0 jid=1899 List end new volume: SL1104
in_use=1 on device "Drive1" (/dev/nst0)
badger-sd: reserve.c:189-0 jid=1899 List end new volume: SL1110
in_use=1 on device "Drive0" (/dev/nst1)
badger-sd: askdir.c:292-0 dir_find_next_appendable_volume return true.
vol=SL1110
badger-sd: reserve.c:1220-0 jid=1899 looking for Volume=SL1110
badger-sd: reserve.c:1279-0 jid=1899 >dird changer: 3000 OK use device
device=Drive0
badger-sd: reserve.c:1107-0 jid=1899 Device Drive0 reserved=1 for append.
badger-sd: reserve.c:1053-0 jid=1899 available device found=SL500
badger-sd: reserve.c:1068-0 OK dev found. Vol=SL1110
badger-sd: dircmd.c:217-0 <dird: run


STORAGE DAEMON CONFIG
<snip>

Autochanger {
 Name            = SL500
 Device          = Drive0
 Device          = Drive1
 Changer Command = "/etc/bacula/scripts/mtx-changer %c %o %S %a %d"
 Changer Device  =  /dev/sg0
}

Device {
 Name           = Drive0
 Drive Index    = 0
 Media Type     = lto2
 Archive Device = /dev/nst1
 AutomaticMount = yes;
 AlwaysOpen     = yes;
 RemovableMedia = yes;
 RandomAccess   = no;
 AutoChanger    = yes
 Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'"
}

Device {
 Name           = Drive1
 Drive Index    = 1
 Media Type     = lto2
 Archive Device = /dev/nst0
 AutomaticMount = yes;
 AlwaysOpen     = yes;
 RemovableMedia = yes;
 RandomAccess   = no;
 AutoChanger    = yes
 Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'"
}
<snip>



DIRECTOR CONFIG
<snip>

Storage {
 Name        = SL500
 Address     = backup.corp.uievolution.com
 SDPort      = 9103
 Password    = "xxx"
 Device      = SL500
 Autochanger = yes
 Media Type  = lto2
 Maximum Concurrent Jobs = 2
}

<snip>
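
As a side note on concurrency: my understanding is that Maximum
Concurrent Jobs can be set in several resources (Director, Job,
Client, and Storage), and the effective limit is the smallest
applicable value; the Storage-level value of 2 above is what allows
one job per drive. A sketch (fragments only; names and values
illustrative, not from my config):

   Director {
     Name = badger-dir              # hypothetical
     Maximum Concurrent Jobs = 10   # global ceiling across all jobs
   }

   Storage {
     Name = SL500
     Maximum Concurrent Jobs = 2    # one job per drive in the library
   }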



OPERATOR EMAIL
20-Sep 10:51 badger-sd JobId 1896: Please mount Volume "SL1110" or
label a new one for:
 Job:          cwseapfs001.2008-09-20_06.05.03
 Storage:      "Drive0" (/dev/nst1)
 Pool:         CorpWeekly
 Media type:   lto2



--
Robb Wagoner
Datacenter Experts
Seattle, WA
(206) 898-8342