Bacula-users

Re: [Bacula-users] bacula jobs use volumes from the wrong pool - bug?

2012-07-05 14:01:30
Subject: Re: [Bacula-users] bacula jobs use volumes from the wrong pool - bug?
From: Stephen Thompson <stephen AT seismo.berkeley DOT edu>
To: bacula-users AT lists.sourceforge DOT net
Date: Thu, 05 Jul 2012 10:59:22 -0700
On 07/05/2012 10:46 AM, Stephen Thompson wrote:
>
>
> Hello,
>
> Running 5.2.9, though I believe we've seen this sporadically in earlier
> versions.  Jobs are using volumes that are in pools to which they have
> not been assigned.
>
> This is likely a bug as I don't see anything peculiar about our
> configuration.  We are using a tape library with 2 drives, both set to
> autoselect. The library contains volumes that are properly assigned
> (i.e. database entries for volumes look fine) to various pools,
> including a Full pool and an Incremental pool.
>
> Twice in the past week, Full jobs which specify the use of the Full pool
> using jobdefs are using volumes from the Incremental pool.  I haven't
> narrowed down all the details, but I believe it's if the Incremental
> volume is already loaded in a drive when the Full job in question is
> launched.
>
>
> Example:
>
> 01-Jul 22:00 <DIR> JobId 244098: Fatal error: JobId 243957 already
> running. Duplicate job not allowed.
> 01-Jul 20:03 <DIR> JobId 243957: Start Backup JobId 243957,
> Job=<JOB>.2012-07-01_20.00.04_03
> 01-Jul 20:04 <DIR> JobId 243957: Using Device "SL500-Drive-0"
> 01-Jul 20:42 <SD> JobId 243957: Spooling data ...
> 02-Jul 14:01 <SD> JobId 243957: Job write elapsed time = 17:19:00,
> Transfer rate = 33.39 M Bytes/second
> 02-Jul 14:01 <SD> JobId 243957: Committing spooled data to Volume
> "FB0161". Despooling 2,082,572,994,002 bytes ...
> 02-Jul 15:38 <SD> JobId 243957: End of Volume "IM0094" at 462:1559 on
> device "SL500-Drive-0" (/dev/SL500-Drive-0). Write of 262144 bytes got -1.
> 02-Jul 15:38 <SD> JobId 243957: Re-read of last block succeeded.
> 02-Jul 15:38 <SD> JobId 243957: End of medium on Volume "IM0094"
> Bytes=924,097,352,704 Blocks=3,525,155 at 02-Jul-2012 15:38.
> 02-Jul 15:38 <SD> JobId 243957: 3307 Issuing autochanger "unload slot
> 148, drive 0" command.
> 02-Jul 15:39 <SD> JobId 243957: 3301 Issuing autochanger "loaded? drive
> 0" command.
> 02-Jul 15:39 <SD> JobId 243957: 3302 Autochanger "loaded? drive 0",
> result: nothing loaded.
> 02-Jul 15:39 <SD> JobId 243957: 3301 Issuing autochanger "loaded? drive
> 0" command.
> 02-Jul 15:39 <SD> JobId 243957: 3302 Autochanger "loaded? drive 0",
> result: nothing loaded.
> 02-Jul 15:39 <SD> JobId 243957: 3301 Issuing autochanger "loaded? drive
> 1" command.
> 02-Jul 15:39 <SD> JobId 243957: 3302 Autochanger "loaded? drive 1",
> result: nothing loaded.
> 02-Jul 15:39 <SD> JobId 243957: 3304 Issuing autochanger "load slot 6,
> drive 0" command.
> 02-Jul 15:40 <SD> JobId 243957: 3305 Autochanger "load slot 6, drive 0",
> status is OK.
> 02-Jul 15:40 <SD> JobId 243957: Volume "FB0161" previously written,
> moving to end of data.
> 02-Jul 15:40 <SD> JobId 243957: Ready to append to end of Volume
> "FB0161" at file=7.
> 02-Jul 15:40 <SD> JobId 243957: New volume "FB0161" mounted on device
> "SL500-Drive-0" (/dev/SL500-Drive-0) at 02-Jul-2012 15:40.
> 02-Jul 18:19 <SD> JobId 243957: End of Volume "FB0161" at 274:5411 on
> device "SL500-Drive-0" (/dev/SL500-Drive-0). Write of 262144 bytes got -1.
> 02-Jul 18:19 <SD> JobId 243957: Re-read of last block succeeded.
> 02-Jul 18:19 <SD> JobId 243957: End of medium on Volume "FB0161"
> Bytes=535,390,861,312 Blocks=2,042,367 at 02-Jul-2012 18:19.
> 02-Jul 18:19 <SD> JobId 243957: 3307 Issuing autochanger "unload slot 6,
> drive 0" command.
> 02-Jul 18:20 <SD> JobId 243957: 3301 Issuing autochanger "loaded? drive
> 0" command.
> 02-Jul 18:20 <SD> JobId 243957: 3302 Autochanger "loaded? drive 0",
> result: nothing loaded.
> 02-Jul 18:20 <SD> JobId 243957: 3301 Issuing autochanger "loaded? drive
> 0" command.
> 02-Jul 18:20 <SD> JobId 243957: 3302 Autochanger "loaded? drive 0",
> result: nothing loaded.
> 02-Jul 18:20 <SD> JobId 243957: 3301 Issuing autochanger "loaded? drive
> 1" command.
> 02-Jul 18:20 <SD> JobId 243957: 3302 Autochanger "loaded? drive 1",
> result: nothing loaded.
> 02-Jul 18:20 <SD> JobId 243957: 3304 Issuing autochanger "load slot 4,
> drive 0" command.
> 02-Jul 18:21 <SD> JobId 243957: 3305 Autochanger "load slot 4, drive 0",
> status is OK.
> 02-Jul 18:21 <SD> JobId 243957: Volume "FB0158" previously written,
> moving to end of data.
> 02-Jul 18:21 <SD> JobId 243957: Ready to append to end of Volume
> "FB0158" at file=1.
> 02-Jul 18:21 <SD> JobId 243957: New volume "FB0158" mounted on device
> "SL500-Drive-0" (/dev/SL500-Drive-0) at 02-Jul-2012 18:21.
> 02-Jul 21:14 <SD> JobId 243957: End of Volume "FB0158" at 274:1785 on
> device "SL500-Drive-0" (/dev/SL500-Drive-0). Write of 262144 bytes got -1.
> 02-Jul 21:14 <SD> JobId 243957: Re-read of last block succeeded.
> 02-Jul 21:14 <SD> JobId 243957: End of medium on Volume "FB0158"
> Bytes=546,439,698,432 Blocks=2,084,507 at 02-Jul-2012 21:14.

> 02-Jul 21:14 <SD> JobId 243957: 3307 Issui02-Jul 22:00 <DIR> JobId
> 244271: Fatal error: JobId 244133 already running. Duplicate job not
> allowed.

This line is btw as it is in the log file.  Looks a bit mangled, like 
two lines combined in one line "02-Jul 21:14 <SD> JobId 243957: 3307 
Issui" and "02-Jul 22:00 <DIR> JobId 244271: Fatal error: JobId 244133 
already running. Duplicate job not allowed."


> 02-Jul 22:00 <DIR> JobId 244268: Fatal error: JobId 243957 already
> running. Duplicate job not allowed.
> 02-Jul 21:16 <DIR> JobId 243957: Recycled volume "FB0162"
> 02-Jul 21:16 <SD> JobId 243957: 3301 Issuing autochanger "loaded? drive
> 0" command.
> 02-Jul 21:16 <SD> JobId 243957: 3302 Autochanger "loaded? drive 0",
> result: nothing loaded.
> 02-Jul 21:16 <SD> JobId 243957: 3301 Issuing autochanger "loaded? drive
> 0" command.
> 02-Jul 21:16 <SD> JobId 243957: 3302 Autochanger "loaded? drive 0",
> result: nothing loaded.
> 02-Jul 21:16 <SD> JobId 243957: 3304 Issuing autochanger "load slot 7,
> drive 0" command.
> 02-Jul 21:16 <SD> JobId 243957: 3305 Autochanger "load slot 7, drive 0",
> status is OK.
> 02-Jul 21:16 <SD> JobId 243957: Recycled volume "FB0162" on device
> "SL500-Drive-0" (/dev/SL500-Drive-0), all previous data lost.
> 02-Jul 21:16 <SD> JobId 243957: New volume "FB0162" mounted on device
> "SL500-Drive-0" (/dev/SL500-Drive-0) at 02-Jul-2012 21:16.
> 03-Jul 00:36 <SD> JobId 243957: End of Volume "FB0162" at 295:4268 on
> device "SL500-Drive-0" (/dev/SL500-Drive-0). Write of 262144 bytes got -1.
> 03-Jul 00:37 <SD> JobId 243957: Re-read of last block succeeded.
> 03-Jul 00:37 <SD> JobId 243957: End of medium on Volume "FB0162"
> Bytes=591,088,321,536 Blocks=2,254,823 at 03-Jul-2012 00:37.
> 03-Jul 00:37 <SD> JobId 243957: 3307 Issuing autochanger "unload slot 7,
> drive 0" command.
> 03-Jul 00:38 <DIR> JobId 243957: Recycled volume "FB0164"
> 03-Jul 00:38 <SD> JobId 243957: 3301 Issuing autochanger "loaded? drive
> 0" command.
> 03-Jul 00:38 <SD> JobId 243957: 3302 Autochanger "loaded? drive 0",
> result: nothing loaded.
> 03-Jul 00:38 <SD> JobId 243957: 3301 Issuing autochanger "loaded? drive
> 0" command.
> 03-Jul 00:38 <SD> JobId 243957: 3302 Autochanger "loaded? drive 0",
> result: nothing loaded.
> 03-Jul 00:38 <SD> JobId 243957: 3304 Issuing autochanger "load slot 9,
> drive 0" command.
> 03-Jul 00:39 <SD> JobId 243957: 3305 Autochanger "load slot 9, drive 0",
> status is OK.
> 03-Jul 00:39 <SD> JobId 243957: Recycled volume "FB0164" on device
> "SL500-Drive-0" (/dev/SL500-Drive-0), all previous data lost.
> 03-Jul 00:39 <SD> JobId 243957: New volume "FB0164" mounted on device
> "SL500-Drive-0" (/dev/SL500-Drive-0) at 03-Jul-2012 00:39.
> 03-Jul 01:08 <SD> JobId 243957: Despooling elapsed time = 10:58:19,
> Transfer rate = 52.72 M Bytes/second
> 03-Jul 01:08 <SD> JobId 243957: Sending spooled attrs to the Director.
> Despooling 2,500,634,015 bytes ...
> 03-Jul 01:31 <DIR> JobId 243957: Bacula <DIR> 5.2.9 (11Jun12):
>
>     Build OS:               x86_64-unknown-linux-gnu redhat Enterprise
> release
>     JobId:                  243957
>     Job:                    <JOB>.2012-07-01_20.00.04_03
>     Backup Level:           Full
>     Client:                 "<FD>" 5.2.6 (21Feb12)
> i386-pc-solaris2.10,solaris,5.10
>     FileSet:                "<FS>" 2012-02-01 20:00:23
>     Pool:                   "Full-Pool" (From Job resource)
>     Catalog:                "MyCatalog" (From Client resource)
>     Storage:                "SL500-changer" (From Job resource)
>     Scheduled time:         01-Jul-2012 20:00:04
>     Start time:             01-Jul-2012 20:04:02
>     End time:               03-Jul-2012 01:31:35
>     Elapsed time:           1 day 5 hours 27 mins 33 secs
>     Priority:               10
>     FD Files Written:       6,915,330
>     SD Files Written:       6,915,330
>     FD Bytes Written:       2,080,113,652,089 (2.080 TB)
>     SD Bytes Written:       2,081,613,855,190 (2.081 TB)
>     Rate:                   19613.9 KB/s
>     Software Compression:   None
>     VSS:                    no
>     Encryption:             no
>     Accurate:               yes
>     Volume name(s):         IM0094|FB0161|FB0158|FB0162|FB0164
>     Volume Session Id:      147
>     Volume Session Time:    1340291913
>     Last Volume Bytes:      96,966,779,904 (96.96 GB)
>     Non-fatal FD errors:    0
>     SD Errors:              0
>     FD termination status:  OK
>     SD termination status:  OK
>     Termination:            Backup OK
>
> 03-Jul 01:31 <DIR> JobId 243957: Begin pruning Jobs older than 10 years .
> 03-Jul 01:31 <DIR> JobId 243957: No Jobs found to prune.
> 03-Jul 01:31 <DIR> JobId 243957: Begin pruning Files.
> 03-Jul 01:31 <DIR> JobId 243957: No Files found to prune.
> 03-Jul 01:31 <DIR> JobId 243957: End auto prune
>
>
> Note: FB0161|FB0158|FB0162|FB0164 are all in Full-Pool, whereas the
> first tape used in the job, IM0094 is in Incremental-Pool.
>
>
> Anyone have ideas why it would be using a volume from a pool to which
> the job has not been associated?
>
> thanks,
> Stephen
>


-- 
Stephen Thompson               Berkeley Seismological Laboratory
stephen AT seismo.berkeley DOT edu    215 McCone Hall # 4760
404.538.7077 (phone)           University of California, Berkeley
510.643.5811 (fax)             Berkeley, CA 94720-4760



------------------------------------------------------------------------------
Live Security Virtual Conference
Exclusive live event will cover all the ways today's security and 
threat landscape has changed and how IT managers can respond. Discussions 
will include endpoint security, mobile security and the latest in malware 
threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users