[Bacula-users] Disk based backup using vchanger, volumes being marked as Error

2014-08-01 12:56:12
Subject: [Bacula-users] Disk based backup using vchanger, volumes being marked as Error
From: Joseph Dickson <jdickson AT evolvetsi DOT com>
To: Bacula-users AT lists.sourceforge DOT net
Date: Fri, 1 Aug 2014 12:27:15 -0400
Greetings :-)

I ran into this problem with Bacula in a previous installation, and I can't recall whether there was ever a resolution. I'm using Bacula for disk-based backups only, and I am using vchanger to manage my virtual library.

I've configured a vchanger library with 100 slots and 8 drives, and have set Maximum Volume Bytes = 100G on the pool definition that I am using, to limit the volume in each slot of the library to 100G. I have also set Maximum Concurrent Jobs = 2 on each of the virtual tape drive devices in my storage daemon config, so that at most two jobs can write to a device at a time, to minimize interleaving.
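In outline, the two settings described above would look roughly like the following. This is only a sketch, not a copy of my actual files: the pool name and the Media Type value are placeholders, and the device name and path are taken from the log output further down. Only one of the eight drive devices is shown.

```
# bacula-dir.conf (sketch; pool name is a placeholder)
Pool {
  Name = chg1-pool
  Pool Type = Backup
  Maximum Volume Bytes = 100G      # caps each virtual volume at 100G
}

# bacula-sd.conf (sketch; one of the 8 virtual drives)
Device {
  Name = chg1-drive-1
  Device Type = File
  Media Type = File                # placeholder value
  Archive Device = /var/lib/bacula/chg1/1/drive1
  Autochanger = yes
  Drive Index = 1
  Maximum Concurrent Jobs = 2      # at most two jobs write to this drive at once
}
```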

Everything works perfectly as long as I only kick off a few jobs at a time. However, when my main backup windows run and 30 or 40 backup jobs start together, I often end up with jobs that output the following sequence in the logs:

31-Jul 21:00 bacula1-dir JobId 692: Start Backup JobId 692, Job=job-evolvereports-main.2014-07-31_21.00.00_48
31-Jul 21:00 bacula1-dir JobId 692: Using Device "chg1-drive-1" to write.
31-Jul 21:00 evolvereports-fd JobId 692: DIR and FD clocks differ by 50 seconds, FD automatically compensating.
31-Jul 21:05 bacula1-sd JobId 692: 3307 Issuing autochanger "unload slot 74, drive 1" command.
31-Jul 21:06 bacula1-sd JobId 692: Warning: Volume "chg1_0001_0066" wanted on "chg1-drive-1" (/var/lib/bacula/chg1/1/drive1) is in use by device "chg1-drive-3" (/var/lib/bacula/chg1/3/drive3)
31-Jul 21:06 bacula1-sd JobId 692: Warning: Volume "chg1_0001_0066" not on file device "chg1-drive-1" (/var/lib/bacula/chg1/1/drive1).
31-Jul 21:06 bacula1-sd JobId 692: Marking Volume "chg1_0001_0066" in Error in Catalog.
31-Jul 21:06 bacula1-sd JobId 692: Warning: Volume "chg1_0001_0066" not on file device "chg1-drive-1" (/var/lib/bacula/chg1/1/drive1).
31-Jul 21:06 bacula1-sd JobId 692: Marking Volume "chg1_0001_0066" in Error in Catalog.
31-Jul 21:06 bacula1-sd JobId 692: Warning: mount.c:212 Open of file device "chg1-drive-1" (/var/lib/bacula/chg1/1/drive1) Volume"chg1_0001_0066" failed: ERR=file_dev.c:172 Could not open(/var/lib/bacula/chg1/1/drive1,OPEN_READ_WRITE,0640): ERR=No such file or directory

31-Jul 21:06 bacula1-sd JobId 692: 3307 Issuing autochanger "unload slot 71, drive 2" command.
31-Jul 21:06 bacula1-sd JobId 692: 3304 Issuing autochanger "load slot 71, drive 1" command.
31-Jul 21:06 bacula1-sd JobId 692: 3305 Autochanger "load slot 71, drive 1", status is OK.
31-Jul 21:06 bacula1-sd JobId 692: Volume "chg1_0001_0071" previously written, moving to end of data.
31-Jul 21:06 bacula1-sd JobId 692: Ready to append to end of Volume "chg1_0001_0071" size=8,003,988,010

This ends up marking a perfectly usable volume (chg1_0001_0066 above, which was simply in use by chg1-drive-3 at the time) as Error in the catalog. Is this something that everyone runs into? Is there any fix? As I recall from looking into it a few years back, the issue was the order and timing of volume and device selection, but it's definitely been a while.

My bacula-sd.conf file is here:

Any guidance would be appreciated!

Thanks,

Joe