Well, to answer my own question, here's the algorithm (use The Source, Luke!):
CLUE1: During the backup, after the previous tape filled, my log file says, "There are no more Jobs associated with Volume "EPW681L3". Marking it purged."
That text comes from ua_purge.c, which runs the following SQL command in the is_volume_purged() function:
SELECT 1 FROM JobMedia WHERE MediaId=%s LIMIT 1
The MediaId comes from an argument to the command. So somehow Bacula is looking over the volumes. How? And somehow it skipped over the volume I thought it would choose, EPW680L3. Why?
According to autoprune.c, the aforementioned message string comes from prune_volumes() in auto_prune.c. But before I dig into that, I need to back up. prune_volumes() was called by find_next_volume_for_append(), which does this:
* 1. Look for volume with "Append" status.
...The query is:
SELECT MediaId,VolumeName,VolJobs,VolFiles,VolBlocks,"
"VolBytes,VolMounts,VolErrors,VolWrites,MaxVolBytes,VolCapacityBytes,"
"MediaType,VolStatus,PoolId,VolRetention,VolUseDuration,MaxVolJobs,"
"MaxVolFiles,Recycle,Slot,FirstWritten,LastWritten,InChanger,"
"EndFile,EndBlock,VolParts,LabelType,LabelDate,StorageId,"
"Enabled,LocationId,RecycleCount,InitialWrite,"
"ScratchPoolId,RecyclePoolId,VolReadTime,VolWriteTime,ActionOnPurge "
"FROM Media WHERE PoolId=%s AND MediaType='%s' AND Enabled=1 "
"AND VolStatus='%s' AND InChanger=1 AND StorageId=%s
with
if (strcmp(mr->VolStatus, "Recycle") == 0 ||
strcmp(mr->VolStatus, "Purged") == 0) {
order = "AND Recycle=1 ORDER BY LastWritten ASC,MediaId"; /* take oldest that can be recycled */
} else {
order = sql_media_order_most_recently_written[db_get_type_index(mdb)]; /* take most recently written */
}
Now here's what happens if it does not find a volume with "Append" status:
if (!ok) {
* No volume found, apply algorithm
Here is the argorithm:
* 2. Try finding a recycled volume
if (!ok) {
* 3. Try recycling any purged volume
if (!ok) {
* 4. Try pruning Volumes
if (!ok && create) {
* 5. Try pulling a volume from the Scratch pool
* If we are using an Autochanger and have not found
* a volume, retry looking for any volume.
if (!ok && InChanger) {
InChanger = false;
continue; /* retry again accepting any volume */
}
So- there is an algorithm! Quite elaborate, at that.
Now back to prune_volumes(). If it cannot find a volume, it will prune one. prune_volumes() says,
* Prune at least one Volume in current Pool. This is called from
* catreq.c => next_vol.c when the Storage daemon is asking for another
* volume and no appendable volumes are available.
prune_volumes first looks in the Scratch pool somehow. Mine is empty so I'll skip right over that part.
Next, it performs a query on the database:
/*
* Get the List of all media ids in the current Pool or whose
* RecyclePoolId is the current pool or the scratch pool
*/
const char *select = "SELECT DISTINCT MediaId,LastWritten FROM Media WHERE "
"(PoolId=%s OR RecyclePoolId IN (%s)) AND MediaType='%s' %s"
"ORDER BY LastWritten ASC,MediaId";
the PoolId comes from "mr->PoolId", where mr is of type MEDIA_DBR which is a class that is a reflection of a row in the Media table of the databes.
"ua->jcr" which I presume is a "Job Control Record". The PoolId's can be found in the Pool table. In my case, my Pool name is "Tape" so my PoolId is 5. I don't have a RecyclePoolId but that seems pretty straightforward. In any event, in my case it's probably the same as my Pool id: 5. MediaType comes from mr->MediaType which again comes straight from the bacula-dir.conf file; my MediaType is 'LTO-3'. The last string (the fourth %s) is I believe either empty or "AND InChanger=1 AND StorageId IN (%s)", where StorageId comes from the Storage table. For me, I have a Storage record that specifies a name of LTO-3 so StorageId found in the database is simply 3.
So my backup is trying to write to a tape in my 'Tape' Pool, on my 'LTO-3' Storage device. Thus the full query should be:
*select = "SELECT DISTINCT MediaId,LastWritten FROM Media WHERE "
"(PoolId=5 OR RecyclePoolId IN (5)) AND MediaType='LTO-3' "
"ORDER BY LastWritten ASC,MediaId";
At this point, it goes through each volume and prunes them. When it finds a volume that's then "purged", it will use it. My volume EPW680L3 is not purged, so it doesn't use it.
How is "purged" defined? This is found in is_volume_purged() in ua_purge.c:
- If FirstWritten or LastWritten are 0, the volume is not purged (I believe "0" is the same as "1970-01-01 00:00:00")
- If the Volume Status is "Purged" then the volume is purged.
- If there are no Jobs associated with the MediaId in the JobMedia database table, then the volume is purged.
In my case, there are Jobs associated with the MediaId in the JobMedia database table, but this JobId does not exist in the Job table. Therefore I believe something happened- maybe I purged this job in the past, in my ignorance- and thus the tape cannot be pruned and thus will not be purged, so Bacula won't use it. I need to investigate a little more.