Lately, I’ve been going through our file server looking for disk space to reclaim, and I’ve come across 14GB worth of data in the Postgres DB used only by Bacula. Reading through the Bacula manual, I see that each file record is supposed to take up 154 bytes in the DB, so I went through the logs to see how many records should be saved, keying on “FD Files Written”.
Our rotation:

    Level=Full Pool=Weekly  sun at 2:00
    Level=Differential Pool=Daily FullPool=Weekly  mon-fri at 2:00
File and Job Retention are set to 5 weeks in the Client directives, and Volume Retention is set to 5 weeks in the Pool directives. AutoPrune is set to Yes in both places. The exception is the Monthly pool, used by two (small) clients:
    Level=Full Pool=Monthly  1st sun at 1:00
    Level=Full Pool=Weekly  2nd-4th sun at 1:00
    Level=Differential Pool=Daily FullPool=Weekly  mon-fri at 2:00
For the Monthly pool, File Retention is set to 90 days and Job Retention to 1 year in the Client directives, and Volume Retention is set to 1 year in the Pool. AutoPrune is set to Yes in the Client directives, but had been set to No in the Pool, which was a red flag. I changed it to Yes and restarted Bacula.
The numbers: in the last 5 weeks, the sum of the files backed up to the Weekly and Daily pools is 3,588,224. Of the two Monthly clients, one backs up 2 files a month, and the other consistently just under 6,640. So my catalog should hold a total of 3,667,928 files (3,588,224 + (2 * 12) + (6,640 * 12)); at 154 bytes per file record, the DB should be 564,860,912 bytes, or about 538MB. That made me wonder either where my math went wrong, or why my DB is so big.
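As a sanity check on that arithmetic (a quick sketch; the 154-bytes-per-record figure is the manual’s estimate, and the two Monthly clients’ counts are extrapolated over a year):

```python
# Estimated catalog size from file counts, using the Bacula
# manual's figure of roughly 154 bytes per file record.
BYTES_PER_FILE_RECORD = 154

weekly_daily_files = 3_588_224   # files written to Weekly+Daily pools in the last 5 weeks
monthly_client_a = 2 * 12        # one client: 2 files/month, over a year
monthly_client_b = 6_640 * 12    # the other: just under 6,640 files/month, over a year

total_files = weekly_daily_files + monthly_client_a + monthly_client_b
expected_bytes = total_files * BYTES_PER_FILE_RECORD

print(f"{total_files:,} files")        # 3,667,928 files
print(f"{expected_bytes:,} bytes")     # 564,860,912 bytes
print(f"{expected_bytes // 2**20}MB")  # 538MB
```

Either way the expected size comes out around half a gigabyte, nowhere near 14GB.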
So, after running a full vacuum on the Postgres DB and reclaiming only 1GB of disk space, I started poking around in the DB itself (just looking). The most interesting thing I have found so far is a table named ‘job’, which seems to hold all the job records. And they go back to 2006….
Example:

    bacula=# select * from job where starttime like '2006%';
     jobid |              job              |   name    | type | level | clientid | jobstatus |      schedtime      |      starttime      |       endtime       |  jobtdate  | volsessionid | volsessiontime | jobfiles | jobbytes | joberrors | jobmissingfiles | poolid | filesetid | purgedfiles | hasbase
    -------+-------------------------------+-----------+------+-------+----------+-----------+---------------------+---------------------+---------------------+------------+--------------+----------------+----------+----------+-----------+-----------------+--------+-----------+-------------+---------
       272 | Mail_Data.2006-01-26_02.00.06 | Mail Data | B    | D     |          | f         | 2006-01-26 02:00:05 | 2006-01-26 02:08:49 | 2006-01-26 02:37:33 | 1138261053 |            0 |              0 |        0 |        0 |         0 |               0 |        |           |           0 |       0
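To get a feel for how many of these old rows are sitting around, the age spread can be counted directly in psql (a sketch against the job table above; since starttime is a timestamp, extracting the year is safer than a LIKE comparison):

    -- Count job rows per calendar year, oldest first, to see how
    -- far back the catalog really goes.
    SELECT date_part('year', starttime) AS year, count(*) AS jobs
    FROM job
    GROUP BY year
    ORDER BY year;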
I thought I would then be clever and, in Bacula, use the prune command on the old MailData job, one of the ones set to 5 weeks. But it didn’t show up in the list, because it was no longer a defined job resource; we don’t run that particular job anymore, and haven’t for months, so there should be no jobs for ‘MailData’ in the catalog at all. Luckily, it was just commented out in bacula-dir.conf, so I uncommented it and restarted Bacula. The prune command then showed MailData, and reported: “Pruned 15 Jobs for client MailData from catalog.” Encouraged, I exited Bacula and checked the DB. The records are all still there. Rinse and repeat. This time, Bacula says: “No Jobs found to prune.”, so maybe it did get rid of a few of them after all. But the rest are sitting there derelict, just hogging space.
So, why isn’t my DB pruning itself?
--Jeremy