Lately, I’ve been going through our file server looking for disk space to reclaim, and I’ve come across 14GB worth of data in the Postgres DB used only by Bacula. Reading through the Bacula manual, I see that each file record is supposed to take up 154 bytes in the DB, so I went through the logs to see how many records should be saved, keying on “FD Files Written”.
Our rotation:

    Level=Full Pool=Weekly  sun at 2:00
    Level=Differential Pool=Daily FullPool=Weekly  mon-fri at 2:00
File and Job Retention are set to 5 weeks in the Client directives, and Volume Retention is set to 5 weeks in the Pool directives. AutoPrune is set to Yes in both places. The exception is the Monthly pool, used by two (small) clients:
    Level=Full Pool=Monthly  1st sun at 1:00
    Level=Full Pool=Weekly  2nd-4th sun at 1:00
    Level=Differential Pool=Daily FullPool=Weekly  mon-fri at 2:00
For the Monthly pool, File Retention is set to 90 days and Job Retention to 1 year in the Client directives, and Volume Retention is set to 1 year in the Pool. AutoPrune is set to Yes in the Client directives, but had been set to No in the Pool, which was a red flag. I changed it to Yes and restarted Bacula.
The numbers: in the last 5 weeks, the sum of the files backed up to the Weekly and Daily pools is 3,588,224. Of the two Monthly clients, one backs up 2 files a month, and the other consistently just under 6,640. So my catalog should hold a total of 3,667,928 files (3,588,224 + (2 * 12) + (6,640 * 12)); at 154 bytes per file record, the DB should be 564,860,912 bytes, or about 538MB. That made me wonder either where my math went wrong, or why my DB is so big.
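As a sanity check on that arithmetic (a quick sketch; the 154-bytes-per-record figure is the manual’s estimate, and the two Monthly clients’ counts are extrapolated over a year):

```python
# Estimated catalog size from file counts, using the Bacula
# manual's figure of roughly 154 bytes per file record.
BYTES_PER_FILE_RECORD = 154

weekly_daily_files = 3_588_224   # files written to Weekly+Daily pools in the last 5 weeks
monthly_client_a = 2 * 12        # one client: 2 files/month, over a year
monthly_client_b = 6_640 * 12    # the other: just under 6,640 files/month, over a year

total_files = weekly_daily_files + monthly_client_a + monthly_client_b
expected_bytes = total_files * BYTES_PER_FILE_RECORD

print(f"{total_files:,} files")        # 3,667,928 files
print(f"{expected_bytes:,} bytes")     # 564,860,912 bytes
print(f"{expected_bytes // 2**20}MB")  # 538MB
```

Either way the expected size comes out around half a gigabyte, nowhere near 14GB.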
So, after running a full vacuum on the Postgres DB and reclaiming only 1GB of disk space, I started poking around in the DB itself (just looking). The most interesting thing I have found so far is a table named ‘job’, which seems to hold all the job records. And they go back to 2006….
Example:

    bacula=# select * from job where starttime like '2006%';
     jobid |              job              |   name    | type | level | clientid | jobstatus |      schedtime      |      starttime      |       endtime       |  jobtdate  | volsessionid | volsessiontime | jobfiles | jobbytes | joberrors | jobmissingfiles | poolid | filesetid | purgedfiles | hasbase
    -------+-------------------------------+-----------+------+-------+----------+-----------+---------------------+---------------------+---------------------+------------+--------------+----------------+----------+----------+-----------+-----------------+--------+-----------+-------------+---------
       272 | Mail_Data.2006-01-26_02.00.06 | Mail Data | B    | D     |          | f         | 2006-01-26 02:00:05 | 2006-01-26 02:08:49 | 2006-01-26 02:37:33 | 1138261053 |            0 |              0 |        0 |        0 |         0 |               0 |        |           |           0 |       0
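To get a feel for how many of these old rows are sitting around, the age spread can be counted directly in psql (a sketch against the job table above; since starttime is a timestamp, extracting the year is safer than a LIKE comparison):

    -- Count job rows per calendar year, oldest first, to see how
    -- far back the catalog really goes.
    SELECT date_part('year', starttime) AS year, count(*) AS jobs
    FROM job
    GROUP BY year
    ORDER BY year;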
I thought I would then be clever and, in Bacula, use the prune command on the old MailData job, one of the ones set to 5 weeks. But it didn’t show up in the list, because it was no longer a defined job resource; we don’t run that particular job anymore, and haven’t for months, so there should be no jobs for ‘MailData’ in the catalog at all. Luckily, it was just commented out in bacula-dir.conf, so I uncommented it and restarted Bacula. The prune command then showed MailData, and reported: “Pruned 15 Jobs for client MailData from catalog.” Encouraged, I exited Bacula and checked the DB. The records are all still there. Rinse and repeat. This time, Bacula says: “No Jobs found to prune.”, so maybe it did get rid of a few of them after all. But the rest are sitting there derelict, just hogging space.
So, why isn’t my DB pruning itself?
--Jeremy