Bacula-users

Subject: Re: [Bacula-users] Database Size
From: James Chamberlain <jamesc AT exa DOT com>
To: Kern Sibbald <kern AT sibbald DOT com>
Date: Thu, 16 Mar 2017 16:38:32 -0400
Hi Kern,

If 100 is a large number of jobs, I have a relatively small number (23).  I don’t have any “former” clients that I’m not backing up anymore.  One thing I *do* know is that I have an absolute ton of tiny little files, though I’m pretty sure that most of them stick around.  According to my statistics here, Bacula has 1,284,029,677 files in its catalog.  I can probably afford a little downtime on my database, so I may take that option if I get above 95% utilization on the file system.

If my retention periods are quite long, do you have any recommendations on what would be more typical values?

Thanks,

James


On Mar 16, 2017, at 10:31 AM, Kern Sibbald <kern AT sibbald DOT com> wrote:

Hello,

I recently took a closer look at my catalog when an upgrade of my backup server from 14.04 to 16.04 failed (I have 6 systems where the upgrade totally failed and left me with a broken system). I reloaded the Bacula catalog from scratch, and in doing so I realized that it contained lots and lots of old records from jobs I had run several years ago.  This happens when you create a job or a client, then stop using that job or client (or even remove the client) so that no more jobs for that client run.  What is important is that Bacula prunes only when a job runs, unless you do it manually; if jobs never run, the retention periods never apply, and you end up with lots and lots of unused (orphaned) records in the catalog.  The only way to clean this up is to see what jobs exist in the database and prune/purge those which are no longer used -- this is done manually with bconsole.
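
For example, pruning and purging an unused client's records by hand in bconsole might look something like this (the client name here is only an illustration; note that purge ignores retention periods, so use it with care):
--------------
*list clients
*prune jobs client=oldhost-fd yes
*purge jobs client=oldhost-fd
--------------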

The same happens if you have lots and lots of temporary files that get backed up.  There are mail programs that create a temporary file for each email, then delete it a day or two later.  If these files are backed up even once, they will create lots of name entries in the database.  This can be cleaned up by using dbcheck.
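
A sketch of how that might be run (adjust the config path to your installation; -b runs non-interactively and -f actually fixes what is found):
--------------
# Interactive mode: presents a menu of checks (orphaned Filename,
# Path, File records, and so on)
dbcheck -c /etc/bacula/bacula-dir.conf

# Batch mode, fixing the inconsistencies it finds
dbcheck -b -f -c /etc/bacula/bacula-dir.conf
--------------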

Finally, if you can afford a bit of downtime on your database, first back it up, then drop it and recreate it from the backup.  This produces a nicely compacted database.  If you regularly run vacuums this is probably not necessary, but in extreme cases, such as after deleting hundreds of old backup jobs or clients, it can be a quick way to compact the database.
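
For PostgreSQL, a minimal sketch (stop the Director first; the database and user names here are common defaults, so adjust them to your setup):
--------------
# Dump the catalog in PostgreSQL's custom format
pg_dump -U bacula -Fc bacula > /tmp/bacula.dump

# Drop, recreate, and reload -- the restored database comes back compacted
dropdb -U postgres bacula
createdb -U postgres -O bacula bacula
pg_restore -U postgres -d bacula /tmp/bacula.dump
--------------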

Note also that your retention periods are quite long, so if you have lots of jobs (more than 100) that run every night, you will need a big database.
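
Retention is set per Client in bacula-dir.conf; here is a sketch with shorter values (purely illustrative -- pick values that match how far back you actually need to restore):
--------------
Client {
  Name = myhost-fd
  Address = myhost.example.com
  Catalog = MyCatalog
  Password = "secret"
  File Retention = 60 days   # prune File records after 60 days
  Job Retention = 6 months   # prune Job records after 6 months
  AutoPrune = yes            # apply the above each time a job runs
}
--------------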

Best regards,

Kern


On 03/16/2017 03:17 PM, James Chamberlain wrote:
On Mar 16, 2017, at 3:29 AM, Mikhail Krasnobaev <milo1 AT ya DOT ru> wrote:
 
15.03.2017, 19:57, "James Chamberlain" <jamesc AT exa DOT com>:

Hi all,

I’m getting a touch concerned about the size of my Bacula database, and was wondering what I can do to prune it, compress it, or otherwise keep it at a manageable size. The database itself currently stands at 324 GB, and is using 90% of the file system it’s on. I’m running Bacula 7.4.0 on CentOS 6.8, with PostgreSQL 8.4.20 as the database. My file and job retention times are set to 180 days, and my volume retention time is set to 365 days. Is there any other information I can share which would help you help me track this down?

Thanks,

James

Good day,

Do you run any maintenance jobs on the database?

Like:
--------------
[root@1c83centos ~]# cat /etc/crontab
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/
 
# dump all databases once every 24 hours
45 4 * * * root nice -n 19 su - postgres -c "pg_dumpall --clean" | gzip -9 > /home/pgbackup/postgres_all.sql.gz
 
# vacuum all databases every night (full vacuum on Sunday night, lazy vacuum every other night)
45 3 * * 0 root nice -n 19 su - postgres -c "vacuumdb --all --full --analyze"
45 3 * * 1-6 root nice -n 19 su - postgres -c "vacuumdb --all --analyze --quiet"
 
# re-index all databases once a week
0 3 * * 0 root nice -n 19 su - postgres -c 'psql -t -c "select datname from pg_database where not datistemplate order by datname;" | xargs -n 1 -I"{}" -- psql -U postgres {} -c "reindex database {};"'
-----------------
vacuumdb is a utility for cleaning a PostgreSQL database. vacuumdb will also generate internal statistics used by the PostgreSQL query optimizer.
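
To see which catalog tables are actually taking the space, a query along these lines can help (the database name "bacula" is an example):
--------------
psql -U postgres bacula -c "
  SELECT relname, pg_size_pretty(pg_total_relation_size(relid))
  FROM pg_catalog.pg_statio_user_tables
  ORDER BY pg_total_relation_size(relid) DESC
  LIMIT 10;"
--------------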

I don’t believe I’ve been doing any of this.  I’ll read up on the documentation and see about putting these into place.

Thanks!

James



_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users