Bacula-users

Re: [Bacula-users] removing indexes on File table

2011-08-02 05:51:28
Subject: Re: [Bacula-users] removing indexes on File table
From: Gavin McCullagh <gavin.mccullagh AT gcd DOT ie>
To: bacula-users AT lists.sourceforge DOT net
Date: Tue, 2 Aug 2011 10:48:36 +0100
Hi,

On Tue, 02 Aug 2011, Annette Jaekel wrote:

> Thats amazing, because I searched for reasons of bad performance in the
> last days both for backup (18 hours for a full of 300 GB with 5 million
> files) 

It seems unlikely that adding an index would noticeably improve backup
performance, as that is mostly an INSERT/UPDATE heavy procedure (with the
possible exception of accurate backups) and indexes generally slow those
jobs down.

A full backup job is made up of a few different steps.  In the main, it's
"sending the data from FD to SD" followed by "adding all of the metadata to
the catalog".  If you can work out how long each part takes (maybe look at
the timestamp on the volume if it's a file volume), you may be able to
better see which is the slow bit.  Optimising the database won't help if
the time is all spent on sending data.
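
As a rough sketch of that split in Python: given the job start time, the
time the data finished arriving (for a file volume, the volume's last
modification time is a reasonable stand-in) and the job end time, you can
see what share of the run went on data transfer.  The function name and
the idea of reading the mtime with os.path.getmtime() are my own
illustration, not anything Bacula provides.

```python
from datetime import datetime


def phase_durations(job_start, data_done, job_end):
    """Split a job's wall-clock time into a transfer phase and a catalog phase.

    job_start, data_done and job_end are datetime objects.  data_done is
    assumed to be when the last data reached the SD, e.g. taken from
    datetime.fromtimestamp(os.path.getmtime(path_to_volume)) for a file
    volume (path_to_volume is a placeholder you would fill in yourself).
    """
    if not (job_start <= data_done <= job_end):
        raise ValueError("timestamps must be in chronological order")
    total = job_end - job_start
    if total.total_seconds() == 0:
        raise ValueError("job has zero duration")
    transfer = data_done - job_start
    catalog = job_end - data_done
    # Fraction of the whole job spent moving data from FD to SD.
    return transfer, catalog, transfer / total
```

If the transfer share comes out close to 1.0, tuning the database is
unlikely to buy much.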

It might help to create a test job with a smaller fileset which takes a
short time but gives the same rate so you can test quickly.

It sounds like the rate you're getting overall is about 37Mbit/sec.  A
few things to check.

 - Disk access -- is there anything preventing the FD from serving the data
   faster than this?  Fragmented filesystem, slow disks, low on memory, ...
   An incremental or differential is a bit complex, but you say it's a full
   backup, so it should be able to pull the data reasonably quickly.

 - Check the CPU load, I/O load, memory usage and network bandwidth during
   a backup on the FD and SD.

 - Software compression or encryption -- if you have either on, the FD's
   CPU might become a bottleneck.  Try turning them off as an experiment.

 - Network bandwidth -- use iperf to test a TCP connection from FD->SD and
   make sure the network is capable of taking the speed you expect.  Is the
   entire network path 100Mbit/sec full duplex or better?  Is it busy?
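
For reference, the 37Mbit/sec figure above is just the job size divided by
the run time.  A small sketch of the arithmetic, assuming "300 GB" means
decimal gigabytes (with binary GiB it comes out nearer 40Mbit/sec); the
function name is only for illustration:

```python
def average_rate_mbit(total_bytes, seconds):
    """Average throughput in megabits per second (1 Mbit = 10**6 bits)."""
    return total_bytes * 8 / seconds / 1e6


GB = 10**9                 # assumption: decimal gigabytes
duration = 18 * 3600       # 18 hours in seconds

rate = average_rate_mbit(300 * GB, duration)
files_per_sec = 5_000_000 / duration

print(f"{rate:.1f} Mbit/sec")          # 37.0 Mbit/sec
print(f"{files_per_sec:.1f} files/sec")  # 77.2 files/sec
```

Comparing that number against what iperf reports for the same FD->SD path
tells you whether the network is anywhere near being the limit.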

> and in restores (building the corresponding file tree from my postgres
> database take more than 10 minutes). One of the first hint I found was to
> create these indicies you just remove. So I create them, but without any
> influence corresponding to the performance.
> 
> Can anyone tell me: Should these indicies (PathID, FilenameID) be
> advantageous for the database accesses or not? Or is this depending from
> database package (mysql, postgres)?

I wouldn't have expected that adding an index would ever slow down a query,
but it seems that with MySQL this isn't always the case and it's clear here
that the select query is faster with those indexes removed.  Bacula does
seem to be better optimised for PostgreSQL.

It would be worthwhile to check that your database has been tuned at least
away from the bare defaults.  In the case of MySQL, you can use this:

        http://mysqltuner.pl/mysqltuner.pl

The key_buffer seems to be one of the most important parameters.

There are lots of howtos around for PostgreSQL too.

Gavin


