[Bacula-users] Exceeding realistic expectations for accurate mode?

2014-02-19 13:26:46
Subject: [Bacula-users] Exceeding realistic expectations for accurate mode?
From: rfox AT mbl DOT edu
To: Bacula Users Mailing List <bacula-users AT lists.sourceforge DOT net>
Date: Wed, 19 Feb 2014 12:57:15 -0500 (EST)

Hi,

I'm using accurate mode on my system, and the MySQL queries crawl along,
causing the whole backup to run very slowly.

I have 25 million files on a filesystem and I'm expected to be able to 
restore any file that existed on the filesystem on any day for a period of 
1.5 years.

Based on the performance of previous backup systems run on this server, I
split the filesystem up into four groups and created Bacula jobs for each
group. Each group has its own dedicated logical library, but all four jobs
use the same MySQL database. I've been using Bacula in accurate mode for
this system since Dec 20. While it was delightfully speedy at first, over
time it has become quite slow. The MySQL queries used during the run seem
to be the bottleneck, and I am worried the problem is insurmountable.

I have been going through rounds of MySQL tuning to try to increase
performance, but the hardware is already substantial: it's a powerful
computer with 24 cores (2 Xeon E5645s @ 2.4GHz), MySQL is using over 32G
of RAM, the database is on an SSD, and MySQL tmp is on a pair of striped
SSDs. The query that hangs everything up during the process takes more
than 24 hours just to get MySQL to 'explain', and jobs take about 1.7 days
to run at this point. (A job is running now and the same query has been
executing for over 62000 seconds.) I am using MySQL 5.5, though, so I may
be able to benefit from a more recent version.
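
To put those timings on one scale, here is a small Python conversion of
the figures quoted above. It only restates numbers from this message
(62000 seconds, 1.7 days); the variable names are illustrative, and it
assumes the job running now will take about as long as recent jobs, which
may not hold.

```python
# Back-of-envelope conversion of the timings quoted in this message.
# Assumption: the job running now takes about as long as recent jobs.
QUERY_SECONDS = 62000   # how long the query has been executing so far
JOB_DAYS = 1.7          # approximate duration of a whole job

query_hours = QUERY_SECONDS / 3600
job_hours = JOB_DAYS * 24
query_share = query_hours / job_hours  # lower bound: the query is still running

print(f"query so far: {query_hours:.1f} h")
print(f"typical job:  {job_hours:.1f} h")
print(f"query share:  {query_share:.0%} of a typical job")
```

Because the query has not finished, the share printed here is only a
lower bound on how much of a job that single query accounts for.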

I've been running the MySQL sql-bench benchmarks on a number of different
servers I have available, and this server seems to be quite fast, matching
the performance we get from our dedicated database servers with high-speed
enterprise disk arrays and 128G of RAM. (The exception is dropping tables,
where the dedicated systems are much faster than this server, but I don't
believe that is related to the slow query problem I am seeing.)

I would chalk it up to a simple overestimation on my part of what
accurate mode can handle on a system this size, but I have run across
quite a few statements about big Bacula implementations that have
billions of rows in the File table. Mine currently has only 110 million or
so. I can't help but wonder if these references to large tables are to
systems that run in non-accurate mode and consequently don't perform the
complex queries that I'm seeing.
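
As a rough sense of scale, the sketch below extrapolates the File table
linearly from the figures in this message. The linear growth, the year
2013 for the Dec 20 start, and 365-day years are all assumptions made for
illustration; real growth depends on the change rate, job levels, and
pruning, so treat the result as an order-of-magnitude guess only.

```python
from datetime import date

# Figures taken from this message; everything else is an assumption.
FILE_ROWS_NOW = 110_000_000       # rows in the File table today
START = date(2013, 12, 20)        # assumption: "Dec 20" means Dec 20, 2013
TODAY = date(2014, 2, 19)         # date of this message
RETENTION_DAYS = int(1.5 * 365)   # 1.5 years, assuming 365-day years

days_running = (TODAY - START).days
rows_per_day = FILE_ROWS_NOW / days_running      # assumes linear growth
projected_rows = rows_per_day * RETENTION_DAYS   # assumes nothing is pruned early

print(f"days running:   {days_running}")
print(f"rows per day:   {rows_per_day:,.0f}")
print(f"projected rows: {projected_rows:,.0f}")
```

Under these assumptions the table would be on the order of a billion rows
by the end of the retention period, which is the range the large
installations mentioned above are said to reach.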

I expected accurate mode to be difficult to implement without incurring
a performance hit, but I wonder whether I'm exceeding realistic
expectations of the file selection algorithm with a filesystem of this
size.

Thanks,
Rich.


-- 
  Rich Fox
  Systems Administrator
  JBPC - Marine Biological Laboratory
  http://www.mbl.edu/jbpc
  508-289-7669 - mbl-at-richfox.org
