Bacula-users

Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-12 04:43:36
Subject: Re: [Bacula-users] Tuning for large (millions of files) backups?
From: Mikael Fridh <frimik AT gmail DOT com>
To: Gavin McCullagh <gavin.mccullagh AT gcd DOT ie>
Date: Fri, 12 Nov 2010 10:40:40 +0100
On Thu, Nov 11, 2010 at 3:47 PM, Gavin McCullagh <gavin.mccullagh AT gcd DOT 
ie> wrote:
> On Mon, 08 Nov 2010, Gavin McCullagh wrote:
>
>> We seem to have the correct indexes on the file table.  I've run optimize 
>> table
>> and it still takes 14 minutes to build the tree on one of our bigger clients.
>> We have 51 million entries in the file table.
>
> I thought I should give some mroe concrete information:
>
> I don't suppose this is news to anyone but here's the mysql slow query log to
> correspond:

> # Time: 101111 14:24:49
> # User@Host: bacula[bacula] @ localhost []
> # Query_time: 1139.657646  Lock_time: 0.000471 Rows_sent: 4263403 
>  Rows_examined: 50351037
> SET timestamp=1289485489;
> SELECT Path.Path, Filename.Name, Temp.FileIndex, Temp.JobId, LStat, MD5 FROM 
> ( SELECT FileId, Job.JobId AS JobId, FileIndex, File.PathId AS PathId, 
> File.FilenameId AS FilenameId, LStat, MD5 FROM Job, File, ( SELECT 
> MAX(JobTDate) AS JobTDate, PathId, FilenameId FROM ( SELECT JobTDate, PathId, 
> FilenameId FROM File JOIN Job USING (JobId) WHERE File.JobId IN 
> (9944,9950,9973,9996) UNION ALL SELECT JobTDate, PathId, FilenameId FROM 
> BaseFiles JOIN File USING (FileId) JOIN Job  ON    (BaseJobId = Job.JobId) 
> WHERE BaseFiles.JobId IN (9944,9950,9973,9996) ) AS tmp GROUP BY PathId, 
> FilenameId ) AS T1 WHERE (Job.JobId IN ( SELECT DISTINCT BaseJobId FROM 
> BaseFiles WHERE JobId IN (9944,9950,9973,9996)) OR Job.JobId IN 
> (9944,9950,9973,9996)) AND T1.JobTDate = Job.JobTDate AND Job.JobId = 
> File.JobId AND T1.PathId = File.PathId AND T1.FilenameId = File.FilenameId ) 
> AS Temp JOIN Filename ON (Filename.FilenameId = Temp.FilenameId) JOIN Path ON 
> (Path.PathId = Temp.PathId) WHERE FileIndex > 0 ORDER BY Temp.JobId, 
> FileIndex ASC;

Could you please do an EXPLAIN on this query?
I know it's going to look awful but I'm curious anyway.
subqueries like these and SELECT DISTINCT are usually a recipe for
disastrous querytimes in MySQL.

> I've spent some time with the mysqltuner.pl script but to no avail thus far.
> There's 6GB RAM so it suggests a key buffer size of >4GB which I've set at
> 4.1GB.

Tuning's not going to get any of those 50 million traversed rows
disappear. Only a differently optimized query plan will.

> This is an Ubuntu Linux server running MySQL v5.1.41.  The mysql data is on an
> MD software RAID 1 array on 7200rpm SATA disks.  The tables are MyISAM (which 
> I
> had understood to be quicker than innodb in low concurrency situations?).  The
> tuner script is suggesting I should disable innodb as we're not using it which
> I will do though I wouldn't guess that will make a massive difference.

No, it will not help.

--
Mikael

------------------------------------------------------------------------------
Centralized Desktop Delivery: Dell and VMware Reference Architecture
Simplifying enterprise desktop deployment and management using
Dell EqualLogic storage and VMware View: A highly scalable, end-to-end
client virtualization framework. Read more!
http://p.sf.net/sfu/dell-eql-dev2dev
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users

<Prev in Thread] Current Thread [Next in Thread>