Bacula-users

Re: [Bacula-users] Tuning for large (millions of files) backups?

2010-11-11 06:33:19
Subject: Re: [Bacula-users] Tuning for large (millions of files) backups?
From: Henrik Johansen <henrik AT scannet DOT dk>
To: Alan Brown <ajb2 AT mssl.ucl.ac DOT uk>
Date: Thu, 11 Nov 2010 12:29:44 +0100
'Alan Brown' wrote:
>Gavin McCullagh wrote:
>> On Tue, 09 Nov 2010, Alan Brown wrote:
>>
>>>> and it still takes 14 minutes to build the tree on one of our bigger 
>>>> clients.
>>>> We have 51 million entries in the file table.
>>>>
>>> Add individual indexes for Fileid,  Jobid  and Pathid
>>>
>>> Postgres will work with the combined index for individual table queries,
>>> but mysql won't.
>>
>> The following are the indexes on the file table:
>>
>> mysql> SHOW INDEXES FROM File;
>> +-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
>> | Table | Non_unique | Key_name     | Seq_in_index | Column_name | Collation 
>> | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
>> +-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
>> | File  |          0 | PRIMARY      |            1 | FileId      | A         
>> |    55861148 |     NULL | NULL   |      | BTREE      |         |
>> | File  |          1 | PathId       |            1 | PathId      | A         
>> |      735015 |     NULL | NULL   |      | BTREE      |         |
>> | File  |          1 | FilenameId   |            1 | FilenameId  | A         
>> |     2539143 |     NULL | NULL   |      | BTREE      |         |
>> | File  |          1 | FilenameId   |            2 | PathId      | A         
>> |    13965287 |     NULL | NULL   |      | BTREE      |         |
>> | File  |          1 | JobId        |            1 | JobId       | A         
>> |        1324 |     NULL | NULL   |      | BTREE      |         |
>> | File  |          1 | JobId        |            2 | PathId      | A         
>> |     2940060 |     NULL | NULL   |      | BTREE      |         |
>> | File  |          1 | JobId        |            3 | FilenameId  | A         
>> |    55861148 |     NULL | NULL   |      | BTREE      |         |
>> | File  |          1 | jobid_index  |            1 | JobId       | A         
>> |        1324 |     NULL | NULL   |      | BTREE      |         |
>> | File  |          1 | pathid_index |            1 | PathId      | A         
>> |      735015 |     NULL | NULL   |      | BTREE      |         |
>> +-------+------------+--------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
>>
>> I added the last two per your instructions.  Building the tree took about 14
>> minutes without these indexes and takes about 17-18 minutes having added
>> them.
>
>What tuning (if any) have you performed on your my.cnf and how much
>memory do you have?
>
>> Have I done something wrong?  As FileId is a primary key, it doesn't seem
>> like I should need an extra index on that one -- is that wrong?
>
>It doesn't need an extra index.
>
>You've also got a duplicate pathid indeax which can be deleted.
>
>This kind of thing is why it makes more sense to switch to postgres when
>  mysql databases get large.

I have had about as much of this as I can take now so please, stop spreading
FUD about MySQL.

When it comes to Bacula there is only one valid concern - Postgres has
certain statement constructs which allow certain queries to be performed
faster - that's about it.

It am not buying the postulation that postgres is "largely self-tuning",
especially not when dealing with large datasets. 

If you prefer postgres, that's totally fine but please stop telling
people that MySQL is unusable for large DB deployments because this
simply is untrue.

>
>
>------------------------------------------------------------------------------
>Centralized Desktop Delivery: Dell and VMware Reference Architecture
>Simplifying enterprise desktop deployment and management using
>Dell EqualLogic storage and VMware View: A highly scalable, end-to-end
>client virtualization framework. Read more!
>http://p.sf.net/sfu/dell-eql-dev2dev
>_______________________________________________
>Bacula-users mailing list
>Bacula-users AT lists.sourceforge DOT net
>https://lists.sourceforge.net/lists/listinfo/bacula-users

-- 
Med venlig hilsen / Best Regards

Henrik Johansen
henrik AT scannet DOT dk
Tlf. 75 53 35 00

ScanNet Group
A/S ScanNet 

------------------------------------------------------------------------------
Centralized Desktop Delivery: Dell and VMware Reference Architecture
Simplifying enterprise desktop deployment and management using
Dell EqualLogic storage and VMware View: A highly scalable, end-to-end
client virtualization framework. Read more!
http://p.sf.net/sfu/dell-eql-dev2dev
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users

<Prev in Thread] Current Thread [Next in Thread>