Bacula-users

Re: [Bacula-users] [Bacula-devel] Proplem upgrading from 1.38 to 2.2

2008-05-18 17:11:39
Subject: Re: [Bacula-users] [Bacula-devel] Proplem upgrading from 1.38 to 2.2
From: Eric Bollengier <eric AT eb.homelinux DOT org>
To: bacula-devel AT lists.sourceforge DOT net
Date: Sun, 18 May 2008 23:11:22 +0200
Hello,

On Saturday 17 May 2008 16:02:02 Jari Fredriksson wrote:
> > If you either send me an attachment with your "comments"
> > or point me to an exact link to the thread, we'll at
> > least take a quick look at it for anything that might
> > stand out.  Sorry, I no longer have the time to search
> > for such things :-(   I think Eric might be interested in
> > this too since batch insert is mostly his baby :-)
>
> --------- sql_create.c -------------------------------------------------
>
> bool my_batch_start(JCR *jcr, B_DB *mdb)
> {
> bool ok;
> db_lock(mdb);
> ok = db_sql_query(mdb,
> "CREATE TEMPORARY TABLE batch ("
> "FileIndex integer,"
> "JobId integer,"
> "Path blob,"
> "Name blob,"
> "LStat tinyblob,"
> "MD5 tinyblob)",NULL, NULL);
> db_unlock(mdb);
> return ok;
> }
>
> ------------------------------------------------------------------------
>
> Path and Name are blobs. I'm not very good with MySQL (that means, there
> are younger guys who have tested everything and are great in what they are
> doing), but from my experience from mostly Oracle world is that a BLOB is
> not good at searching and indexing. Sometimes they are implemented as
> external files to the database, and are used to just store *stuff*.

Mysql and Oracle are completely different, but both have gotcha with text 
fields.

This is the original Filename.Name and Path.Path type. See make_mysql_tables. 
I don't remember why (something like a VCHAR can't have more than 255 char 
and strips automatically all spaces at the end), but you have to read 
carefully the documentation before choosing a between TEXT, VARCHAR and BLOB 
type.

http://dev.mysql.com/doc/refman/5.0/en/blob.html

> Still, bacula generates the following SQL:
>
>     INSERT INTO Filename (Name) SELECT a.Name FROM (SELECT DISTINCT Name
> FROM batch) AS a WHERE NOT EXISTS (SELECT Name FROM Filename AS f WHERE
> f.Name = a.Name)
>
> That looks suspicious to me.

Not for me, but if you can explain what is suspicious, i would appreciate it.
The main goal is to insert all new filename to Filename table.

>
> ------------------------------------------------------------------------
>
> Another suspiciously looking thing is in the same SQL query:
>
>     SELECT a.Name FROM (SELECT DISTINCT Name FROM batch) AS a
>
> Is actually (to me) same as
>
>     SELECT DISTINCT a.Name FROM batch AS a

Yes, but don't forget the rest of the query

 SELECT a.Name FROM (SELECT DISTINCT Name FROM batch) AS a
 WHERE NOT EXISTS 
        (SELECT Name FROM Filename AS f WHERE f.Name = a.Name)

> ------------------------------------------------------------------------
>
> Those things look suspicious to me. I did not try EXPLAIN PLAN for the
> queries, but there are 2 facts.

<troll> If you can read and understand the mysql explain output...  </troll>

> 1) My system hangs on a 100% processor time by MySQL, and the query is as
> shown up there

It depends on your configuration, the number of files etc... You have to check
if indexes are up to date, if your mysql setup is ok with your backups etc..

> 2) My commmon sense with 20 years of SQL and C/C++ says says 
> that the sql clause is suspicious. Double full table query for 1 exact
> thing.

For Path and Filename ? Yes we have to full scan batch twice, but you
can also revert to the previous behavior that works quite well with mysql
MyISAM tables.

If you have a SQL tricks (mysql/postgresql/sqlite) to improve batch mode, it 
will be really cool. We had got something like x20 speedup with theses 
suspicous queries with Postgresql/sqlite3/Mysql Innodb and if we can do it 
again it will be great.

Bye


-------------------------------------------------------------------------
This SF.net email is sponsored by: Microsoft 
Defy all challenges. Microsoft(R) Visual Studio 2008. 
http://clk.atdmt.com/MRT/go/vse0120000070mrt/direct/01/
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users