Bacula-users

[Bacula-users] getting rid of the base64 encoded lstat field - table size testing

2009-04-11 02:44:05
Subject: [Bacula-users] getting rid of the base64 encoded lstat field - table size testing
From: Craig Ringer <craig AT postnewspapers.com DOT au>
To: bacula-users AT lists.sourceforge DOT net
Date: Sat, 11 Apr 2009 14:39:04 +0800
Hi

I've been doing some testing to see what effects might be seen from
getting rid of the base64-encoded `lstat' field, instead storing the
values as native database fields.

As I already posted, there's no significant change in table size under
PostgreSQL. I needed to test this for the other bacula-supported
databases too, though.

Taking my `file' table from my working Bacula instance, I get the
following table sizes in the various supported database systems for the
original table with base64-encoded `lstat' field and the converted
version with expanded lstat fields:

DBMS            Size (b64)    Size (expanded)   Increase         %
PostgreSQL      1667596288    1670381568        2785280      +0.15
MySQL5-MyISAM   1335287664    1236507328        -98780336    -7.39
MySQL5-InnoDB   1720713216    1774190592        53477376     +3.10
SQLite3         1403427840    1246750720        -156677120  -11.16

So ... in table size terms, it's actually a win for SQLite and MyISAM.
There's no significant change for PostgreSQL, and a small loss for
InnoDB.  Details on testing methods used at end of post.

I'll follow up shortly with INSERT times in a later post.


I'm actually wondering, though, if Bacula shouldn't be using the various
database's bulk-insert facilities - multi-valued inserts or csv bulk
loading. For example, when using PostgreSQL you can use COPY to do
bulk-inserts:

PQexec(conn, "COPY foo FROM STDIN");
PQputline(conn, "3\thello world\t4.5\n");
PQputline(conn,"4\tgoodbye world\t7.11\n");
...
PQputline(conn,"\\.\n");
PQendcopy(conn);

It's probably possible to do something similar with libmysqlclient,
implementing LOAD DATA INFILE ... LOCAL to feed MySQL the data.


and in both MySQL and PostgreSQL you can use multi-valued inserts, so
for the table:

CREATE TABLE a (x integer);

you can insert (say) rows:

INSERT INTO a (x) values (1),(2),(3),(4),(5);


Unfortunately, multi-valued inserts are not supported by SQLite3.


**** NOTES ON TESTING ****

Pg table size obtained with "select pg_total_relation_size(file)".

MySQL table sizes obtained with "SHOW TABLE STATUS".

SQLite table size obtained with `ls -l' on the .sqlite file, which has
only one table.

The same schema definition was used for all tests; the only change was
appending "ENGINE=MyISAM" or "ENGINE=InnoDB" for the MySQL tests. The
schema was:

CREATE TABLE file (
    fileid bigint NOT NULL,
    fileindex integer DEFAULT 0 NOT NULL,
    jobid integer NOT NULL,
    pathid integer NOT NULL,
    filenameid integer NOT NULL,
    markid integer DEFAULT 0 NOT NULL,
    lstat text NOT NULL,
    md5 text NOT NULL
);

Indexes were omitted for the purposes of this test; there are no indexes
on any of the tables in any of the tested databases. Since none of the
fields we're changing the format of are indexed, they don't matter, and
getting rid of them makes dumps/loads a LOT quicker.

All tables were populated from the same tab-separated dump generated
with "\copy file to 'file.tsv'" in psql and loaded with "mysqlimport"
(MySQL) and ".mode tabs; .import" (SQLite3).

I converted the `lstat' field to native database types with the
following command in Pg:

SELECT fileid, fileindex, jobid, pathid, filenameid, markid,
       (decode_stat(lstat)).*,
       md5
INTO file2
FROM file;

( "decode_stat" being the C extension to PostgreSQL I posted here
earlier, that gives PostgreSQL the ability to decode bacula's
pseudo-base64-encoded lstat field ).


--
Craig Ringer

------------------------------------------------------------------------------
This SF.net email is sponsored by:
High Quality Requirements in a Collaborative Environment.
Download a free trial of Rational Requirements Composer Now!
http://p.sf.net/sfu/www-ibm-com
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users