Re: [Bacula-users] very slow virtualfull job

2013-04-29 04:55:57
Subject: Re: [Bacula-users] very slow virtualfull job
From: James Harper <james.harper AT bendigoit.com DOT au>
To: Kern Sibbald <kern AT sibbald DOT com>
Date: Mon, 29 Apr 2013 08:33:42 +0000

> Hello James,
> 
> It looks like you have found the most important things: turn on
> attribute spooling,

This actually made it worse. Instead of distributing the tiny little writes 
throughout the backup job, it saved them all to the end when nothing else was 
running. It would be nice if the next job could start while attributes were 
still spooling...

> and switch to InnoDB.

I switched to InnoDB the last time this happened.

> Before switching to PostgreSQL, you might try running
> the MySQL tuning program mysqltune (I forgot the exact name).  It tells
> you items you should tune.  Often users do not give it enough memory, or
> even do the opposite which can cause problems, that is give it too much.
> 
> There are several scripts that switch from MySQL to PostgreSQL that work
> fine. They also allow you to keep your MySQL running until you get a good
> config for PostgreSQL.  Tuning PostgreSQL is much more complicated, but
> it gives *far* better results for big jobs.
> 

When loading the data into PostgreSQL absolutely crawled along (~50 kB/second
disk write speed with 100% iowait), I knew I had a problem.

Something, somewhere, has changed in my system that absolutely kills tiny sync
writes. Or alternatively, something has changed in my system that makes MySQL
do tiny sync writes.

iostat showed ~50 kB/second with 100% iowait while loading the catalogs, and
nothing I did changed this. The following dd also behaved appallingly:

# dd if=/dev/zero of=test.bin bs=512 count=1024 oflag=sync
1024+0 records in
1024+0 records out
524288 bytes (524 kB) copied, 56.2482 s, 9.3 kB/s

By comparison, on an old system (~10 years old) with a single ATA/100 hard disk:

# dd if=/dev/zero of=test.bin bs=512 count=1024 oflag=dsync
1024+0 records in
1024+0 records out
524288 bytes (524 kB) copied, 1.10022 seconds, 477 kB/s
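
For anyone who wants to repeat this kind of measurement without relying on
dd's oflag support, here is a rough Python sketch of the same idea: write
small blocks and force each one to stable storage before issuing the next.
The function name sync_write_rate and its parameters are made up for this
illustration, and the numbers it prints will not necessarily match dd's.

```python
import os
import tempfile
import time


def sync_write_rate(directory, block_size=512, count=1024):
    """Write `count` blocks of `block_size` bytes, syncing each one.

    Returns the observed throughput in bytes per second.
    """
    # O_DSYNC is not defined on every platform; fall back to fsync() there.
    dsync = getattr(os, "O_DSYNC", 0)
    fd, path = tempfile.mkstemp(dir=directory)
    os.close(fd)
    fd = os.open(path, os.O_WRONLY | dsync)
    block = b"\0" * block_size
    try:
        start = time.perf_counter()
        for _ in range(count):
            os.write(fd, block)
            if not dsync:
                os.fsync(fd)
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
        os.unlink(path)
    return block_size * count / elapsed


if __name__ == "__main__":
    # Test the filesystem that holds the current directory.
    rate = sync_write_rate(".", count=256)
    print(f"{rate / 1000:.1f} kB/s with 512-byte synchronous writes")
```

Run it from a directory on the filesystem that holds the database files,
since the result depends on the underlying device and mount options.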

I'm still trying to track down wtf is going on there, but in the meantime
I have set innodb_flush_log_at_trx_commit=0, which means InnoDB won't run an
fsync after each tiny little write but will instead wait for around a second
and then flush everything. This means I stand to lose up to about 1 second of
committed transactions in the event of a crash, but in that case I would
probably lose the whole backup job anyway, so I don't see it as a loss.
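
The trade-off behind that setting can be illustrated with a small Python
sketch. It is only an analogy and not how InnoDB is implemented: the
hypothetical helper timed_writes appends small records to a file and either
calls fsync after every record (roughly the per-commit behaviour) or only
once at the end (roughly the deferred flush).

```python
import os
import tempfile
import time


def timed_writes(path, records, fsync_every_record):
    """Write records to path and return the elapsed time in seconds.

    With fsync_every_record=True each record is forced to disk before the
    next one is written; otherwise a single fsync is issued at the end.
    """
    start = time.perf_counter()
    with open(path, "wb") as f:
        for rec in records:
            f.write(rec)
            if fsync_every_record:
                f.flush()
                os.fsync(f.fileno())
        # Final flush so both modes end with the data on stable storage.
        f.flush()
        os.fsync(f.fileno())
    return time.perf_counter() - start


if __name__ == "__main__":
    records = [b"x" * 64] * 200
    with tempfile.TemporaryDirectory(dir=".") as tmp:
        path = os.path.join(tmp, "log.bin")
        per_record = timed_writes(path, records, True)
        deferred = timed_writes(path, records, False)
    print(f"fsync per record: {per_record:.3f} s")
    print(f"single fsync:     {deferred:.3f} s")
```

On a disk where synchronous writes are slow, the first figure should be much
larger than the second, at the cost of a window in which recent writes can be
lost on a crash.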

Performance is now back to normal and I can take my time figuring out why this 
happened.

Thanks

James