[Bacula-users] slow despool to tape speed?

2012-04-10 19:50:37
Subject: [Bacula-users] slow despool to tape speed?
From: "Steve Costaras" <stevecs AT chaven DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Tue, 10 Apr 2012 23:36:00 +0000
I'm running Bacula 5.2.6 under Ubuntu 10.04 LTS. This is a pretty simple
setup, just backing up the same server that Bacula is on, as it's the main
fileserver.

For some background: the main fileserver array is made up of 96 2TB drives
in a raid-60 (16 raidz2 vdevs of 6 drives each). I have a separate spool
directory made up of 16 2TB drives in a raid-10. All 112 drives are spread
across 7 SAS HBAs. Bacula's database (running sqlite v3.6.22) is on an SSD
mirror. This is going to an LTO4 tape system. The base hardware of this
system is a dual-CPU X5680 box with 48GB of RAM.

Doing cp, dd, and other tests to the spool directory is fine (getting over
1GB/s). Likewise, when Bacula is running a full backup (10 jobs on different
directories in parallel) I'm able to write to the spool directory at well
over 700MB/s (per vmstat). btape and dd tests writing from the spool
directory to the tape drive run fine at 114MB/s, which is good.

Now when Bacula itself does the writing to the tape (normal backup process) I
am only getting about 50-80MB/s, which is pretty bad.
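
To put the gap in perspective, the observed despool rate can be expressed as
a fraction of the 114MB/s that btape and dd reach on the same drive. This is
just arithmetic on the figures above; the helper name fraction_of_baseline is
made up for illustration.

```python
def fraction_of_baseline(observed_mb_s, baseline_mb_s):
    """Return the observed rate as a fraction of the baseline rate."""
    return observed_mb_s / baseline_mb_s


BASELINE_MB_S = 114.0  # btape/dd rate to the LTO4 drive, from the tests above

for observed in (50.0, 80.0):  # range seen while Bacula despools
    pct = 100 * fraction_of_baseline(observed, BASELINE_MB_S)
    print(f"{observed:.0f} MB/s is {pct:.0f}% of {BASELINE_MB_S:.0f} MB/s")
# prints:
# 50 MB/s is 44% of 114 MB/s
# 80 MB/s is 70% of 114 MB/s
```

So the despool is running at roughly 44-70% of what the drive has been shown
to sustain from the same spool directory.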

I'm trying to narrow down what may be causing the slowdown here but am not
that familiar with the internals. My current thinking is something along
the lines of:

  - the Bacula SD is not doing enough pre-fetching of the spool file?

  - perhaps when the spool data is being written to tape it needs to update
the catalog at that point? The database is on an SSD mirror (Crucial C300
series, 50% overprovisioned). I do see some write bursts there (200 w/sec
range), but utilization is still well below 1% (iostat -x).
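
One rough way to probe the first theory outside of Bacula would be to read a
spool file sequentially with small, unbuffered reads and compare the rate
against larger reads. The sketch below is only that: the 64 KiB starting
block size is an assumption about roughly how the SD reads, not something
taken from the Bacula source, and read_throughput is a hypothetical helper
name. Results will be inflated if the file is already in the page cache, so
it should be run against a spool file larger than RAM or after dropping
caches.

```python
import sys
import time


def read_throughput(path, block_size=64 * 1024, limit_bytes=None):
    """Read `path` sequentially in `block_size` chunks.

    Returns (bytes_read, megabytes_per_second). Stops at end of file or
    once at least `limit_bytes` have been read, whichever comes first.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 so each read() is a single OS-level read of block_size
    with open(path, "rb", buffering=0) as f:
        while limit_bytes is None or total < limit_bytes:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    mb_per_s = (total / 1e6) / elapsed if elapsed > 0 else float("inf")
    return total, mb_per_s


if __name__ == "__main__" and len(sys.argv) > 1:
    spool_file = sys.argv[1]  # path to a spool file, given on the command line
    # 64 KiB is an assumed SD-like read size; larger sizes are for comparison
    for bs in (64 * 1024, 256 * 1024, 1024 * 1024):
        n, rate = read_throughput(spool_file, bs, limit_bytes=8 * 1024**3)
        print(f"block={bs:>8} bytes  read={n} bytes  {rate:.0f} MB/s")
```

If the small-block rate comes out well above 114MB/s, read-ahead on the spool
array is probably not the limiting factor and the second theory becomes more
interesting.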



As another test, while Bacula was despooling to the LTO4 tape I ran a dd of
the same spool file to another array (copying from the spool directory to an
unrelated drive pool to help gauge available I/O bandwidth), and that was
well into the 500MB/s range. So this really looks to be something with the
despooling process itself. Has anyone else run into similar issues?








