Subject: Re: [Bacula-users] slow despool to tape speed?
From: Steve Ellis <ellis AT brouhaha DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Wed, 11 Apr 2012 09:49:52 -0700
On 4/10/12 4:36 PM, Steve Costaras wrote:
> I'm running Bacula 5.2.6 under Ubuntu 10.04 LTS. This is a pretty simple
> setup: just backing up the same server that Bacula is on, as it's the main
> fileserver.
>
> For some background: the main fileserver array is made up of 96 2TB drives
> in a RAID-60-style layout (16 raidz2 vdevs of 6 drives each). I have a
> separate spool directory of 16 2TB drives in a RAID-10. All 112 drives are
> spread across 7 SAS HBAs. Bacula's database (running SQLite v3.6.22) is on
> an SSD mirror. This is going to an LTO4 tape system. The base hardware of
> this system is a dual-CPU X5680 box with 48GB of RAM.
>
> Doing cp, dd, and other tests to the spool directory is fine (getting over
> 1GB/s); likewise, when Bacula is running a full backup (10 jobs on different
> directories in parallel), I'm able to write to the spool directory at well
> over 700MB/s (per vmstat). btape and dd testing to the tape drive from the
> spool directory runs fine at 114MB/s, which is good.
>
> Now, when Bacula itself does the writing to the tape (the normal backup
> process), I am only getting about 50-80MB/s, which is pretty bad.
>
> I'm trying to narrow down what the slowdown may be here, but am not familiar
> with the internals that well. My current thinking is something along the
> lines of:
>
>    - the Bacula SD is not doing enough pre-fetching of the spool file?
>
>    - perhaps when the spool data is being written to tape, the catalogue
> needs to be updated at that point? The database is on an SSD mirror (Crucial
> C300 series, 50% overprovisioned); I do see some write bursts there (in the
> 200 w/s range), but utilization is still well below 1% (per iostat -x).
Your hardware is _much_ better than mine (non-RAID spool, no SSD, LTO3, 
4GB RAM, etc.), yet my tape throughput is not a lot lower.  At the despool 
rates you are seeing, I'm guessing you may be shoeshining the tape on LTO4 
-- that is, the data isn't arriving fast enough for the drive to keep 
streaming, so it repeatedly stops, rewinds, and repositions, which kills 
throughput.  Here are the results from my three most recent full backup 
jobs (presumably tape despool for incrementals is similar, but mine are so 
small that they include way too much start/stop overhead):
Job1:
Committing spooled data to Volume "MO0039L3". Despooling 45,237,900,235 
bytes ...
Despooling elapsed time = 00:13:11, Transfer rate = 57.19 M Bytes/second

Job2:
Writing spooled data to Volume. Despooling 85,899,607,651 bytes ...
Despooling elapsed time = 00:27:47, Transfer rate = 51.52 M Bytes/second
Committing spooled data to Volume "MO0039L3". Despooling 21,965,154,476 
bytes ...
Despooling elapsed time = 00:09:19, Transfer rate = 39.29 M Bytes/second

Job3:
Committing spooled data to Volume "MO0039L3". Despooling 74,877,523,075 
bytes ...
Despooling elapsed time = 00:23:07, Transfer rate = 53.98 M Bytes/second

Based on all the backups I've examined in the past, 50MBytes/sec is about 
typical for me.

I have a few questions about your setup:

* Do you have "Spool Attributes" enabled? -- I think this may help.

* Have you increased the "Maximum File Size" in your tape config? -- My 
tape throughput went up somewhat when I went to 5G for the file size (on 
LTO4 you might even want it larger; my drive is LTO3).

* Why are you using SQLite?  I was under the impression it is 
non-preferred for several reasons (I'm using MySQL, but I believe 
PostgreSQL is somewhat preferred).

* Have you increased the "Maximum Network Buffer Size" (both SD and FD) 
and "Maximum Block Size" (SD only)?  I have both at 256K; as I recall, the 
network buffer size can't be larger (though I'm not sure this will help 
your despool anyway).  Be aware that changing the block size _may_ make 
previously written tapes incompatible.  I don't remember why I went to 
256K on block size, but I believe the default is 64K.  (See the config 
sketch after this list.)
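
For concreteness, here's a rough sketch of where those directives live, 
assuming a typical one-drive setup.  The resource names and device path 
are made up for illustration, and the values are just the ones discussed 
above -- adapt them to your own bacula-sd.conf, bacula-fd.conf, and 
bacula-dir.conf:

  # bacula-sd.conf -- Device resource for the tape drive
  # (illustrative names; 262144 bytes = 256K)
  Device {
    Name = "LTO4-Drive"                # made-up name
    Media Type = LTO-4
    Archive Device = /dev/nst0         # adjust to your drive
    Maximum File Size = 5G             # larger file marks, fewer flushes
    Maximum Block Size = 262144        # may break older tapes -- see above
    Maximum Network Buffer Size = 262144
  }

  # bacula-fd.conf -- matching buffer size on the client side
  FileDaemon {
    Name = "myhost-fd"                 # made-up name
    Maximum Network Buffer Size = 262144
  }

  # bacula-dir.conf -- enable both kinds of spooling in the Job
  Job {
    Name = "Full-Backup"               # made-up name
    Spool Data = yes
    Spool Attributes = yes             # batch catalogue updates to job end
    # ... rest of your Job resource ...
  }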

If I were in your situation, I would first see if "Spool Attributes" and 
"Maximum File Size" help (these are easy to change, after all).  If that 
didn't help, I would move to a different database backend (probably a lot 
harder).  Then, at that point, maybe try "Maximum Block Size" (I'd be more 
reluctant to change that one, since it may cause tape incompatibility).
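
One small caution from me: if you do try a new block size, it's probably 
worth running btape's "test" command against the drive with the new config 
before trusting real backups to it, and keeping a note of the old value in 
case you ever need to read tapes written with it.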

-se

