On 4/10/12 4:36 PM, Steve Costaras wrote:
> I'm running Bacula 5.2.6 under Ubuntu 10.04 LTS. This is a pretty simple setup,
> just backing up the same server that Bacula is on, as it's the main fileserver.
>
> For some background: the main fileserver array is comprised of 96 2TB drives
> in a raid-60 (16 raidz2 vdevs of 6 drives per group). I have a separate
> spool directory comprised of 16 2TB drives in a raid-10. All 112 drives are
> spread across 7 sas HBA's. Bacula's database (running sqlite v3.6.22) is on
> a SSD mirror. This is going to an LTO4 tape system. The base hardware of
> this system is a dual cpu X5680 box with 48GB of ram.
>
> Doing cp, dd, and other tests to the spool directory is fine (getting over
> 1GB/s); likewise, when bacula is running a full backup (10 jobs, different
> directories, in parallel) I'm able to write to the spool directory (vmstat)
> at well over 700MB/s. btape and dd testing to the tape drive from the
> spool directory seems to run fine at 114MB/s, which is good.
>
> Now when bacula itself does the writing to the tape (normal backup process) I
> am only getting about 50-80MB/s which is pretty bad.
>
> I'm trying to narrow down what may be the slowdown here but am not familiar
> with the internals that well. My current thinking would be something in the
> line of:
>
> - bacula sd is not doing enough pre-fetching of the spool file?
>
> - perhaps when the spool data is being written to tape it needs to update
> the catalogue at that point? (The database is on an SSD mirror (Crucial C300
> series, 50% overprovisioned). I do see some write bursts here (200 w/sec
> range), but utilization is still well below 1% (iostat -x).)
Your hardware is _much_ better than mine (spool on non-RAID, no SSD,
LTO3, 4GB RAM, etc.), yet I get tape throughput that isn't a lot lower.
At the despool rates you are seeing, I'm guessing you may be shoeshining
the tape on LTO4. Here are the results from my most recent three full
backup jobs (presumably tape despool for incrementals is similar, but
mine are so small as to include way too much start/stop overhead):
Job1:
Committing spooled data to Volume "MO0039L3". Despooling 45,237,900,235 bytes ...
Despooling elapsed time = 00:13:11, Transfer rate = 57.19 M Bytes/second
Job2:
Writing spooled data to Volume. Despooling 85,899,607,651 bytes ...
Despooling elapsed time = 00:27:47, Transfer rate = 51.52 M Bytes/second
Committing spooled data to Volume "MO0039L3". Despooling 21,965,154,476 bytes ...
Despooling elapsed time = 00:09:19, Transfer rate = 39.29 M Bytes/second
Job3:
Committing spooled data to Volume "MO0039L3". Despooling 74,877,523,075 bytes ...
Despooling elapsed time = 00:23:07, Transfer rate = 53.98 M Bytes/second
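(As a sanity check, those rates are just bytes divided by elapsed seconds; Bacula
reports "M Bytes/second" using 10^6 bytes, and elapsed times are rounded to whole
seconds, so a recomputed rate can differ from the log in the last digit. A quick
sketch:

```python
# Recompute despool transfer rates from the log lines quoted above.
# Bacula's "M Bytes/second" means 10**6 bytes per second.

def despool_rate(nbytes, hhmmss):
    """Transfer rate in M Bytes/second for a despool of nbytes over hh:mm:ss."""
    h, m, s = (int(x) for x in hhmmss.split(":"))
    return nbytes / (h * 3600 + m * 60 + s) / 1e6

for nbytes, elapsed, quoted in [
    (45_237_900_235, "00:13:11", 57.19),
    (85_899_607_651, "00:27:47", 51.52),
    (21_965_154_476, "00:09:19", 39.29),
    (74_877_523_075, "00:23:07", 53.98),
]:
    print(f"{elapsed}  computed {despool_rate(nbytes, elapsed):6.2f} MB/s  (log: {quoted})")
```
)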
Based on all the backups I've examined in the past, 50 MBytes/sec is
about typical for me.
I have a few questions about your setup:
* do you have "Spool Attributes" enabled? -- I think this may help.
* have you increased the "Maximum File Size" in your tape config? -- My
tape throughput went up somewhat when I went to 5G for the file size (on
LTO4 you might even want it larger; my drive is LTO3).
* why are you using sqlite? I was under the impression it is not
recommended, for several reasons (I'm using mysql, but I believe
postgres is somewhat preferred).
* have you increased the "Maximum Network Buffer Size" (both SD and FD)
and "Maximum Block Size" (SD only)? I have both at 256K. As I recall,
"Maximum Network Buffer Size" can't be larger (though I'm not sure this
will help your despool anyway). Note that changing the block size _may_
make previously written tapes incompatible. I don't remember why I went
to 256K on block size, but I believe the default is 64K.
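For reference, here's roughly where those directives live. The resource names
and values below are illustrative guesses, not taken from your config; check
the Bacula manual for exact placement in your version:

```
# bacula-sd.conf -- Device resource (names/values are hypothetical)
Device {
  Name = LTO4-Drive
  Media Type = LTO-4
  Archive Device = /dev/nst0
  Maximum File Size = 5G               # larger file marks = fewer pauses at EOF marks
  Maximum Block Size = 262144          # 256K; may make previously written tapes unreadable
  Maximum Network Buffer Size = 262144
}

# bacula-fd.conf -- FileDaemon resource
FileDaemon {
  Name = fileserver-fd
  Maximum Network Buffer Size = 262144
}

# bacula-dir.conf -- Job resource
Job {
  Name = "FileserverFull"
  Spool Data = yes
  Spool Attributes = yes               # batch catalog updates instead of interleaving them
}
```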
If I were in your situation, I would first see if "Spool Attributes" and
"Maximum File Size" helped (these are easy to change, after all). If
that didn't help, I would move to a different database backend (probably
a lot harder). Only then would I try "Maximum Block Size" (I'm more
reluctant to change that, since it may make older tapes incompatible).
-se
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users