Re: [Bacula-users] LTO speed optimisation

From: Bryn Hughes <linux@nashira.ca>
To: bacula-users@lists.sourceforge.net
Date: Wed, 05 Nov 2014 06:37:14 -0800
Hi Ben,

To start with, I would watch the server using iostat and top to make sure you aren't running out of CPU and that you really aren't maxing out your disks during your backup. In particular, pay attention to the '%util' column - I like to run 'iostat -kx 2', which shows the stats every 2 seconds. How useful that is will depend on your disk configuration - if you have a RAID card presenting a single LUN to the server, the numbers may be a bit misleading, but they should still give an indication of what is going on.

I see you are allowing 3 concurrent jobs on your tape drives - if all 3 of those are spooling from the same set of disks at once, performance may suffer because you are now forcing a lot of seek operations.
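
For example (illustrative; assumes Linux with the sysstat package - device names will differ on your system):

  # Extended per-device stats every 2 seconds; watch the %util and await
  # columns - %util near 100 on the spool disks means they are saturated.
  iostat -kx 2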

There are also some buffer-related settings in Bacula that I've had to tweak in the past to get ideal performance. Check "Maximum Network Buffer Size" in your configs.
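
For example, in the SD Device resource (the 64K value here is purely illustrative - the default is 32K, and it generally needs to be set consistently on the FD side as well):

  Device {
    ...
    # Illustrative value only - tune and test for your environment.
    Maximum Network Buffer Size = 65536
  }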

Finally there are some OS-level settings for the 'st' driver (I'm assuming you are on Linux).  With my LTO3 drives I need to add this to the kernel command line:

st=buffer_kbs:256,max_buffers:32

LTO6 may need similar tweaks. Without the st driver's buffer size and buffer count increased, I had a hard time keeping an LTO3 drive running at full speed (80MB/sec).
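
If st is built as a module on your kernel (the usual case on most distros), the same settings can go in a modprobe config instead of the kernel command line - a sketch, assuming the stock st parameter names:

  # /etc/modprobe.d/st.conf -- equivalent of st=buffer_kbs:256,max_buffers:32
  options st buffer_kbs=256 max_buffers=32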

Bryn

On 2014-11-05 03:48 AM, Roberts, Ben wrote:
Hi all,

I'd like to try to make some speed improvements to my Bacula setup (5.2.13, Solaris 11). I have data (and attribute) spooling enabled, using a pool of 46x 1TB directly-attached SAS disks dedicated to this purpose. Data is being despooled to 2x directly-attached SAS LTO6 drives at around 100MB/s each. I think I should be able to get closer to the ~160MB/s maximum uncompressed throughput the drives and tape media support (ref: http://docs.oracle.com/cd/E38452_01/en/LTO6_Vol4_E1/LTO6_Vol4_E1.pdf).

I've just done a speed test and can read from the spool array at a sustained 300MB/s even while other jobs are running, so I'm sure there's no bottleneck at the disk layer. My suspicion is that the bottleneck is at the application layer, probably due to the way I have Bacula configured.

Having read through Bareos' tuning paper (http://www.bareos.org/en/Whitepapers/articles/Speed_Tuning_of_Tape_Drives.html), I've updated the maximum file size from 1GB to 50GB, which increased throughput from ~75MB/s to ~100MB/s. I believe I need to look at tuning the block size to gain the last bit of improvement.
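
A useful way to separate raw drive speed from job overhead is btape's 'speed' test - a sketch, run with the SD stopped since btape needs exclusive access to the drive (the config path is illustrative; adjust for your install):

  btape -c /opt/bacula/etc/bacula-sd.conf /dev/rmt/1mbn
  # then at btape's '*' prompt, run:
  speed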

Is it still the case in Bacula that changing the Maximum Block Size renders previously used/labelled tapes unreadable? I have almost 1,000 tapes already written, so making them unusable for restores without restarting the SD under a changed config would be less than ideal. I see Bareos is touting a feature that sets the block size at the pool level rather than the storage level, so the problem can be avoided by directing newer backups to a different pool while older backups stay readable. I haven't seen any reference to this in the Bacula manual; is it something that's already supported, or planned for a future version?
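
For reference, the directive in question lives in the same Device resource shown below - a sketch with a purely illustrative 512K value (Bacula's default block size is 64512 bytes; tapes written with one block size cannot be read back with another, which is exactly the compatibility concern above):

  Device {
    ...
    # Illustrative only - changing this affects every tape
    # subsequently written by this device.
    Maximum Block Size = 524288
  }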

For reference, this is one of the relevant drive definitions I'm using, just in case there's something else that would help which I might have missed:
Device {
  Name = drive-1-tapestore1
  Archive Device = /dev/rmt/1mbn
  Device Type = Tape
  Media Type = LTO6
  AutoChanger = yes
  Removable Media = yes
  Random Access = no
  Requires Mount = no
  Drive Index = 1
  Maximum Concurrent Jobs = 3
  Maximum Spool Size = 1024G
  Maximum File Size = 50G
  Autoselect = yes
}

Regards,
Ben Roberts


