Re: [Bacula-users] LTO speed optimisation
2014-11-05 09:40:23
Hi Ben,
To start with, I would watch the server using iostat and top to
make sure you aren't running out of CPU and that you really aren't
maxing out your disks during your backup. In particular, pay
attention to '%util'. I like to run 'iostat -kx 2', which
shows the stats every 2 seconds. How useful that is depends on
your disk configuration: if you have a RAID card presenting a
single LUN to the server, the numbers may be somewhat misleading,
but they should still give an indication of what is going on.
I see you are allowing 3 concurrent jobs on your tape drives. If
all 3 of these are spooling from the same set of disks at once,
performance may suffer, since the disks then have to do a lot of
seek operations.
There are also some Bacula buffer settings that I've
had to tweak in the past to get ideal performance. Check "Maximum
Network Buffer Size" in your configs.
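As a rough sketch of where that directive lives: the Bacula manual lists it for the Device resource in bacula-sd.conf and for the FileDaemon resource in bacula-fd.conf. The 65536 value and the File daemon name below are illustrative assumptions, not tuned recommendations, so check the manual for your version before copying anything.

```
# bacula-sd.conf (Storage daemon); other Device directives omitted
Device {
  Name = drive-1-tapestore1
  # Size in bytes; illustrative value only
  Maximum Network Buffer Size = 65536
}

# bacula-fd.conf (File daemon); other directives omitted
FileDaemon {
  # Hypothetical daemon name
  Name = example-fd
  # Size in bytes; illustrative value only
  Maximum Network Buffer Size = 65536
}
```

Both daemons need a restart to pick up the change.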
Finally there are some OS-level settings for the 'st' driver (I'm
assuming you are on Linux). With my LTO3 drives I need to add
this to the kernel command line:
st=buffer_kbs:256,max_buffers:32
LTO6 may need similar tweaks. Until I increased the st driver's
buffer size and buffer count, I had a hard time keeping an LTO3
drive running at full speed (80MB/sec).
Bryn
On 2014-11-05 03:48 AM, Roberts, Ben wrote:
Hi all,
I'd like to try and make some speed improvements to my Bacula
setup (5.2.13, Solaris11). I have data (and attribute) spooling
enabled using a pool of 46x 1TB directly-attached SAS disks
dedicated to this purpose. Data is being despooled to 2x
directly-attached SAS LTO6 drives at around 100MB/sec each. I
think I should be able to get closer to the ~160MB/s maximum
uncompressed throughput the drives and tape media support (ref: http://docs.oracle.com/cd/E38452_01/en/LTO6_Vol4_E1/LTO6_Vol4_E1.pdf).
I've just done a speed test and can read from the spool array at a
sustained 300MB/sec even while other jobs are running, so I'm sure
there's no bottleneck at the disk layer. My suspicion is that the
bottleneck is at the application layer, probably due to the way I
have Bacula configured.
Having read through Bareos' tuning paper
(http://www.bareos.org/en/Whitepapers/articles/Speed_Tuning_of_Tape_Drives.html),
I've updated the max file size from 1GB to 50GB, which increased the
throughput from ~75 to ~100MB/sec. I believe I need to look at
tuning the block size to gain the last bit of improvement.
Is it still the case in Bacula that changing the Maximum Block
Size renders previously used/labelled tapes unreadable?
I'm up to almost 1,000 tape media already written, so making these
unusable for restores without restarting the SD to change configs
would be less than ideal. I see Bareos is touting a feature to
make changes to block size at the pool level rather than the
storage level and so this problem can be avoided by moving newer
backups to a different pool while still keeping older backups
readable. I haven't seen any reference to this in the Bacula
manual; is it something that's already supported or in the plans
for a future version?
For reference, this is one of the relevant drive definitions
I'm using, just in case there's something else that would help
which I might have missed:
Device {
  Name = drive-1-tapestore1
  Archive Device = /dev/rmt/1mbn
  Device Type = Tape
  Media Type = LTO6
  AutoChanger = yes
  Removable media = yes
  Random access = no
  Requires Mount = no
  Drive Index = 1
  Maximum Concurrent Jobs = 3
  Maximum Spool Size = 1024G
  Maximum File Size = 50G
  Autoselect = yes
}
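For illustration, the block-size change asked about above would amount to one extra directive in a Device resource like this one. The 262144 value is an assumption picked for the example, not a tested or recommended setting, and whether existing tapes stay readable after such a change is exactly the open question raised above.

```
Device {
  Name = drive-1-tapestore1
  Archive Device = /dev/rmt/1mbn
  Media Type = LTO6
  # ... other directives as in the definition above ...
  # Size in bytes; illustrative assumption, not a tested setting
  Maximum Block Size = 262144
}
```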
Regards,
Ben Roberts