Bacula-users

[Bacula-users] multiple spool files / concurrent spooling/despooling again

2011-08-15 13:58:22
Subject: [Bacula-users] multiple spool files / concurrent spooling/despooling again
From: mark.bergman AT uphs.upenn DOT edu
To: bacula-users AT lists.sourceforge DOT net
Date: Mon, 15 Aug 2011 13:55:02 -0400
While I'm not able to contribute patches, I'd like to voice my support
for the concept of having multiple spool files to enable concurrent
spooling & de-spooling that Ralph Gross brought up in 2007[1] and which
Jesper Krogh submitted as a feature request in 2009[2].

Problem synopsis:

        Data spooling is a necessity to get good throughput to modern,
        high-speed tape drives, but bacula pauses the spooling as soon as
        it starts writing to tape, which drastically reduces throughput.

Here are some numbers from our environment:

[A]     Throughput w/o spooling: ~22MB/s
                this represents the aggregate of the speed to read data from
                disk and write to tape, with shoe-shining, network congestion,
                disk contention, etc.

[B]     Throughput to spool file: ~55MB/s
                this represents the aggregate of the speed to read data from
                disk (a 9TB logical volume made up from multiple RAID5 and RAID6
                LUNs) and write to the RAID-10 spool partition. This includes
                any network congestion, disk contention, etc.
        
[C]     Throughput from disk spool file to LTO-4 tape: ~108MB/s
                This is the raw despooling-speed.

[D]     End-to-end throughput with spooling: ~27MB/s
                This is very disappointing. It is the overall throughput
                when phases [B] and [C] above run one after the other
                rather than overlapping. While eliminating shoe-shining
                is much better for the tape media and tape drive, the
                overall performance is almost identical to [A], when
                it should be close to [B]. The reason for the decrease
                in performance is that bacula stops all spooling as soon
                as it starts de-spooling.
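
To make the arithmetic behind [D] concrete, here is a rough back-of-the-envelope model (my own simplification, not something measured or taken from the bacula source). It assumes the only two costs are the spool rate [B] and the despool rate [C], and that job startup, tape positioning and so on are free. The function names are just for illustration.

```python
def serialized_throughput(spool_rate, despool_rate):
    """Effective rate (MB/s) when spooling pauses during despooling."""
    # Each megabyte costs 1/spool_rate seconds to spool plus
    # 1/despool_rate seconds to despool, and the two never overlap.
    return 1.0 / (1.0 / spool_rate + 1.0 / despool_rate)


def concurrent_throughput(spool_rate, despool_rate):
    """Idealized rate (MB/s) when both phases overlap fully."""
    # With perfect overlap and no contention, the slower stage is the limit.
    return min(spool_rate, despool_rate)


B = 55.0   # [B] throughput to spool file, MB/s
C = 108.0  # [C] throughput from spool file to tape, MB/s

print(f"serialized: {serialized_throughput(B, C):.1f} MB/s")  # 36.4 MB/s
print(f"concurrent: {concurrent_throughput(B, C):.1f} MB/s")  # 55.0 MB/s
```

Even this optimistic serialized figure (about 36MB/s) is well below [B], and the ~27MB/s we actually see is lower still, presumably because of overheads the model ignores. The concurrent figure is the best case that fully overlapped spooling and despooling could approach.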

In an ideal configuration, there could be multiple spool directories defined,
and bacula would open a new spool file in the next directory as soon as it
begins despooling. An example bacula-sd configuration might contain:

        Spool Directory=/raid0-A/spool
        Spool Directory=/raid0-B/spool
        Spool Directory=/raid0-C/spool
        Concurrent Spool=yes

where each "/raid0-*" mount point is a separate RAID-0 array, so as to
minimize contention.

The "Concurrent Spool" option would determine whether spooling follows
the existing behavior, or if multiple spool files (possibly in different
directories) are used concurrently for spooling and despooling.

If the user defines a single spool directory (as in the current
configuration), and does not define "Concurrent Spool = yes", the
existing behavior would occur.
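
For anyone who wants to picture the proposed behavior, below is a toy sketch of the hand-off between a spooling thread and a despooling thread that rotate through several spool directories. It is purely illustrative: it does no real disk or tape I/O, the function name run_backup is made up, and it does not reflect how bacula-sd is actually structured.

```python
import queue
import threading


def run_backup(spool_dirs, chunks):
    """Spool chunks into rotating directories while a second thread despools."""
    free = queue.Queue()   # spool directories available for spooling
    full = queue.Queue()   # (directory, data) pairs waiting to go to tape
    for d in spool_dirs:
        free.put(d)
    tape = []              # stand-in for the tape: records the write order

    def despooler():
        while True:
            item = full.get()
            if item is None:               # sentinel: no more spool files
                break
            directory, data = item
            tape.append((directory, data))  # "write to tape"
            free.put(directory)            # directory can be reused

    t = threading.Thread(target=despooler)
    t.start()
    for data in chunks:
        directory = free.get()        # blocks only if every directory is busy
        full.put((directory, data))   # spool file complete, hand it off
    full.put(None)
    t.join()
    return tape


dirs = ["/raid0-A/spool", "/raid0-B/spool", "/raid0-C/spool"]
for directory, data in run_backup(dirs, ["chunk1", "chunk2", "chunk3", "chunk4"]):
    print(directory, data)
```

The point of the two queues is that the spooler only has to wait when every spool directory is still full, instead of waiting every time despooling starts.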


[1] http://copilotco.com/mail-archives/bacula-devel.2007/msg02642.html
[2] http://www.bacula.org/git/cgit.cgi/bacula/plain/bacula/projects?h=Branch-5.1

Thanks,

Mark

