Bacula-users

Re: [Bacula-users] Copy disk to tape is 4x slower than tar

2016-03-11 13:24:52
Subject: Re: [Bacula-users] Copy disk to tape is 4x slower than tar
From: Dan Langille <dan AT langille DOT org>
To: Kern Sibbald <kern AT sibbald DOT com>
Date: Fri, 11 Mar 2016 13:21:41 -0500

-- 
Dan Langille - BSDCan / PGCon




On Mar 10, 2016, at 11:09 PM, Kern Sibbald <kern AT sibbald DOT com> wrote:

Hello Dan,

Copying from disk to tape with Bacula's current algorithm is virtually guaranteed to be slower than using tar.  This is for several reasons:

1. Bacula is currently single-threaded: it reads a block from disk, then stops to write the output block to tape.

2. After reading the block from disk, Bacula checks the checksum (CPU intensive).

3. After the checksum is verified, Bacula reads each record from the block and then packs the records into a new block for writing.

4. Once the output block is full, Bacula computes a checksum for the block, writes it and waits for the I/O to be complete.

So, as you can see, it is a very expensive process, but it is the only way to guarantee that the whole prior backup is properly copied.
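To make the four steps above concrete, here is a minimal Python sketch of a single-threaded copy loop. It is not Bacula's code or on-disk format: the block layout (a 4-byte CRC32 followed by length-prefixed records) and the names pack_block, unpack_block and copy_blocks are assumptions made up for illustration. The point is only that every read is followed by verify, unpack, repack, checksum and a blocking write before the next read can start.

```python
import struct
import zlib


def pack_block(records):
    """Serialize records as a length-prefixed payload behind a CRC32 header (hypothetical layout)."""
    payload = b"".join(struct.pack(">I", len(r)) + r for r in records)
    return struct.pack(">I", zlib.crc32(payload)) + payload


def unpack_block(block):
    """Verify the CRC32 header, then split the payload back into records."""
    (stored,) = struct.unpack(">I", block[:4])
    payload = block[4:]
    if zlib.crc32(payload) != stored:
        raise ValueError("block checksum mismatch")
    records, pos = [], 0
    while pos < len(payload):
        (n,) = struct.unpack(">I", payload[pos:pos + 4])
        records.append(payload[pos + 4:pos + 4 + n])
        pos += 4 + n
    return records


def copy_blocks(read_blocks, write_block, max_payload):
    """Single-threaded copy: nothing is read while a write is in progress."""
    pending, size = [], 0
    for block in read_blocks:                    # step 1: read one input block
        for rec in unpack_block(block):          # steps 2-3: verify, then unpack
            need = 4 + len(rec)
            if pending and size + need > max_payload:
                write_block(pack_block(pending))  # step 4: checksum and write
                pending, size = [], 0
            pending.append(rec)
            size += need
    if pending:
        write_block(pack_block(pending))          # flush the final partial block
```

Passing a list's append method as write_block is enough to try it out; a real writer would block on tape I/O at that call, which is where the stall comes from.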

I have not tried this, but one thing that may help a lot is to turn on data spooling for the tape device. This will probably not speed up the process, but it should prevent tape shoe-shining (repeated starting and stopping).

At some later time, Bacula will have multiple threads -- one that reads and one that writes, and this could drastically improve performance.
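As a rough illustration of the reader/writer split described above (not Bacula's planned design, which this thread does not specify), a bounded queue lets one thread keep reading while another writes. The function name pipelined_copy and the depth parameter are hypothetical.

```python
import queue
import threading


def pipelined_copy(read_blocks, write_block, depth=4):
    """Overlap reads and writes: a reader thread fills a bounded queue while this thread writes."""
    q = queue.Queue(maxsize=depth)
    done = object()   # sentinel marking end of input
    errors = []

    def reader():
        try:
            for block in read_blocks:
                q.put(block)          # blocks when the queue is full
        except Exception as exc:      # hand read errors to the writer side
            errors.append(exc)
        finally:
            q.put(done)

    # daemon=True so a failed write does not leave the process hanging on the reader
    t = threading.Thread(target=reader, daemon=True)
    t.start()
    while True:
        block = q.get()
        if block is done:
            break
        write_block(block)
    t.join()
    if errors:
        raise errors[0]
```

The queue depth bounds memory use; the gain comes from the disk read of block N+1 proceeding while block N is still being written to tape.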

I tried data spooling.

Original job, no spooling: 12752.6 KB/s

First, I tried spooling to local HDD: 11131.7 KB/s

Then I tried spooling to local SSD: 11542.1 KB/s

I have updated the gist with the bconsole output: https://gist.github.com/dlangille/2341a6da8f9ee836270c


Best regards
Kern

On 03/11/2016 08:44 AM, Dan Langille wrote:
On Mar 9, 2016, at 6:52 PM, Heitor Faria <heitor AT bacula.com DOT br> wrote:

I have a copy to tape job which copies from disk to tape using Bacula 7.4.0 and PostgreSQL 9.4 on FreeBSD 10.2

Everything is within one SD

Full details at https://gist.github.com/dlangille/2341a6da8f9ee836270c

The job summary:

 Start time:             09-Mar-2016 19:41:54
 End time:               09-Mar-2016 20:51:02
 Elapsed time:           1 hour 9 mins 8 secs
 Priority:               410
 SD Files Written:       1
 SD Bytes Written:       52,897,660,928 (52.89 GB)
 Rate:                   12752.6 KB/s

If I tar the volumes directly to tape, it takes only 16 minutes.


$ time sudo tar -cf /dev/nsa1 IncrAuto-4525 IncrAuto-4320 IncrAuto-4324 IncrAuto-4321 \
IncrAuto-4055 IncrAuto-4322 IncrAuto-4319 IncrAuto-3969 IncrAuto-3972 \
IncrAuto-3973 IncrAuto-3971 IncrAuto-4058

real    15m47.508s
user    0m5.844s
sys     1m51.834s
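
For reference, the "4x" in the subject can be checked from the figures above. This small Python calculation assumes tar moved about the same number of bytes as the Bacula job reported (tar adds its own headers, so the tar rate is approximate) and that 1 KB = 1000 bytes.

```python
bytes_written = 52_897_660_928           # "SD Bytes Written" from the job summary
bacula_secs = 1 * 3600 + 9 * 60 + 8      # elapsed time: 1 hour 9 mins 8 secs
tar_secs = 15 * 60 + 47.508              # "real" time reported for tar

bacula_rate = bytes_written / bacula_secs / 1000   # KB/s
tar_rate = bytes_written / tar_secs / 1000         # KB/s, assuming the same byte count
slowdown = bacula_secs / tar_secs

print(f"Bacula: {bacula_rate:.1f} KB/s")
print(f"tar:    {tar_rate:.1f} KB/s (approximate)")
print(f"ratio:  {slowdown:.2f}x")
```

The Bacula figure reproduces the 12752.6 KB/s in the job summary, and the elapsed-time ratio comes out at a little over 4x.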

Spooling attributes is trivial:

09-Mar 20:51 crey-sd JobId 232945: Sending spooled attrs to the Director. Despooling 321 bytes ...
09-Mar 20:51 bacula-dir JobId 232944: Bacula bacula-dir 7.4.0 (16Jan16):

I am not sure where to look to figure this out.
Hello, Dan: maybe there is nothing to figure out. Packing volumes with tar directly to tape leaves you unable to restore a single file, or even a single job (with bootstrap), from an entire tape. I think it's a trade-off.

I have no desire to use tar.

I have a desire to effectively use an LTO tape drive.

--
Dan Langille - BSDCan / PGCon
dan AT langille DOT org








_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users

