Bacula-users

Re: [Bacula-users] Copy disk to tape is 4x slower than tar

2016-03-10 10:15:52
Subject: Re: [Bacula-users] Copy disk to tape is 4x slower than tar
From: Simon Templar <stemplar AT opposedtwin DOT com>
To: Dan Langille <dan AT langille DOT org>
Date: Thu, 10 Mar 2016 10:10:21 -0500
Dan,

I’m new to bacula, and arguably not very smart, but I’ve been struggling with tape drive performance pretty much since the moment I got the configurations to a functional state so I’ll share my learnings thus far.

Can you hear your tape drive? If so, do you hear lots of stops and starts when bacula is running the job? When mine was writing I’d hear lots of stops and starts (shoe-shining, I believe it’s called) followed by long pauses of silence.

I have a single LTO6 drive, and so far I’ve made 2 changes to my setup that brought my average write speed from less than 60 MB/s to between 130-160 MB/s.

First, the long pauses with no activity were found to be related to spooling. I started by spooling to an idle set of 6 sata drives in a raidz, then later switched to an ssd, and with either the behavior was that the tape drive would be completely idle while the data was spooled, then the data was written to the drive, then when it all was written the tape drive was completely idle again while data was spooled. Needless to say this added a long time to the backups for our platform. Our data ultimately resides on another platform (a freenas) and the goal is to replicate it to our backup server, then write to tape for archival purposes. It turns out that I can write the 15tb directly from the nfs mount to tape faster than I can spool it from local disk to ssd then write to tape. I’m sure something’s wrong with my setup, but if you’re spooling it’s worth taking a look to see if you have long tape drive idle times while the data is spooled. I would have guessed that the spool directory should be kept as full as possible WHILE the spooled data is written to tape, but that’s not the behavior I’m seeing.

The second thing I did was add:
Maximum File Size = 20GB

to my bacula-sd.conf. I probably need to tune this further, and it might make restores slower (at least for individual files?) but it drastically cut down on my drive starts and stops. Together, the maximum file size and elimination of spooling more than doubled my average write speed in large backups.

-Simon

P.S. Is your tar write speed as fast as you’d expect for your LTO drive?


On Mar 9, 2016, at 6:14 PM, Dan Langille <dan AT langille DOT org> wrote:

I have a copy to tape job which copies from disk to tape using Bacula 7.4.0 and PostgreSQL 9.4 on FreeBSD 10.2

Everything is within one SD


The job summary:

 Start time:             09-Mar-2016 19:41:54
 End time:               09-Mar-2016 20:51:02
 Elapsed time:           1 hour 9 mins 8 secs
 Priority:               410
 SD Files Written:       1
 SD Bytes Written:       52,897,660,928 (52.89 GB)
 Rate:                   12752.6 KB/s

If I tar the volumes directly to tape, it takes only 16 minutes.


$ time sudo tar -cf /dev/nsa1 IncrAuto-4525 IncrAuto-4320 IncrAuto-4324 IncrAuto-4321 \
> IncrAuto-4055 IncrAuto-4322 IncrAuto-4319 IncrAuto-3969 IncrAuto-3972 \
> IncrAuto-3973 IncrAuto-3971 IncrAuto-4058

real    15m47.508s
user    0m5.844s
sys     1m51.834s

Spooling attributes is trivial:

09-Mar 20:51 crey-sd JobId 232945: Sending spooled attrs to the Director. Despooling 321 bytes ...
09-Mar 20:51 bacula-dir JobId 232944: Bacula bacula-dir 7.4.0 (16Jan16):

I am not sure where to look to figure this out.

Full job output:


09-Mar 19:41 bacula-dir JobId 232944: Warning: FileSet MD5 digest not found.
09-Mar 19:41 bacula-dir JobId 232944: The following 1 JobId was chosen to be copied: 232778
09-Mar 19:41 bacula-dir JobId 232944: Copying using JobId=232778 Job=BackupCatalog.2016-03-08_03.05.16_52
09-Mar 19:41 bacula-dir JobId 232944: Bootstrap records written to /usr/local/bacula/working/bacula-dir.restore.107.bsr
09-Mar 19:41 bacula-dir JobId 232944: Start Copying JobId 232944, Job=CopyToTape-Full-Just-One-tape-01b.2016-03-09_19.41.51_42
09-Mar 19:41 bacula-dir JobId 232944: Using Device "vDrive-0" to read.
09-Mar 19:41 bacula-dir JobId 232945: Using Device "LTO_0" to write.
09-Mar 19:41 crey-sd JobId 232944: Ready to read from volume "IncrAuto-4525" on file device "vDrive-0" (/usr/local/bacula/volumes).
09-Mar 19:41 crey-sd JobId 232944: Forward spacing Volume "IncrAuto-4525" to file:block 0:1207281748.
09-Mar 19:47 crey-sd JobId 232944: End of Volume at file 1 on device "vDrive-0" (/usr/local/bacula/volumes), Volume "IncrAuto-4525"
09-Mar 19:47 crey-sd JobId 232944: Ready to read from volume "IncrAuto-4320" on file device "vDrive-0" (/usr/local/bacula/volumes).
09-Mar 19:47 crey-sd JobId 232944: Forward spacing Volume "IncrAuto-4320" to file:block 0:3844390440.
09-Mar 19:49 crey-sd JobId 232944: End of Volume at file 1 on device "vDrive-0" (/usr/local/bacula/volumes), Volume "IncrAuto-4320"
09-Mar 19:49 crey-sd JobId 232944: Ready to read from volume "IncrAuto-4324" on file device "vDrive-0" (/usr/local/bacula/volumes).
09-Mar 19:49 crey-sd JobId 232944: Forward spacing Volume "IncrAuto-4324" to file:block 0:3867148676.
09-Mar 19:51 crey-sd JobId 232944: End of Volume at file 1 on device "vDrive-0" (/usr/local/bacula/volumes), Volume "IncrAuto-4324"
09-Mar 19:51 crey-sd JobId 232944: Ready to read from volume "IncrAuto-4321" on file device "vDrive-0" (/usr/local/bacula/volumes).
09-Mar 19:51 crey-sd JobId 232944: Forward spacing Volume "IncrAuto-4321" to file:block 0:214.
09-Mar 19:59 crey-sd JobId 232944: End of Volume at file 1 on device "vDrive-0" (/usr/local/bacula/volumes), Volume "IncrAuto-4321"
09-Mar 19:59 crey-sd JobId 232944: Ready to read from volume "IncrAuto-4055" on file device "vDrive-0" (/usr/local/bacula/volumes).
09-Mar 19:59 crey-sd JobId 232944: Forward spacing Volume "IncrAuto-4055" to file:block 0:214.
09-Mar 20:06 crey-sd JobId 232944: End of Volume at file 1 on device "vDrive-0" (/usr/local/bacula/volumes), Volume "IncrAuto-4055"
09-Mar 20:06 crey-sd JobId 232944: Ready to read from volume "IncrAuto-4322" on file device "vDrive-0" (/usr/local/bacula/volumes).
09-Mar 20:06 crey-sd JobId 232944: Forward spacing Volume "IncrAuto-4322" to file:block 0:214.
09-Mar 20:13 crey-sd JobId 232944: End of Volume at file 1 on device "vDrive-0" (/usr/local/bacula/volumes), Volume "IncrAuto-4322"
09-Mar 20:13 crey-sd JobId 232944: Ready to read from volume "IncrAuto-4319" on file device "vDrive-0" (/usr/local/bacula/volumes).
09-Mar 20:13 crey-sd JobId 232944: Forward spacing Volume "IncrAuto-4319" to file:block 0:214.
09-Mar 20:20 crey-sd JobId 232944: End of Volume at file 1 on device "vDrive-0" (/usr/local/bacula/volumes), Volume "IncrAuto-4319"
09-Mar 20:20 crey-sd JobId 232944: Ready to read from volume "IncrAuto-3969" on file device "vDrive-0" (/usr/local/bacula/volumes).
09-Mar 20:20 crey-sd JobId 232944: Forward spacing Volume "IncrAuto-3969" to file:block 0:218.
09-Mar 20:27 crey-sd JobId 232944: End of Volume at file 1 on device "vDrive-0" (/usr/local/bacula/volumes), Volume "IncrAuto-3969"
09-Mar 20:27 crey-sd JobId 232944: Ready to read from volume "IncrAuto-3972" on file device "vDrive-0" (/usr/local/bacula/volumes).
09-Mar 20:27 crey-sd JobId 232944: Forward spacing Volume "IncrAuto-3972" to file:block 0:218.
09-Mar 20:34 crey-sd JobId 232944: End of Volume at file 1 on device "vDrive-0" (/usr/local/bacula/volumes), Volume "IncrAuto-3972"
09-Mar 20:34 crey-sd JobId 232944: Ready to read from volume "IncrAuto-3973" on file device "vDrive-0" (/usr/local/bacula/volumes).
09-Mar 20:34 crey-sd JobId 232944: Forward spacing Volume "IncrAuto-3973" to file:block 0:218.
09-Mar 20:40 crey-sd JobId 232944: End of Volume at file 1 on device "vDrive-0" (/usr/local/bacula/volumes), Volume "IncrAuto-3973"
09-Mar 20:40 crey-sd JobId 232944: Ready to read from volume "IncrAuto-3971" on file device "vDrive-0" (/usr/local/bacula/volumes).
09-Mar 20:40 crey-sd JobId 232944: Forward spacing Volume "IncrAuto-3971" to file:block 0:218.
09-Mar 20:47 crey-sd JobId 232944: End of Volume at file 1 on device "vDrive-0" (/usr/local/bacula/volumes), Volume "IncrAuto-3971"
09-Mar 20:47 crey-sd JobId 232944: Ready to read from volume "IncrAuto-4058" on file device "vDrive-0" (/usr/local/bacula/volumes).
09-Mar 20:47 crey-sd JobId 232944: Forward spacing Volume "IncrAuto-4058" to file:block 0:214.
09-Mar 20:50 crey-sd JobId 232944: End of Volume at file 0 on device "vDrive-0" (/usr/local/bacula/volumes), Volume "IncrAuto-4058"
09-Mar 20:50 crey-sd JobId 232944: End of all volumes.
09-Mar 20:50 crey-sd JobId 232944: Elapsed time=01:09:04, Transfer rate=12.76 M Bytes/second
09-Mar 20:50 crey-sd JobId 232945: Elapsed time=01:09:04, Transfer rate=12.76 M Bytes/second
09-Mar 20:51 crey-sd JobId 232945: Sending spooled attrs to the Director. Despooling 321 bytes ...
09-Mar 20:51 bacula-dir JobId 232944: Bacula bacula-dir 7.4.0 (16Jan16):
  Build OS:               amd64-portbld-freebsd10.2 freebsd 10.2-RELEASE-p8
  Prev Backup JobId:      232778
  Prev Backup Job:        BackupCatalog.2016-03-08_03.05.16_52
  New Backup JobId:       232945
  Current JobId:          232944
  Current Job:            CopyToTape-Full-Just-One-tape-01b.2016-03-09_19.41.51_42
  Backup Level:           Full
  Client:                 crey-fd
  FileSet:                "EmptyCopyToTape" 2011-02-20 20:53:31
  Read Pool:              "FullFile" (From Job resource)
  Read Storage:           "CreyFile" (From Pool resource)
  Write Pool:             "FullsLTO4" (From Job resource)
  Write Storage:          "tape01" (From Job resource)
  Catalog:                "MyCatalog" (From Client resource)
  Start time:             09-Mar-2016 19:41:54
  End time:               09-Mar-2016 20:51:02
  Elapsed time:           1 hour 9 mins 8 secs
  Priority:               410
  SD Files Written:       1
  SD Bytes Written:       52,897,660,928 (52.89 GB)
  Rate:                   12752.6 KB/s
  Volume name(s):         000005L4
  Volume Session Id:      264
  Volume Session Time:    1457220166
  Last Volume Bytes:      386,429,912,064 (386.4 GB)
  SD Errors:              0
  SD termination status:  OK
  Termination:            Copying OK

09-Mar 20:51 bacula-dir JobId 232944: Begin pruning Jobs older than 3 years .
09-Mar 20:51 bacula-dir JobId 232944: No Jobs found to prune.
09-Mar 20:51 bacula-dir JobId 232944: Begin pruning Files.
09-Mar 20:51 bacula-dir JobId 232944: No Files found to prune.
09-Mar 20:51 bacula-dir JobId 232944: End auto prune.


-- 
Dan Langille - BSDCan / PGCon




------------------------------------------------------------------------------
Transform Data into Opportunity.
Accelerate data analysis in your applications with
Intel Data Analytics Acceleration Library.
Click to learn more.
http://pubads.g.doubleclick.net/gampad/clk?id=278785111&iu=/4140_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users

------------------------------------------------------------------------------
Transform Data into Opportunity.
Accelerate data analysis in your applications with
Intel Data Analytics Acceleration Library.
Click to learn more.
http://pubads.g.doubleclick.net/gampad/clk?id=278785111&iu=/4140
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users