Subject: [Bacula-users] slow copy job from disk to LTO-6 (speed issue)
From: Caribe Schreiber <caribe AT auctionharmony DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Sat, 25 Mar 2017 19:23:07 -0500
Hi all,

I'm looking for some advice on how to diagnose a speed issue with a disk-to-tape copy job from a relatively fast RAID array to an LTO-6 drive over SAS.

My local copy jobs seem to be limited to ~23 MB/s, which is well below the 54-160 MB/s range specified for adaptive streaming on this drive, and that leads me to believe the drive is shoe-shining.

I've checked the write speed to the tape and the read speed from the disk array, and both seem fine, as shown in the btape, bonnie++, and dd output below. What else should I be looking at? Is this typical of a transfer between devices within a single SD instance?
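For what it's worth, the next test I was planning is an end-to-end one outside of Bacula entirely: streaming one of the existing volume files from the array straight to the tape device. Untested on my end so far, and it assumes a scratch tape is loaded, since it will overwrite whatever is on it:

# Untested sketch: read a volume file from the RAID array and stream it
# straight to the tape drive, bypassing Bacula, to see whether the raw
# disk-to-tape path can sustain LTO-6 streaming speeds.
# WARNING: assumes a scratch tape in /dev/nst0 -- this overwrites it.
mt -f /dev/nst0 rewind
dd if=/home/backups/bacula/full-Vol-0029 of=/dev/nst0 bs=1M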

I understand that compression and encryption in the FD are not multithreaded, so a local backup could be CPU-limited by those two processes. The initial backup of the job below ran at about 7.3 MB/s, but it was encrypted and got fairly decent compression, so I'm perfectly willing to pay that speed penalty on the backup side of the equation. My understanding is that a copy job should not be affected by encryption or compression, since it just copies the existing data as-is from one SD device to another (basically a spool-type process), but I'm still seeing very slow transfer rates on copy jobs.
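One idea I've been turning over is forcing data spooling on the copy job, so the despool phase writes to tape in large sequential bursts instead of trickling blocks out as they arrive. Roughly this, though it's untested on my end and the job name is just a placeholder for my real copy job:

# Untested idea for the copy job in bacula-dir.conf:
Job {
  Name = "CopyToTape"    # placeholder; my actual copy job name differs
  Type = Copy
  Spool Data = yes       # spool locally, then despool to tape sequentially
  # (other directives unchanged)
}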

Here's the output for a moderate-size copy job going from local disk (FileChgr1-Dev1) to the SAS-attached LTO-6 tape (Drive-1), showing the Drive-1 speed as 23.22 MB/s:

backup-sd JobId 64 Elapsed time=02:03:24, Transfer rate=7.346 M Bytes/second
backup-sd JobId 64 Sending spooled attrs to the Director. Despooling 2,615 bytes ...
backup-dir JobId 64 Bacula backup-dir 7.4.4 (20Sep16):
  Build OS:               x86_64-pc-linux-gnu gentoo
  JobId:                  64
  Job:                    BackupImage.2017-03-24_21.45.00_45
  Backup Level:           Full
  Client:                 "primary-fd" 7.4.4 (20Sep16) x86_64-pc-linux-gnu,gentoo,
  FileSet:                "BackupFileSet" 2017-03-24 21:45:00
  Pool:                   "file.full" (From Job resource)
  Catalog:                "MyCatalog" (From Client resource)
  Storage:                "file" (From Pool resource)
  Scheduled time:         24-Mar-2017 21:45:00
  Start time:             25-Mar-2017 00:12:38
  End time:               25-Mar-2017 02:16:03
  Elapsed time:           2 hours 3 mins 25 secs
  Priority:               13
  FD Files Written:       7
  SD Files Written:       7
  FD Bytes Written:       54,391,511,056 (54.39 GB)
  SD Bytes Written:       54,391,515,157 (54.39 GB)
  Rate:                   7345.2 KB/s
  Software Compression:   57.8% 2.4:1
  Snapshot/VSS:           no
  Encryption:             yes
  Accurate:               no
  Volume name(s):         full-Vol-0024|full-Vol-0029
  Volume Session Id:      49
  Volume Session Time:    1490104159
  Last Volume Bytes:      22,911,377,270 (22.91 GB)
  Non-fatal FD errors:    0
  SD Errors:              0
  FD termination status:  OK
  SD termination status:  OK
  Termination:            Backup OK
backup-dir JobId 64 Begin pruning Jobs older than 6 months.
backup-dir JobId 64 No Jobs found to prune.
backup-dir JobId 64 Begin pruning Files.
backup-dir JobId 64 No Files found to prune. End auto prune.
backup-dir Using Device "Drive-1" to write.
backup-sd Elapsed time=00:39:02, Transfer rate=23.22 M Bytes/second
backup-sd Sending spooled attrs to the Director. Despooling 2,615 bytes ...
backup-sd Alert: smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.4.39-gentoo] (local build)
Alert: Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org
Alert: === START OF READ SMART DATA SECTION ===
Alert: TapeAlert: OK
Alert: Error Counter logging not supported
Alert: Last n error events log page
I'm able to get good write speeds to the LTO-6 drive with btape, as shown here:

Wrote block=100000, file,blk=1,99999 VolBytes=6,451,135,488 rate=157.3 MB/s
Wrote block=105000, file,blk=1,104999 VolBytes=6,773,695,488 rate=157.5 MB/s
Wrote block=110000, file,blk=1,109999 VolBytes=7,096,255,488 rate=157.6 MB/s
Wrote block=115000, file,blk=1,114999 VolBytes=7,418,815,488 rate=161.2 MB/s
Wrote block=120000, file,blk=1,119999 VolBytes=7,741,375,488 rate=161.2 MB/s
Wrote block=125000, file,blk=1,124999 VolBytes=8,063,935,488 rate=161.2 MB/s
Wrote block=130000, file,blk=1,129999 VolBytes=8,386,495,488 rate=161.2 MB/s

I'm also able to get good read speeds off the drive array, as shown by bonnie++ here:

backup ~ # bonnie++ -u root
Using uid:0, gid:0.
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
backup      126G  1211  97 601130  47 240240  17  5506  94 759982  22 698.1  10
Latency              9305us     710ms     646ms   15409us     276ms     519ms
Version  1.97       ------Sequential Create------ --------Random Create--------
backup           -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 16535  15 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
Latency               385us     360us     714us     383us     370us     345us
1.97,1.97,backup,1,1490480328,126G,,1211,97,601130,47,240240,17,5506,94,759982,22,698.1,10,16,,,,,16535,15,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,9305us,710ms,646ms,15409us,276ms,519ms,385us,360us,714us,383us,370us,345us

And it's pretty fast using dd to read a Bacula volume file from the job as well:

backup bacula # dd if=full-Vol-0029 of=/dev/null bs=4M
12799+1 records in
12799+1 records out
53687083817 bytes (54 GB, 50 GiB) copied, 111.637 s, 481 MB/s

Here's my tape drive and local file storage config in bacula-sd.conf:

#
# A file changer for disk based backups
#
Autochanger {
  Name = FileChgr1
  Device = FileChgr1-Dev1
  Changer Command = ""
  Changer Device = /dev/null
}
Device {
  Name = FileChgr1-Dev1
  Media Type = file
  Archive Device = /home/backups/bacula
  LabelMedia = yes;                   # lets Bacula label unlabeled media
  Random Access = Yes;
  AutomaticMount = yes;               # when device opened, read it
  RemovableMedia = no;
  AlwaysOpen = no;
  Maximum Concurrent Jobs = 5
}

#
# An HP 1/8 G2 Library device with one LTO-6 drive
#
Autochanger {
  Name = AutochangerHP
  Device = Drive-1
  Changer Command = "/usr/libexec/bacula/mtx-changer %c %o %S %a %d"
  Changer Device = /dev/sg3
}
Device {
  Name = Drive-1
  Media Type = LTO-6
  Archive Device = /dev/nst0
  AutomaticMount = yes;               # when device opened, read it
  AlwaysOpen = yes;
  RemovableMedia = yes;
  RandomAccess = no;
  AutoChanger = yes
  Maximum File Size = 10GB
  Alert Command = "sh -c 'smartctl -H -l error %c'"
}
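Along the same lines as the spooling idea above, I've been wondering whether larger tape blocks and a spool area on the Drive-1 device would help keep the drive streaming. Roughly this, untested, and I realize a changed block size affects reading tapes written with the old size (the spool path is just a placeholder):

# Untested additions to the Drive-1 Device resource in bacula-sd.conf:
Device {
  Name = Drive-1
  # ... existing directives as above ...
  Maximum Block Size = 1048576            # 1 MB blocks vs. the ~64 KB default
  Spool Directory = /home/backups/spool   # placeholder path on the RAID array
  Maximum Spool Size = 50G
}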
I'm a little stumped about why the raw hardware read/write speeds are so different from the Bacula copy job speeds. I'm hoping the collective wisdom of the list can give me some ideas on other places to look for speed issues, or config changes, so that I don't prematurely wear out this drive's head with shoe-shining (of course, faster backups are always welcome too!). Thanks in advance!

~Caribe