Bacula-users

Re: [Bacula-users] slow copy job from disk to LTO-6 (speed issue)

2017-03-26 10:34:55
Subject: Re: [Bacula-users] slow copy job from disk to LTO-6 (speed issue)
From: Kern Sibbald <kern AT sibbald DOT com>
To: Caribe Schreiber <caribe AT auctionharmony DOT com>, bacula-users AT lists.sourceforge DOT net
Date: Sun, 26 Mar 2017 16:33:47 +0200

Hello,

As Laurent, I think, suggested, you should not be doing Bacula software compression if you are writing to an LTO-6 drive.

On a large, well-tuned Bacula Enterprise installation, we see backup speeds of 250 to 500 MB/sec. You will probably never get close to that with a single backup job -- it requires running concurrent jobs and having the right hardware, a good part of which Laurent outlined. However, to get speeds greater than, say, 150 MB/sec, you either need to be a performance and hardware expert or have professional advice. High throughput requires knowledge of RAID, networking, fibre channel, tape drive characteristics, big hardware, and Bacula tuning experience (what directives to use and what values to set -- e.g. do not use Bacula compression, encryption, ... if you want the LTO-6 running at full speed).
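In Bacula terms, software compression is a per-FileSet option (client-side encryption is configured separately in the FD's PKI directives), so a tape-friendly FileSet simply omits it and lets the LTO-6 drive compress in hardware. A hypothetical sketch, not anyone's actual config:

```
# Hypothetical FileSet for tape targets: no "compression = GZIP" line,
# so the LTO-6 drive's hardware compression does the work instead.
FileSet {
  Name = "TapeFriendlyFileSet"
  Include {
    Options {
      signature = MD5    # checksums are cheap; software compression is not
    }
    File = /home
  }
}
```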

Best regards,

Kern


On 03/26/2017 05:24 AM, Caribe Schreiber wrote:
*sigh*

Do you folks ever have one of those days where it seems like you just can't win?  I'm apparently having one of those days.

Hopefully the mail client doesn't flub the line endings this time.

Slow copy job output:

backup-sd JobId 64 Sending spooled attrs to the Director. Despooling 2,615 bytes ...

 Elapsed time=02:03:24, Transfer rate=7.346 M Bytes/second
 backup-dir JobId 64 No Files found to prune.
 End auto prune.
 Bacula backup-dir 7.4.4 (20Sep16):
  Build OS:               x86_64-pc-linux-gnu gentoo 
  JobId:                  64
  Job:                    BackupImage.2017-03-24_21.45.00_45
  Backup Level:           Full
  Client:                 "primary-fd" 7.4.4 (20Sep16) x86_64-pc-linux-gnu,gentoo,
  FileSet:                "BackupFileSet" 2017-03-24 21:45:00
  Pool:                   "file.full" (From Job resource)
  Catalog:                "MyCatalog" (From Client resource)
  Storage:                "file" (From Pool resource)
  Scheduled time:         24-Mar-2017 21:45:00
  Start time:             25-Mar-2017 00:12:38
  End time:               25-Mar-2017 02:16:03
  Elapsed time:           2 hours 3 mins 25 secs
  Priority:               13
  FD Files Written:       7
  SD Files Written:       7
  FD Bytes Written:       54,391,511,056 (54.39 GB)
  SD Bytes Written:       54,391,515,157 (54.39 GB)
  Rate:                   7345.2 KB/s
  Software Compression:   57.8% 2.4:1
  Snapshot/VSS:           no
  Encryption:             yes
  Accurate:               no
  Volume name(s):         full-Vol-0024|full-Vol-0029
  Volume Session Id:      49
  Volume Session Time:    1490104159
  Last Volume Bytes:      22,911,377,270 (22.91 GB)
  Non-fatal FD errors:    0
  SD Errors:              0
  FD termination status:  OK
  SD termination status:  OK
  Termination:            Backup OK
 Begin pruning Jobs older than 6 months .
 No Jobs found to prune.
 Begin pruning Files.
backup-dir Using Device "Drive-1" to write.
backup-sd Elapsed time=00:39:02, Transfer rate=23.22 M Bytes/second
backup-sd Alert: smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.4.39-gentoo] (local build)
 Alert: Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org
 Alert: TapeAlert: OK
 Alert: === START OF READ SMART DATA SECTION ===
 Alert:
 Alert: Error Counter logging not supported
 backup-sd Sending spooled attrs to the Director. Despooling 2,615 bytes ...
 Alert: Last n error events log page
 Alert:
 Alert:
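As a sanity check on the summary above (my own arithmetic, assuming Bacula reports 1 KB = 1000 bytes), the 7345.2 KB/s Rate is just bytes written divided by elapsed seconds:

```python
# Recompute Bacula's reported Rate from the job summary figures.
fd_bytes = 54_391_511_056            # FD Bytes Written
elapsed_s = 2 * 3600 + 3 * 60 + 25   # Elapsed time: 2 hours 3 mins 25 secs
rate_kb_s = fd_bytes / elapsed_s / 1000
print(f"Rate: {rate_kb_s:.1f} KB/s")  # matches the 7345.2 KB/s in the report
```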


dd read speed of one of the disk backup files in the job:

backup bacula # dd if=full-Vol-0029 of=/dev/null bs=4M
12799+1 records in
12799+1 records out
53687083817 bytes (54 GB, 50 GiB) copied, 111.637 s, 481 MB/s


Bonnie output:

backup ~ # bonnie++ -u root
Using uid:0, gid:0.
Writing a byte at a time...done
Writing intelligently...done
Rewriting...done
Reading a byte at a time...done
Reading intelligently...done
start 'em...done...done...done...done...done...
Create files in sequential order...done.
Stat files in sequential order...done.
Delete files in sequential order...done.
Create files in random order...done.
Stat files in random order...done.
Delete files in random order...done.
Version  1.97       ------Sequential Output------ --Sequential Input- --Random-
Concurrency   1     -Per Chr- --Block-- -Rewrite- -Per Chr- --Block-- --Seeks--
Machine        Size K/sec %CP K/sec %CP K/sec %CP K/sec %CP K/sec %CP  /sec %CP
   backup      126G  1211  97 601130  47 240240  17  5506  94 759982  22 698.1  10
Latency              9305us     710ms     646ms   15409us     276ms     519ms
Version  1.97       ------Sequential Create------ --------Random Create--------
   backup           -Create-- --Read--- -Delete-- -Create-- --Read--- -Delete--
              files  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP  /sec %CP
                 16 16535  15 +++++ +++ +++++ +++ +++++ +++ +++++ +++ +++++ +++
Latency               385us     360us     714us     383us     370us     345us
1.97,1.97,backup,1,1490480328,126G,,1211,97,601130,47,240240,17,5506,94,759982,22,698.1,10,16,,,,,16535,15,+++++,+++,+++++,+++,+++++,+++,+++++,+++,+++++,+++,9305us,710ms,646ms,15409us,276ms,519ms,385us,360us,714us,383us,370us,345us



SD config:

#
# A file changer for disk based backups
#
Autochanger {
  Name = FileChgr1
  Device = FileChgr1-Dev1
  Changer Command = ""
  Changer Device = /dev/null
}

Device {
  Name = FileChgr1-Dev1
  Media Type = file
  Archive Device = /home/backups/bacula
  LabelMedia = yes;                   # lets Bacula label unlabeled media
  Random Access = Yes;
  AutomaticMount = yes;               # when device opened, read it
  RemovableMedia = no;
  AlwaysOpen = no;
  Maximum Concurrent Jobs = 5
}



#
# An HP 1/8 autochanger device with one drive
#
Autochanger {
  Name = AutochangerHP
  Device = Drive-1
  Changer Command = "/usr/libexec/bacula/mtx-changer %c %o %S %a %d"
  Changer Device = /dev/sg3
}

Device {
  Name = Drive-1                      #
  Media Type = LTO-6
  Archive Device = /dev/nst0
  AutomaticMount = yes;               # when device opened, read it
  AlwaysOpen = yes;
  RemovableMedia = yes;
  RandomAccess = no;
  AutoChanger = yes
  Maximum File Size = 10GB
  Alert Command = "sh -c 'smartctl -H -l error %c'"
}
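One common mitigation for shoe-shining (a sketch only, not tested on this setup; the spool path and sizes are placeholders) is to spool each job to fast disk and then despool to tape in long sustained writes. Spooling is enabled per Job and configured on the tape Device, roughly:

```
# In the Job (or JobDefs) resource, bacula-dir.conf:
Job {
  Name = "BackupImage"
  Spool Data = yes                      # spool to disk, then stream to tape
  ...
}

# In the tape Device resource, bacula-sd.conf:
Device {
  Name = Drive-1
  ...
  Spool Directory = /var/spool/bacula   # placeholder: fast local disk
  Maximum Spool Size = 200GB            # placeholder: size to available disk
}
```

Larger Maximum Block Size values are also often suggested for LTO drives, but changing the block size affects the readability of volumes written with the old size, so it is worth reading up on before applying.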

I'm a little stumped about why the raw hardware read/write speeds are so different from the Bacula copy job speeds.

I'm hoping the collective wisdom of the list can give me some ideas on other places to look for speed issues or config changes so that I don't prematurely wear out this drive head with shoe-shining (of course...faster backups are always welcome too!).

Thanks in advance!

~Caribe

--
Caribe Schreiber


------------------------------------------------------------------------------
Check out the vibrant tech community on one of the world's most
engaging tech sites, Slashdot.org! http://sdm.link/slashdot


_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users
