Re: [Bacula-users] bacula-5.2.6 slow full backup of localhost
2014-04-09 15:01:35
Thanks for the great suggestions. I
/think/ I have a better handle on what's going on after running
the 3 Bacula processes in the foreground with a high enough debug
level and piping each's stdout to a logfile. I noticed when the
bacula-dir logfile showed it was backing up
/var/log/bacula/bacula-sd.log (I might have the exact name wrong),
the backup (from the Director's perspective) never went to a new
file, but the fd component was busily writing ~64k chunks of data
according to its logfile. I've since added /var/log (and some
other /var subdirectories) to the Exclude list, and with the help
of the bacula-dir.dir log I should be able to figure out what file
it's hung on.
This sounds to me like the 'Accurate = yes' job option in the
Director config file...would that cause the hanging behavior I
described above? I'll try the strace -f option when it happens
again.
This doesn't really address what I perceive to be a "slow" backup
(rate of bytes backed up much less than max rate of data transfer
for the underlying storage subsystem), but if I can get a full
backup of the local system in a "reasonable" time, I can probably
finally bless Bacula as our new backup system and work on
performance later.
Thanks,
Anthony
On 04/08/2014 12:58 PM, Martin Simmons wrote:
Try using strace -f to examine all of the threads.
__Martin
On Tue, 08 Apr 2014 10:41:19 -0400, Anthony Jenkins said:
Well using dd(1) I get 103MB/s - this is while the Bacula backup of the
machine is ongoing:
root@venus:/home/ajenkins# dd if=/dev/zero of=/media/usb2/1GB_ZEROS
bs="$((1024*1024*1024))" count=1
1+0 records in
1+0 records out
1073741824 bytes (1.1 GB) copied, 10.4721 s, 103 MB/s
I don't think the bottleneck is the drive...
Anthony
On 04/08/2014 10:35 AM, Hans van Leeuwen wrote:
Hi Anthony,
What is the speed if you copy a big file with rsync --progress to the
same external USB drive?
Hans.
-----Original Message-----
*From*: Anthony Jenkins <ajenkins AT vtilt DOT com
<mailto:Anthony%20Jenkins%20%3cajenkins AT vtilt DOT com%3e>>
*To*: bacula-users AT lists.sourceforge DOT net
<mailto:bacula-users AT lists.sourceforge DOT net>
*Subject*: [Bacula-users] bacula-5.2.6 slow full backup of localhost
*Date*: Tue, 08 Apr 2014 09:33:38 -0400
I've skimmed the archive for hints, but most complain about remote
backups being slow...hopefully I just did something stupid in the config.
I'm getting 1,149,012 Bytes/sec on a full backup of several () GB of
data on the local Linux machine (dir, sd, fd on one host). This first
attempt at a backup has been running for about 30 hours and has written
~132 GB to a USB 3.0 external drive formatted with EXT4.
*status storage
Automatically selected Storage: File
Connecting to Storage daemon File at 127.0.0.1:9103
venus.vtilt.com-sd Version: 5.2.6 (21 February 2012)
x86_64-pc-linux-gnu ubuntu 13.10
Daemon started 07-Apr-14 00:44. Jobs: run=0, running=0.
Heap: heap=270,336 smbytes=172,237 max_bytes=237,811 bufs=117
max_bufs=123
Sizes: boffset_t=8 size_t=8 int32_t=4 int64_t=8 mode=0,0
Running Jobs:
Writing: Full Backup job BackupVenus-Full JobId=13 Volume="20140407"
pool="FullPool" device="FileStorage2" (/media/usb2)
spooling=0 despooling=0 despool_wait=0
Files=2,930,749 Bytes=132,395,015,432 Bytes/sec=1,149,012
FDReadSeqNo=25,659,941 in_msg=17673253 out_msg=5 fd=5
====
CPU load on the box is negligible (8-core Intel Core i7-4770K @ 350GHz,
16GB RAM).
root@venus:/home/ajenkins# cat /proc/meminfo
MemTotal: 16309728 kB
MemFree: 1115508 kB
Buffers: 2702536 kB
Cached: 4711008 kB
SwapCached: 9768 kB
Active: 3552560 kB
Inactive: 4157596 kB
...
root@venus:/home/ajenkins# vmstat
procs -----------memory---------- ---swap-- -----io---- -system--
----cpu----
r b swpd free buff cache si so bi bo in cs us
sy id wa
1 2 202440 1137260 2704460 4704228 0 0 49 88 0 1
1 1 97 1
root@venus:/home/ajenkins# mpstat
Linux 3.11.0-18-generic (venus.vtilt.com) 04/08/2014
_x86_64_ (8 CPU)
08:56:22 AM CPU %usr %nice %sys %iowait %irq %soft
%steal %guest %gnice %idle
08:56:22 AM all 1.25 0.00 0.82 1.21 0.00 0.00
0.00 0.00 0.00 96.71
root@venus:/home/ajenkins# ps auxww | grep '\(bacula\|mysql\|PID\)' |
grep -v grep
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
mysql 346 0.1 0.9 681340 150556 ? Ssl Mar31 12:42
/usr/sbin/mysqld
bacula 20022 0.3 0.0 211132 1636 ? Ssl Apr07 6:48
/usr/sbin/bacula-sd -c /etc/bacula/bacula-sd.conf -u bacula -g disk
root 20045 0.3 0.0 287516 15340 ? Ssl Apr07 6:31
/usr/sbin/bacula-fd -c /etc/bacula/bacula-fd.conf
bacula 20066 0.1 0.0 378976 2520 ? Ssl Apr07 2:42
/usr/sbin/bacula-dir -c /etc/bacula/bacula-dir.conf -u bacula -g bacula
The 'Ssl' says the bacula and mysql processes are asleep (at the time of
the ps(1) snapshot). I did an strace(1) of the 3 bacula processes; the
sd and fd processes were blocked in a select(2) call for a good while
waiting on descriptor 4, which are sockets connected to other bacula
processes it seems:
root@venus:/home/ajenkins# strace -p20022 -p20045 -p20066 -t 2>&1 |
tee bacula.strace
Process 20022 attached
Process 20045 attached
Process 20066 attached
[pid 20022] 09:15:06 select(4, [3], NULL, NULL, NULL <unfinished ...>
[pid 20066] 09:15:06 restart_syscall(<... resuming interrupted call
...> <unfinished ...>
[pid 20045] 09:15:06 select(4, [3], NULL, NULL, NULL <unfinished ...>
[pid 20066] 09:16:00 <... restart_syscall resumed> ) = 0
[pid 20066] 09:16:00 stat("/etc/localtime", {st_mode=S_IFREG|0644,
st_size=3519, ...}) = 0
[pid 20066] 09:16:00 stat("/etc/localtime", {st_mode=S_IFREG|0644,
st_size=3519, ...}) = 0
...
ajenkins@venus:~$ sudo ls -l /proc/20022/fd | grep '\<4\>'
lrwx------ 1 root root 64 Apr 8 07:39 4 -> socket:[9896449]
ajenkins@venus:~$ sudo ls -l /proc/20045/fd | grep '\<4\>'
lrwx------ 1 root root 64 Apr 8 07:39 4 -> socket:[9896450]
ajenkins@venus:~$ sudo lsof -p 20022 | grep 98964
bacula-sd 20022 bacula 3u IPv4 9896420 0t0 TCP
localhost:bacula-sd (LISTEN)
bacula-sd 20022 bacula 4u IPv4 9896449 0t0 TCP
localhost:bacula-sd->localhost:45075 (ESTABLISHED)
bacula-sd 20022 bacula 5u IPv4 9896451 0t0 TCP
localhost:bacula-sd->localhost:45077 (ESTABLISHED)
ajenkins@venus:~$ sudo lsof -p 20045 | grep 98964
bacula-fd 20045 root 3u IPv4 9896430 0t0 TCP
localhost:bacula-fd (LISTEN)
bacula-fd 20045 root 4u IPv4 9896450 0t0 TCP
localhost:bacula-fd->localhost:33953 (ESTABLISHED)
ajenkins@venus:~$ sudo lsof | grep '4507[57]'
bacula-sd 20022 bacula 4u IPv4
9896449 0t0 TCP localhost:bacula-sd->localhost:45075
(ESTABLISHED)
bacula-sd 20022 bacula 5u IPv4
9896451 0t0 TCP localhost:bacula-sd->localhost:45077
(ESTABLISHED)
bacula-sd 20022 20027 bacula 4u IPv4
9896449 0t0 TCP localhost:bacula-sd->localhost:45075
(ESTABLISHED)
bacula-sd 20022 20027 bacula 5u IPv4
9896451 0t0 TCP localhost:bacula-sd->localhost:45077
(ESTABLISHED)
bacula-sd 20022 20078 bacula 4u IPv4
9896449 0t0 TCP localhost:bacula-sd->localhost:45075
(ESTABLISHED)
bacula-sd 20022 20078 bacula 5u IPv4
9896451 0t0 TCP localhost:bacula-sd->localhost:45077
(ESTABLISHED)
bacula-fd 20045 root 5u IPv4
9899062 0t0 TCP localhost:45077->localhost:bacula-sd
(ESTABLISHED)
bacula-fd 20045 20047 root 5u IPv4
9899062 0t0 TCP localhost:45077->localhost:bacula-sd
(ESTABLISHED)
bacula-fd 20045 20080 root 5u IPv4
9899062 0t0 TCP localhost:45077->localhost:bacula-sd
(ESTABLISHED)
bacula-fd 20045 20082 root 5u IPv4
9899062 0t0 TCP localhost:45077->localhost:bacula-sd
(ESTABLISHED)
bacula-di 20066 bacula 9u IPv4
9895396 0t0 TCP localhost:45075->localhost:bacula-sd
(ESTABLISHED)
bacula-di 20066 20072 bacula 9u IPv4
9895396 0t0 TCP localhost:45075->localhost:bacula-sd
(ESTABLISHED)
bacula-di 20066 20073 bacula 9u IPv4
9895396 0t0 TCP localhost:45075->localhost:bacula-sd
(ESTABLISHED)
bacula-di 20066 20077 bacula 9u IPv4
9895396 0t0 TCP localhost:45075->localhost:bacula-sd
(ESTABLISHED)
bacula-di 20066 20079 bacula 9u IPv4
9895396 0t0 TCP localhost:45075->localhost:bacula-sd
(ESTABLISHED)
How do I increase the logging while the bacula daemons are running?
bconsole 'setdebug' command? Any advice on next debugging steps?
Thanks in advance,
--
Anthony Jenkins, Software Engineer
VTilt Digital, LLC.
------------------------------------------------------------------------------
Put Bad Developers to Shame
Dominate Development with Jenkins Continuous Integration
Continuously Automate Build, Test & Deployment
Start a new project now. Try Jenkins in the cloud.
http://p.sf.net/sfu/13600_Cloudbees
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users
------------------------------------------------------------------------------
Put Bad Developers to Shame
Dominate Development with Jenkins Continuous Integration
Continuously Automate Build, Test & Deployment
Start a new project now. Try Jenkins in the cloud.
http://p.sf.net/sfu/13600_Cloudbees
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users
--
Anthony Jenkins, Software Engineer
VTilt Digital, LLC.
|
------------------------------------------------------------------------------
Put Bad Developers to Shame
Dominate Development with Jenkins Continuous Integration
Continuously Automate Build, Test & Deployment
Start a new project now. Try Jenkins in the cloud.
http://p.sf.net/sfu/13600_Cloudbees _______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users
|
|
|