[Bacula-users] SD Crash

2012-09-20 12:11:47
Subject: [Bacula-users] SD Crash
From: Brian Debelius <bdebelius AT intelesyscorp DOT com>
To: "bacula-users AT lists.sourceforge DOT net" <bacula-users AT lists.sourceforge DOT net>
Date: Thu, 20 Sep 2012 11:40:56 -0400
My SD started crashing while backing up a large CIFS share on a Linux
box to LTO-4 tape. The share is mounted locally on the backup server.
The crashes started recently under 5.2.10 and continue with 5.2.12.

Ubuntu 12.04 - Kernel 3.2.0-30-generic

** configure **
basedir="/opt/bacula-5.2.12"
workingdir="$basedir/working"

make distclean
CFLAGS="-g -O2" \
./configure \
--sbindir=$basedir/bin \
--sysconfdir=$basedir/etc \
--mandir=$basedir/bin \
--with-pid-dir=$workingdir \
--with-subsys-dir=$workingdir \
--with-working-dir=$workingdir \
--with-scriptdir=$basedir/bin \
--enable-smartalloc \
--enable-batch-insert \
--enable-largefile \
--disable-ipv6 \
--with-openssl \
--disable-conio \
--with-readline=/usr/include/readline/ \
--with-mysql
exit 0

** bacula-sd.conf **
Device {
Name = Tape
Drive Index = 0
Device Type = Tape
Archive Device = /dev/nst0
Automatic Mount = yes
Removable Media = yes
Random Access = no
Media Type = LTO4
Autochanger = no
Auto Select = yes
Always Open = yes
Maximum Block Size = 131072
Maximum File Size = 4G
Maximum Spool Size = 100G
Maximum Job Spool Size = 20G
Maximum Network Buffer Size = 65536
Spool Directory = /data/spool/
}

** Debug Output **

bacula-sd: block.c:361-358 Write to spool
bacula-sd: block.c:361-358 Write to spool
bacula-sd: block.c:361-358 Write to spool
Bacula interrupted by signal 11: Segmentation violation
Kaboom! bacula-sd, bacula-sd got signal 11 - Segmentation violation. Attempting traceback.
Kaboom! exepath=/opt/bacula-5.2.12/bin
bacula-sd: signal.c:197-358 Working=/dev/shm
bacula-sd: signal.c:198-358 btpath=/opt/bacula-5.2.12/bin/btraceback
bacula-sd: signal.c:199-358 exepath=/opt/bacula-5.2.12/bin/bacula-sd
Calling: /opt/bacula-5.2.12/bin/btraceback /opt/bacula-5.2.12/bin/bacula-sd 28984 /dev/shm
It looks like the traceback worked ...
Dumping: /dev/shm/bacula-sd.28984.bactrace

** Backtrace **

[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7f2b3ffff700 (LWP 28990)]
[New Thread 0x7f2b3f7fe700 (LWP 28987)]
[New Thread 0x7f2b44caf700 (LWP 28985)]
0x00007f2b4598e823 in select () from /lib/x86_64-linux-gnu/libc.so.6
$1 = '\000' <repeats 29 times>
$2 = 0x2249058 "bacula-sd"
$3 = 0x2249098 "/opt/bacula-5.2.12/bin/bacula-sd"
$4 = 0x0
$5 = 0x7f2b461c39ae "5.2.12 (12 September 2012)"
$6 = 0x7f2b461c3988 "x86_64-unknown-linux-gnu"
$7 = 0x7f2b461c3981 "ubuntu"
$8 = 0x7f2b461c39a8 "12.04"
$9 = "bacula", '\000' <repeats 43 times>
$10 = 0x7f2b461c39a1 "ubuntu 12.04"
$11 = 0
Environment variable "TestName" not defined.
#0 0x00007f2b4598e823 in select () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007f2b46192159 in bnet_thread_server (addr_list=<optimized out>, max_clients=41, client_wq=0x6557e0, handle_client_request=0x420dc0 <handle_connection_request(void*)>) at bnet_server.c:177
#2 0x00000000004075db in main (argc=<optimized out>, argv=<optimized out>) at stored.c:285

Thread 4 (Thread 0x7f2b44caf700 (LWP 28985)):
#0 0x00007f2b45f6f52d in nanosleep () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f2b4618f021 in bmicrosleep (sec=30, usec=0) at bsys.c:106
#2 0x00007f2b461bd745 in check_deadlock () at lockmgr.c:574
#3 0x00007f2b45f67e9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#4 0x00007f2b459954bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#5 0x0000000000000000 in ?? ()

Thread 3 (Thread 0x7f2b3f7fe700 (LWP 28987)):
#0 0x00007f2b45f6c0fe in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f2b461bdd30 in bthread_cond_timedwait_p (cond=0x7f2b463d3580, m=0x7f2b463d3540, abstime=0x7f2b3f7fddb0, file=0x7f2b461c6932 "watchdog.c", line=321) at lockmgr.c:824
#2 0x00007f2b461b762c in watchdog_thread (arg=<optimized out>) at watchdog.c:321
#3 0x00007f2b461bd6a2 in lmgr_thread_launcher (x=0x224e2f8) at lockmgr.c:939
#4 0x00007f2b45f67e9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#5 0x00007f2b459954bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#6 0x0000000000000000 in ?? ()

Thread 2 (Thread 0x7f2b3ffff700 (LWP 28990)):
#0 0x00007f2b45f6f88d in waitpid () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007f2b461aec7f in signal_handler (sig=11) at signal.c:229
#2 <signal handler called>
#3 0x00007f2b459ea581 in ?? () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x00000000004315d5 in write_data_to_block (rec=0x7f2b3fffe7d0, block=0x7f2b38005ef0) at record.c:373
#5 write_record_to_block (dcr=<optimized out>, rec=0x7f2b3fffe7d0) at record.c:491
#6 0x0000000000431617 in DCR::write_record (this=0x7f2b38005928, rec=0x7f2b3fffe7d0) at record.c:412
#7 0x000000000041015e in do_append_data (jcr=0x7f2b38001078) at append.c:216
#8 0x00000000004240fb in append_data_cmd (jcr=0x7f2b38001078) at fd_cmds.c:203
#9 0x0000000000424298 in do_fd_commands (jcr=0x7f2b38001078) at fd_cmds.c:162
#10 0x00000000004244a0 in run_job (jcr=0x7f2b38001078) at fd_cmds.c:122
#11 0x0000000000424e43 in run_cmd (jcr=0x7f2b38001078) at job.c:214
#12 0x000000000042116f in handle_connection_request (arg=0x224db08) at dircmd.c:235
#13 0x00007f2b461b7e25 in workq_server (arg=0x6557e0) at workq.c:344
#14 0x00007f2b461bd6a2 in lmgr_thread_launcher (x=0x224e1b8) at lockmgr.c:939
#15 0x00007f2b45f67e9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#16 0x00007f2b459954bd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#17 0x0000000000000000 in ?? ()

Thread 1 (Thread 0x7f2b469f6740 (LWP 28984)):
#0 0x00007f2b4598e823 in select () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007f2b46192159 in bnet_thread_server (addr_list=<optimized out>, max_clients=41, client_wq=0x6557e0, handle_client_request=0x420dc0 <handle_connection_request(void*)>) at bnet_server.c:177
#2 0x00000000004075db in main (argc=<optimized out>, argv=<optimized out>) at stored.c:285
#0 0x00007f2b4598e823 in select () from /lib/x86_64-linux-gnu/libc.so.6
No symbol table info available.
#1 0x00007f2b46192159 in bnet_thread_server (addr_list=<optimized out>, max_clients=41, client_wq=0x6557e0, handle_client_request=0x420dc0 <handle_connection_request(void*)>) at bnet_server.c:177
177 if ((stat = select(maxfd + 1, &sockset, NULL, NULL, NULL)) < 0) {
maxfd = 3
sockset = {fds_bits = {8, 0 <repeats 15 times>}}
clilen = 16
turnon = 1
buf = "10.2.0.10\000\000\000\000\000\000\000X8e", '\000' <repeats 13 times>, "\005", '\000' <repeats 23 times>, "t\367~F+\177\000\000\005", '\000' <repeats 15 times>"\300, %\030F+\177\000\000u", '\000' <repeats 23 times>, "\001", '\000' <repeats 14 times>
allbuf = "host[ipv4:0.0.0.0:9103] 
\000\262~F+\177\000\000\210\236\237F+\177\000\000=\000\000\000\377\177\000\000\240\032\263j\377\177\000\000\377\377\377\377\000\000\000\000\260\335\027F+\177\000\000H\346\027F+\177\000\000p\032\263j\377\177\000\000\000\000\000\000\000\000\000\000@\000\030F+\177\000\000\000\000\240F+\177\000\000\000\000\000\000\000\000\000\000t\252~F+\177\000\000\000\000\240F+\177\000\000\006\000\000\000\000\000\000\000\f\000\000\000\000\000\000\000t\252~F+\177\000\000\345ޓ\034\000\000\000\000\006\000\000\000\000\000\000\000\f\000\000\000\000\000\000\000蛟F+\177\000\000\250\275\360\273\000\000\000\000\236\262~F+\177\000\000\034T\212E+\177\000\000(\000\000\000+\177\000\000P\033\263j\377\177\000\000\006\000\000\000\000\000\000\000"...
stat = <optimized out>
tlog = <optimized out>
ipaddr = <optimized out>
fd_ptr = 0x0
sockfds = {<SMARTALLOC> = {<No data fields>}, head = 0x7fff6ab31790, tail = 0x7fff6ab31790, loffset = 0, num_items = 1}
newsockfd = <optimized out>
cli_addr = {sa_family = 2, sa_data = "\223l\n\002\000\n\000\000\000\000\000\000\000"}
next = <optimized out>
#2 0x00000000004075db in main (argc=<optimized out>, argv=<optimized out>) at stored.c:285
285 &dird_workq, handle_connection_request);
test_config = false
ch = <optimized out>
no_signals = <optimized out>
thid = 139823734060800
uid = 0x0
gid = 0x0
#0 0x0000000000000000 in ?? ()
No symbol table info available.
#0 0x0000000000000000 in ?? ()
No symbol table info available.
#0 0x0000000000000000 in ?? ()
No symbol table info available.
#0 0x0000000000000000 in ?? ()
No symbol table info available.
#0 0x0000000000000000 in ?? ()
No symbol table info available.
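
For anyone triaging a dump like the one above, here is a minimal sketch of
pulling out just the frames that were running when the signal arrived. It is
not part of Bacula; the function name crashing_frames is made up for
illustration, and it assumes the usual gdb layout of "Thread N (...)" headers
followed by "#n" frames, with the crashing thread marked by
"<signal handler called>". Long frames are sometimes wrapped by mail clients,
so continuation lines are glued back onto the previous frame.

```python
import re

FRAME_RE = re.compile(r"^#\d+\s")          # a gdb frame line, e.g. "#4 0x... in f ()"
THREAD_RE = re.compile(r"^Thread (\d+) \(")  # a per-thread header line


def crashing_frames(trace_text):
    """Return (thread_id, frames) for the thread that took the signal.

    The frames returned are those listed after '<signal handler called>',
    i.e. the code that was executing when the signal was delivered.
    Returns (None, []) if no such marker is found.
    """
    threads = {}    # thread id -> list of frame strings
    current = None  # id of the thread block being read, if any
    for raw in trace_text.splitlines():
        line = raw.strip()
        header = THREAD_RE.match(line)
        if header:
            current = int(header.group(1))
            threads[current] = []
            continue
        if current is None or not line:
            continue  # skip text before the first thread header and blanks
        if FRAME_RE.match(line):
            threads[current].append(line)
        elif threads[current]:
            # Wrapped continuation of the previous frame: join it back on.
            threads[current][-1] += " " + line

    for tid, frames in threads.items():
        for i, frame in enumerate(frames):
            if "<signal handler called>" in frame:
                return tid, frames[i + 1:]
    return None, []
```

Used on the file named in the debug output, this would be something like
crashing_frames(open("/dev/shm/bacula-sd.28984.bactrace").read()). For the
trace in this message, the frames it returns start with the libc frame at #3,
followed by write_data_to_block at record.c:373.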
