Re: [Bacula-users] bacula-fd crashes on FreeBSD 9.2

2013-11-25 01:42:30
Subject: Re: [Bacula-users] bacula-fd crashes on FreeBSD 9.2
From: Kern Sibbald <kern AT sibbald DOT com>
To: David Newman <dnewman AT networktest DOT com>, bacula-users <bacula-users AT lists.sourceforge DOT net>
Date: Mon, 25 Nov 2013 07:37:58 +0100
Hello David,

I have been on vacation since 4 November (in Belgium) and will not
be back home until 1 December.  I am busy basically from 8:15am
until 10pm, with only a few pauses, so am not able to look at this.

If the problem is only in the FreeBSD start scripts (not in the ./bacula
script), then I suggest you submit a FreeBSD bug report.  However, I haven't
seen any other reports like yours, which means that it is probably a support
issue caused by something special in your environment.
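
One way to narrow down "something special in your environment" is to compare what each launch method hands to the daemon. The following is only a minimal sketch: it assumes the environment of each running bacula-fd has already been saved to a text file with one VAR=value pair per line (on FreeBSD, procstat -e <pid> may help with that, though its output would need splitting onto separate lines first). The function name and file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: list environment entries that differ between two dumps.
# Each input file is assumed to hold one VAR=value pair per line.
env_diff() {
    a_sorted=$(mktemp) || return 1
    b_sorted=$(mktemp) || return 1
    sort "$1" > "$a_sorted"
    sort "$2" > "$b_sorted"
    # comm -3 suppresses lines present in both files.
    # Lines found only in the second file are printed with a leading tab.
    comm -3 "$a_sorted" "$b_sorted"
    rm -f "$a_sorted" "$b_sorted"
}
```

Usage would be along the lines of "env_diff manual.env rcd.env", where the two files hold the environment of the hand-started daemon and the rc.d-started daemon respectively. An empty result would suggest the difference lies elsewhere, for example in the command-line flags.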

Best regards,
Kern

On 11/25/2013 02:36 AM, David Newman wrote:
> Apologies for top posting. Kern and Dan asked for more information on
> this issue awhile back, and I'd provided it (see below, or the list
> archives) in two messages on 10 November.
>
> I'm OK for now by running from the binary, but a backup will crash when
> the binary is called from the FreeBSD startup script.
>
> Thanks in advance for additional troubleshooting clues.
>
> Also, should I instead file a FreeBSD PR for this?
>
> dn
>
>
>
> On 11/10/13, 5:12 PM, David Newman wrote:
>> On 11/10/13 12:09 PM, Dan Langille wrote:
>>> On Nov 10, 2013, at 2:02 PM, David Newman <dnewman AT networktest DOT com> wrote:
>>>
>>>> On 11/9/13, 9:33 AM, Dan Langille wrote:
>>>>> On Nov 8, 2013, at 7:51 PM, David Newman <dnewman AT networktest DOT com> wrote:
>>>>>
>>>>>> On 11/7/13 6:17 AM, Dan Langille wrote:
>>>>>>> On 2013-11-06 21:05, David Newman wrote:
>>>>>>>> On 11/5/13 5:53 PM, Dan Langille wrote:
>>>>>>>> You are on 9.2-release.
>>>>>>>>
>>>>>>>> Have you run freebsd-update to get the latest security patches?
>>>>>>>>
>>>>>>>> Yes
>>>>>>>>
>>>>>>>>
>>>>>>>> Did you see the post by Dean E. Weimer today?
>>>>>>>>
>>>>>>>> The most recent post from Dean to this list (at least that I have) is
>>>>>>>> from 4 November at 2030 UTC, saying essentially that a complete rebuild
>>>>>>>> of the OS solved his problem.
>>>>>>>>
>>>>>>>> I'm hoping not to have to boil the ocean...
>>>>>>>>
>>>>>>>>
>>>>>>>> Second: read below.
>>>>>>>>
>>>>>>>> With your help (thanks), I got the debug version built and running.
>>>>>>>> Two things:
>>>>>>>>
>>>>>>>> 1. Just like before, the binary in /usr/local/sbin/bacula-fd runs fine
>>>>>>>> when launched on its own. By "runs" I mean the director successfully
>>>>>>>> completes a backup job.
>>>>>>>>
>>>>>>>> 2. Just like before, the binary in /usr/local/sbin/bacula-fd crashed
>>>>>>>> when called from the startup script in /usr/local/etc/rc.d/bacula-fd.
>>>>>>>> By "crashed" I mean the client machine's fd daemon dies during a
>>>>>>>> backup job.
>>>>>>> Please paste the output of ps auwx | grep bacula-fd for both of the
>>>>>>> above scenarios.  I expect to see something like this:
>>>>>>>
>>>>>>> # ps auwx | grep bacula-fd
>>>>>>> root     1364  0.0  0.4 10156  4192  ??  Is    1:47PM   0:15.71
>>>>>>> /usr/local/sbin/bacula-fd -u root -g wheel -v -c
>>>>>>> /usr/local/etc/bacula-fd.conf
>>>>>> Here's the binary by itself:
>>>>>>
>>>>>> root  27754   0.0  0.3  18696  5548 ??  Ss    4:01PM    0:00.00
>>>>>> /usr/local/sbin/bacula-fd
>>>>> That’s just bacula-fd raw, no parameters.  Hmmm.
>>>>>
>>>>>> and here it is called from /usr/local/etc/rc.d/bacula-fd (with debugging
>>>>>> on):
>>>>>>
>>>>>> root  28337   0.0  0.3  18696  5612 ??  Ss    4:15PM    0:00.00
>>>>>> /usr/local/sbin/bacula-fd -u root -g wheel -v -c
>>>>>> /usr/local/etc/bacula/bacula-fd.conf
>>>>>>
>>>>>> and once more, compiled with debug off:
>>>>>> root  37399   0.0  0.3  18696  5608 ??  Ss    4:38PM    0:00.00
>>>>>> /usr/local/sbin/bacula-fd -u root -g wheel -v -c
>>>>>> /usr/local/etc/bacula/bacula-fd.conf
>>>>> Those are identical.  Good.
>>>>>
>>>>>>
>>>>>>>> I've pasted below the crash output to STDERR. Thanks in advance for
>>>>>>>> more troubleshooting clues.
>>>>>>> For the below, I think you have to find btraceback and get that
>>>>>>> installed to /usr/local/sbin/btraceback
>>>>>> OK, I found something: This is a problem related to bsmtp. (Again, this
>>>>>> is on an i386 machine.)
>>>>>>
>>>>>> With debugging on and btraceback in place, the debugger complains it
>>>>>> can't find bsmtp. I copy the bsmtp directory from under the
>>>>>> bacula-client port into /usr/local/sbin and a backup produces a
>>>>>> complaint about permissions.
>>>>>>
>>>>>> I do 'chmod -R 777 /usr/local/bsmtp' and -- lo and behold, the backup
>>>>>> now runs with bacula-fd called from the startup script. There's no debug
>>>>>> output because it works.
>>>>>> Then I uninstall bacula-client and back out of all the debugging
>>>>>> changes, both in the bacula-client and bacula-server directories, run
>>>>>> 'make clean' in both directories, and reinstall bacula-client. I again
>>>>>> put the bsmtp directory into /usr/local/sbin and again chmod 777 it.
>>>>>>
>>>>>> Now, bacula-fd crashes same as before.
>>>>> Try the non-debug and debug versions like this, start them from the 
>>>>> command line.
>>>>>
>>>>> /usr/local/sbin/bacula-fd -f -u root -g wheel -c /usr/local/etc/bacula/bacula-fd.conf
>>>>>
>>>>> The -f ensures bacula-fd will stay in the foreground.  Try the backup?  
>>>>> Any messages?
>>>>>
>>>> Yes. I've pasted the output of both here:
>>>>
>>>> http://pastebin.com/iPEqYDUb
>>>>
>>>> This time, both debug and non-debug versions failed to complete a
>>>> backup. Not sure what changed from before.
>>> For starters, you’re running in the foreground… That may affect things.  I 
>>> do not know for sure.
>>>
>>>> This is the only instance I have of bacula-fd on i386 on FreeBSD. There
>>>> are other i386 machines running bacula-fd but they're on other OSs such
>>>> as Linux and OpenBSD. The fact that the backup runs OK from the raw
>>>> binary and doesn't run from the startup script suggests this may be a
>>>> FreeBSD-specific issue.
>>> I wonder if this might be a compiler optimization issue.  That is, the
>>> compiler is attempting to implement an optimization, but the optimization
>>> itself causes the problem in question.  I’m out of time today, or I’d
>>> look into that now.
>> The bacula-fd binary by itself works OK. The problem occurs when it's
>> called from the startup script.
>>
>> I don't see anything in particular in that script that would cause
>> different behavior in the binary, but in case it wasn't abundantly clear
>> by now, I'm not an expert on bacula internals.
>>
>> Thanks in advance for more troubleshooting clues.
>>
>> dn
>>
>>
>>>> dn
>>>>
>>>>> No?  Try with -d 9
>>>>>
>>>>> /usr/local/sbin/bacula-fd -f -d 9 -u root -g wheel -c /usr/local/etc/bacula/bacula-fd.conf
>>>>>
>>>>>> So the situation is that backups to this machine work if the binary runs
>>>>>> without being called from the startup script *or* if the binary is
>>>>>> compiled with debugging flags.
>>>>>>
>>>>>> How to remedy?
>>>>> I wish I knew what was different about this system from all your others
>>>>> which work.  Is this the only i386?
>>>>>
>>>>>>
>>>>>>> When you compiled, if you didn't do a make clean, it should be somewhere
>>>>>>> under the work directory in /usr/ports/sysutils/bacula-client
>>>>>>>
>>>>>>>> dn
>>>>>>>>
>>>>>>>>
>>>>>>>> root@o:/usr/ports/sysutils/bacula-client # /usr/local/etc/rc.d/bacula-fd start
>>>>>>>> Starting bacula_fd.
>>>>>>>> root@o:/usr/ports/sysutils/bacula-client # Bacula interrupted by signal
>>>>>>>> 0: UNKNOWN SIGNAL
>>>>>>>> Kaboom! bacula-fd, o-fd got signal 0 - UNKNOWN SIGNAL. Attempting
>>>>>>>> traceback.
>>>>>>>> Kaboom! exepath=/usr/local/sbin/
>>>>>>>> Calling: /usr/local/sbin/btraceback /usr/local/sbin/bacula-fd 21541
>>>>>>>> /var/db/bacula
>>>>>>>> execv: /usr/local/sbin/btraceback failed: ERR=No such file or directory
>>>>>>>> It looks like the traceback worked ...
>>>>>>>> Dumping: /var/db/bacula/o-fd.21541.bactrace
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 528 bytes at 28804618 from
>>>>>>>> bnet.c:774
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 272 bytes at 2885c2d8 from
>>>>>>>> jcr.c:358
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 148 bytes at 28832a98 from
>>>>>>>> bnet.c:767
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 4112 bytes at 28a69018 from
>>>>>>>> bnet.c:773
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 7 bytes at 2880dcd8 from
>>>>>>>> bnet.c:775
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 15 bytes at 28831508 from
>>>>>>>> bnet.c:776
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 8 bytes at 28831538 from
>>>>>>>> workq.c:162
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 16 bytes at 28831568 from
>>>>>>>> jcr.c:347
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 528 bytes at 29008218 from
>>>>>>>> jcr.c:360
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 272 bytes at 2905c158 from
>>>>>>>> jcr.c:362
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 536 bytes at 29008518 from
>>>>>>>> find.c:63
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 272 bytes at 2905c298 from
>>>>>>>> find.c:66
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 24 bytes at 29030198 from
>>>>>>>> job.c:248
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 272 bytes at 2905c3d8 from
>>>>>>>> job.c:249
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 21 bytes at 29066058 from
>>>>>>>> job.c:251
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 4 bytes at 29069038 from
>>>>>>>> tls.c:422
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 40 bytes at 29067078 from
>>>>>>>> job.c:1736
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 56 bytes at 2906a1d8 from
>>>>>>>> job.c:803
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 64 bytes at 2906a238 from
>>>>>>>> job.c:933
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 4 bytes at 29069098 from
>>>>>>>> alist.c:51
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 352 bytes at 29640318 from
>>>>>>>> job.c:968
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 4 bytes at 29069df8 from
>>>>>>>> alist.c:51
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 23 bytes at 29066088 from
>>>>>>>> dlist.c:356
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 17 bytes at 290660e8 from
>>>>>>>> dlist.c:356
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 30 bytes at 290301d8 from
>>>>>>>> dlist.c:356
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 23 bytes at 29066118 from
>>>>>>>> dlist.c:356
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 13 bytes at 29066148 from
>>>>>>>> dlist.c:356
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 18 bytes at 29066178 from
>>>>>>>> dlist.c:356
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 14 bytes at 290661a8 from
>>>>>>>> dlist.c:356
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 36 bytes at 29030258 from
>>>>>>>> dlist.c:356
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 64 bytes at 2906a298 from
>>>>>>>> job.c:916
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 4 bytes at 29069db8 from
>>>>>>>> alist.c:51
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 14 bytes at 290661d8 from
>>>>>>>> dlist.c:356
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 13 bytes at 29066208 from
>>>>>>>> dlist.c:356
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 18 bytes at 29066238 from
>>>>>>>> dlist.c:356
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 15 bytes at 29066268 from
>>>>>>>> dlist.c:356
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 148 bytes at 29004318 from
>>>>>>>> bsock.c:64
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 4112 bytes at 2978d018 from
>>>>>>>> bsock.c:73
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 528 bytes at 29009d18 from
>>>>>>>> bsock.c:74
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 15 bytes at 290662c8 from
>>>>>>>> bsock.c:159
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 24 bytes at 29030398 from
>>>>>>>> bsock.c:160
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 4 bytes at 29069d38 from
>>>>>>>> tls.c:422
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 77 bytes at 2906d1e8 from
>>>>>>>> job.c:572
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 32 bytes at 29030318 from
>>>>>>>> runscript.c:51
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 272 bytes at 2905d418 from
>>>>>>>> runscript.c:203
>>>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 24 bytes at 290303d8 from
>>>>>>>> bpipe.c:76
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Nov 4, 2013, at 5:44 PM, David Newman <dnewman AT networktest DOT com> wrote:
>>>>>>>>
>>>>>>>> On 10/29/13 12:42 PM, Dan Langille wrote:
>>>>>>>> On 2013-10-27 19:33, David Newman wrote:
>>>>>>>> On 10/27/13 11:31 AM, Dan Langille wrote:
>>>>>>>>
>>>>>>>> On Oct 22, 2013, at 3:00 PM, David Newman wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 10/19/13 11:40 PM, Kern Sibbald wrote:
>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> From what I can see -- first "signal 0", and second this
>>>>>>>> traceback, this looks a lot like a FreeBSD pthreads bug.
>>>>>>>>
>>>>>>>> First because there is no such thing, at least in userland,
>>>>>>>> as a signal number 0, which I saw in an earlier
>>>>>>>> email.  Second, as the traceback below
>>>>>>>> shows, Bacula is waiting on a pthread_cond_timedwait() and
>>>>>>>> while in the pthread_cond_timedwait, which is a "system"
>>>>>>>> subroutine, it emits a pthread_cond_signal(), probably no
>>>>>>>> problem, followed by a pthread_kill().  That seems odd to
>>>>>>>> me, but perhaps it is how FreeBSD does it, but the net
>>>>>>>> result is that it is killing Bacula.
>>>>>>>>
>>>>>>>> Obviously, this could be a Bacula bug, but it is not occurring
>>>>>>>> elsewhere, and it looks very suspicious to me.
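
For context on why "signal 0" is suspicious: in POSIX, signal number 0 is the null signal. Calling kill with signal 0 performs the existence and permission checks but delivers nothing, so a process should never actually receive it. A small sketch of that behaviour from sh, using the shell's own PID so that it does not depend on Bacula being installed; the function name is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: report whether a PID can be signalled, using the null signal.
# kill -0 sends nothing; it only checks that the target exists and is reachable.
# Note that "gone" is also printed when the process exists but belongs to another user.
is_alive() {
    if kill -0 "$1" 2>/dev/null; then
        echo "alive"
    else
        echo "gone"
    fi
}

is_alive $$    # probe the current shell
```

The same probe could be pointed at the PID stored in /var/run/bacula-fd.9102.pid (the pidfile named in the startup script) to confirm whether the daemon has died during a job.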
>>>>>>>>
>>>>>>>> You can get more information by compiling with
>>>>>>>> #define DEVELOPER 1
>>>>>>>> in <bacula>/src/version.h  and ensuring that the -g
>>>>>>>> option is on the compile and that the binaries are not
>>>>>>>> stripped (default for Bacula Makefiles, but not for the
>>>>>>>> FreeBSD ports system).
>>>>>>>>
>>>>>>>> Then if you get another traceback, it may be clearer what
>>>>>>>> is going on.  Since this is relatively serious, I would recommend
>>>>>>>> running Bacula under the debugger directly, see the manual on
>>>>>>>> the details of how, then when the debugger gets control after
>>>>>>>> the signal, manually do the "thread apply all bt" command.
>>>>>>>>
>>>>>>>> FreeBSD gurus, a little help?
>>>>>>>>
>>>>>>>> That's not me.
>>>>>>>>
>>>>>>>> I don't see version.h under the bacula-client port directory.
>>>>>>>>
>>>>>>>> try this:
>>>>>>>>
>>>>>>>> make clean
>>>>>>>> make patch
>>>>>>>> find . -name version.h
>>>>>>>> ./bacula-5.2.12/src/version.h
>>>>>>>>
>>>>>>>> OK, thanks. That works.
>>>>>>>>
>>>>>>>> Kern's email gave three steps. Sorry for the baby questions, but I
>>>>>>>> don't know how to do steps 2 or 3, either.
>>>>>>>>
>>>>>>>> On 10/19/13 11:40 PM, Kern Sibbald wrote:
>>>>>>>>
>>>>>>>> You can get more information by compiling with
>>>>>>>> #define DEVELOPER 1
>>>>>>>> in <bacula>/src/version.h
>>>>>>>>
>>>>>>>> That's step 1, which you've helped me find.
>>>>>>>>
>>>>>>>> and ensuring that the -g
>>>>>>>> option is on the compile
>>>>>>>>
>>>>>>>> That's step 2.
>>>>>>>>
>>>>>>>> I don't see a place for that option in the Makefile.
>>>>>>>>
>>>>>>>> I think that goes on:
>>>>>>>>
>>>>>>>> CPPFLAGS+=
>>>>>>>>
>>>>>>>> to become:
>>>>>>>>
>>>>>>>> -I/usr/include/readline -I${LOCALBASE}/include -g
>>>>>>>>
>>>>>>>> I think.  I have not tested that.
>>>>>>>>
>>>>>>>> Which file would this go into?
>>>>>>>>
>>>>>>>> After 'make patch', running 'grep -R LOCALBASE *' from the root of the
>>>>>>>> port returns nothing.
>>>>>>>>
>>>>>>>> Make that change in /usr/ports/sysutils/bacula-server/Makefile
>>>>>>>>
>>>>>>>> Yes, bacula-server, not a typo.  bacula-client is a slave port of
>>>>>>>> bacula-server.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> and that the binaries are not
>>>>>>>> stripped (default for Bacula Makefiles, but not for the
>>>>>>>> FreeBSD ports system).
>>>>>>>>
>>>>>>>> Looking in /usr/ports/Mk/bsd.port.mk, I think you want WITH_DEBUG which
>>>>>>>> I think you can add to the OPTIONS_DEFINE line.
>>>>>>>>
>>>>>>>> What's the procedure here?
>>>>>>>>
>>>>>>>> Is it (1) to uncomment WITH_DEBUG in /usr/ports/Mk/bsd.port.mk; and
>>>>>>>>
>>>>>>>> (2) to change the Makefile to OPTIONS_DEFINE= NLS OPENSSL PYTHON
>>>>>>>> WITH_DEBUG
>>>>>>>>
>>>>>>>> Make those changes to OPTIONS_DEFINE in
>>>>>>>> /usr/ports/sysutils/bacula-server/Makefile as well.
>>>>>>>>
>>>>>>>> I suggest deleting all bacula packages on this client.  Then make
>>>>>>>> clean, and make install in the bacula-client dir.
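
The deinstall, clean and reinstall sequence can be sketched as a script. This is an untested outline, not the procedure given in this thread: it assumes WITH_DEBUG can simply be passed on the make command line (my reading of bsd.port.mk is that it adds debugging flags and leaves the binaries unstripped) instead of being edited into a Makefile. The DRY_RUN and PORTDIR variables and both function names are hypothetical. By default the script only prints what it would run.

```shell
#!/bin/sh
# Sketch only: rebuild the bacula-client port with debugging enabled.
# Assumption: WITH_DEBUG=yes on the command line is enough; no Makefile edits.
DRY_RUN=${DRY_RUN:-1}      # 1 = print commands only, anything else = execute them
PORTDIR=${PORTDIR:-/usr/ports/sysutils/bacula-client}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run (in $PORTDIR): $*"
    else
        "$@"
    fi
}

rebuild_debug() {
    if [ "$DRY_RUN" != "1" ]; then
        cd "$PORTDIR" || return 1
    fi
    run make deinstall
    run make clean
    run make WITH_DEBUG=yes install
    # file(1) reports "not stripped" when the symbols survived the install
    run file /usr/local/sbin/bacula-fd
}

rebuild_debug
```

Setting DRY_RUN=0 in the environment would execute the commands for real, which needs root and an up-to-date ports tree.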
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> ??
>>>>>>>>
>>>>>>>> thanks
>>>>>>>>
>>>>>>>> dn
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> That's step 3. Sorry, don't know how to do that either.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Also, I do have bacula-fd running fine on other FreeBSD 9.2 systems.
>>>>>>>> The only delta AFAIK is that this is an i386 system and the others
>>>>>>>> are amd64.
>>>>>>>>
>>>>>>>> To review:
>>>>>>>>
>>>>>>>> 1. Backup jobs complete when manually starting bacula-fd.
>>>>>>>>
>>>>>>>> What command are you entering?
>>>>>>>>
>>>>>>>> /usr/local/sbin/bacula-fd
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2. Backup jobs do not complete when launching bacula-fd via the startup
>>>>>>>> script in /usr/local/etc/rc.d/bacula-fd.
>>>>>>>>
>>>>>>>> For example: usr/local/etc/rc.d/bacula-fd start ?
>>>>>>>>
>>>>>>>> Yes:
>>>>>>>>
>>>>>>>> /usr/local/etc/rc.d/bacula-fd start # note leading stroke
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>>
>>>>>>>> dn
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks in advance for further debugging clues.
>>>>>>>>
>>>>>>>> dn
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> If any of you are FreeBSD system gurus you might compare the
>>>>>>>> last known working version of the OS with 9.2, particularly the
>>>>>>>> pthreads routines.  Perhaps they are using a signal 0 internally,
>>>>>>>> and somehow that leaked back to Bacula.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Kern
>>>>>>>>
>>>>>>>> On 10/18/2013 01:29 AM, David Newman wrote:
>>>>>>>> On 10/17/13 5:33 AM, Martin Simmons wrote:
>>>>>>>> On Wed, 16 Oct 2013 12:13:26 -0700, David Newman said:
>>>>>>>> On 10/14/13 2:44 AM, Martin Simmons wrote:
>>>>>>>> On Sun, 13 Oct 2013 18:25:07 -0700, David Newman said:
>>>>>>>> On 10/9/13 4:41 PM, David Newman wrote:
>>>>>>>> FreeBSD 9.2-RELEASE, bacula-client-5.2.12_3 installed from ports
>>>>>>>>
>>>>>>>> Ever since upgrading this host to FreeBSD 9.2, bacula-fd crashes
>>>>>>>> as soon
>>>>>>>> as bacula-dir starts a backup job. The entry in /var/log/messages
>>>>>>>> is:
>>>>>>>>
>>>>>>>> Oct  9 16:25:50 o bacula-fd: Bacula interrupted by signal 0:
>>>>>>>> UNKNOWN SIGNAL
>>>>>>>>
>>>>>>>> Backups worked fine on this host running FreeBSD 9.1 and other hosts
>>>>>>>> upgraded to FreeBSD 9.2 run backups OK.
>>>>>>>>
>>>>>>>> I've done the uninstall/reinstall thing with the bacula-client
>>>>>>>> port, but
>>>>>>>> that made no difference.
>>>>>>>>
>>>>>>>> Thanks in advance for troubleshooting clues.
>>>>>>>>
>>>>>>>> dn
>>>>>>>> Is there a Wireshark decode for Bacula?
>>>>>>>>
>>>>>>>> I'm still stuck on this problem, and need more info on what's causing
>>>>>>>> that UNKNOWN SIGNAL error. Wireshark 1.8.6 just shows strings of
>>>>>>>> bytes
>>>>>>>> for the Bacula stuff.
>>>>>>>>
>>>>>>>> Thanks.
>>>>>>>>
>>>>>>>> dn
>>>>>>>> A wireshark decode won't help much here because problems like this
>>>>>>>> must be in
>>>>>>>> the fd itself.
>>>>>>>>
>>>>>>>> Try attaching gdb to the bacula-fd process and see if it catches the
>>>>>>>> mysterious signal (see
>>>>>>>> http://www.bacula.org/5.2.x-manuals/en/problems/problems/What_Do_When_Bacula.html#SECTION00640000000000000000).
>>>>>>>>
>>>>>>>>
>>>>>>>> No luck with this. Per that URL, I've put the btraceback.gdb file in
>>>>>>>> the
>>>>>>>> same directory as the bacula-fd executable on the client (in this case,
>>>>>>>> /usr/local/sbin) and made the .gdb file executable.
>>>>>>>>
>>>>>>>> At run time it produces this error:
>>>>>>>>
>>>>>>>> /usr/local/sbin/btraceback.gdb:1: Error in sourced command file:
>>>>>>>> No symbol table is loaded.  Use the "file" command.
>>>>>>>>
>>>>>>>> That's problem 1. Problem 2 is that the syntax given for capturing
>>>>>>>> STDERR and STDOUT -- 2>\&1 -- doesn't work on either csh (root's
>>>>>>>> default on FreeBSD) or bash.
>>>>>>>>
>>>>>>>> Any ideas on remedying either issue?
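
On the redirection problem: the backslash in 2>\&1 is presumably an escaping artifact from the manual's typesetting and not part of the shell syntax (that is an assumption). In sh and bash the usual form is "> file 2>&1". csh has no 2>&1 and uses ">& file" instead. A minimal sh sketch, with a stand-in function in place of the real daemon and a hypothetical log path:

```shell
#!/bin/sh
# Stand-in for any program that writes to both streams (hypothetical).
noisy() {
    echo "to stdout"
    echo "to stderr" >&2
}

LOG=${LOG:-/tmp/fd-debug.log}

# Order matters: point stdout at the file first, then duplicate stderr onto it.
noisy > "$LOG" 2>&1

# csh/tcsh equivalent (not run here):  some_command >& /tmp/fd-debug.log
cat "$LOG"
```

Writing "2>&1 > file" instead would send stderr to the terminal, because stderr is duplicated before stdout has been redirected.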
>>>>>>>> It looks like you missed the part after the # in the URL -- you don't
>>>>>>>> need the
>>>>>>>> btraceback.gdb file.
>>>>>>>>
>>>>>>>> The section I meant is called "Manually Running Bacula Under The
>>>>>>>> Debugger" on
>>>>>>>> that page (you'll have to adapt it for the bacula-fd).
>>>>>>>> Sorry for missing that.
>>>>>>>>
>>>>>>>> The backup runs fine under the debugger, including the backup job
>>>>>>>> beforehand, but not with the FreeBSD startup script in
>>>>>>>> /usr/local/etc/rc.d.
>>>>>>>>
>>>>>>>> I've pasted below the debugger output and the startup script.
>>>>>>>>
>>>>>>>> Thanks in advance for further troubleshooting clues.
>>>>>>>>
>>>>>>>> dn
>>>>>>>>
>>>>>>>>
>>>>>>>> ==========
>>>>>>>>
>>>>>>>> Successful run, via /usr/local/sbin/bacula-fd run via gdb:
>>>>>>>>
>>>>>>>> (gdb) thread apply all bt
>>>>>>>> Thread 5 (Thread 28c08b00 (LWP 100213/bacula-fd)):
>>>>>>>> #0  0x282302b3 in pthread_kill () from /lib/libthr.so.3
>>>>>>>> #1  0x2822f9b2 in pthread_kill () from /lib/libthr.so.3
>>>>>>>> #2  0x282328f9 in pthread_cond_signal () from /lib/libthr.so.3
>>>>>>>> #3  0x281f5d20 in bthread_cond_timedwait_p () from
>>>>>>>> /usr/local/lib/libbac.so.5
>>>>>>>> #4  0x281ef9b0 in watchdog_thread () from /usr/local/lib/libbac.so.5
>>>>>>>> #5  0x281f7167 in lmgr_thread_launcher () from
>>>>>>>> /usr/local/lib/libbac.so.5
>>>>>>>> #6  0x28227f3a in pthread_getprio () from /lib/libthr.so.3
>>>>>>>> #7  0x00000000 in ?? ()
>>>>>>>>
>>>>>>>> Thread 3 (Thread 28805e00 (LWP 100211/bacula-fd)):
>>>>>>>> #0  0x28624323 in nanosleep () from /lib/libc.so.7
>>>>>>>> #1  0x2822ad8b in nanosleep () from /lib/libthr.so.3
>>>>>>>> #2  0x281c1a90 in bmicrosleep () from /usr/local/lib/libbac.so.5
>>>>>>>> #3  0x281f7349 in check_deadlock () from /usr/local/lib/libbac.so.5
>>>>>>>> #4  0x28227f3a in pthread_getprio () from /lib/libthr.so.3
>>>>>>>> #5  0x00000000 in ?? ()
>>>>>>>>
>>>>>>>> Thread 2 (Thread 28804300 (LWP 100133/bacula-fd)):
>>>>>>>> #0  0x28646103 in select () from /lib/libc.so.7
>>>>>>>> #1  0x2822a960 in select () from /lib/libthr.so.3
>>>>>>>> #2  0x281c45a8 in bnet_thread_server () from /usr/local/lib/libbac.so.5
>>>>>>>> #3  0x0804f5c6 in main ()
>>>>>>>> #0  0x282302b3 in pthread_kill () from /lib/libthr.so.3
>>>>>>>>
>>>>>>>> ==========
>>>>>>>>
>>>>>>>> FreeBSD startup script:
>>>>>>>>
>>>>>>>> #!/bin/sh
>>>>>>>> #
>>>>>>>> # $FreeBSD: sysutils/bacula-server/files/bacula-fd.in 323275 2013-07-19 09:44:58Z rm $
>>>>>>>> #
>>>>>>>> # PROVIDE: bacula_fd
>>>>>>>> # REQUIRE: DAEMON
>>>>>>>> # KEYWORD: shutdown
>>>>>>>> #
>>>>>>>> # Add the following lines to /etc/rc.conf.local or /etc/rc.conf
>>>>>>>> # to enable this service:
>>>>>>>> #
>>>>>>>> # bacula_fd_enable  (bool):  Set to NO by default.
>>>>>>>> #               Set it to YES to enable bacula_fd.
>>>>>>>> # bacula_fd_flags (params):  Set params used to start bacula_fd.
>>>>>>>> #
>>>>>>>>
>>>>>>>> . /etc/rc.subr
>>>>>>>>
>>>>>>>> name="bacula_fd"
>>>>>>>> rcvar=${name}_enable
>>>>>>>> command=/usr/local/sbin/bacula-fd
>>>>>>>>
>>>>>>>> load_rc_config $name
>>>>>>>>
>>>>>>>> : ${bacula_fd_enable="NO"}
>>>>>>>> : ${bacula_fd_flags=" -u root -g wheel -v -c /usr/local/etc/bacula/bacula-fd.conf"}
>>>>>>>> : ${bacula_fd_pidfile="/var/run/bacula-fd.9102.pid"}
>>>>>>>>
>>>>>>>> pidfile="${bacula_fd_pidfile}"
>>>>>>>>
>>>>>>>> run_rc_command "$1"
>>>>>>>>
>>>>>>>> ==========
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>> _______________________________________________
>> Bacula-users mailing list
>> Bacula-users AT lists.sourceforge DOT net
>> https://lists.sourceforge.net/lists/listinfo/bacula-users
>>