Re: [Bacula-users] bacula-fd crashes on FreeBSD 9.2

2013-11-25 13:46:39
Subject: Re: [Bacula-users] bacula-fd crashes on FreeBSD 9.2
From: Martin Simmons <martin AT lispworks DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Mon, 25 Nov 2013 18:43:02 GMT
Looks like a recently discovered bug in FreeBSD:

http://thread.gmane.org/gmane.os.freebsd.devel.hackers/51832

__Martin


>>>>> On Sun, 24 Nov 2013 17:36:50 -0800, David Newman said:
> 
> Apologies for top posting. Kern and Dan asked for more information on
> this issue a while back, and I'd provided it (see below, or the list
> archives) in two messages on 10 November.
> 
> I'm OK for now by running from the binary, but a backup will crash when
> the binary is called from the FreeBSD startup script.
> 
> Thanks in advance for additional troubleshooting clues.
> 
> Also, should I instead file a FreeBSD PR for this?
> 
> dn
> 
> 
> 
> On 11/10/13, 5:12 PM, David Newman wrote:
> > On 11/10/13 12:09 PM, Dan Langille wrote:
> >>
> >> On Nov 10, 2013, at 2:02 PM, David Newman <dnewman AT networktest DOT com> 
> >> wrote:
> >>
> >>> On 11/9/13, 9:33 AM, Dan Langille wrote:
> >>>> On Nov 8, 2013, at 7:51 PM, David Newman <dnewman AT networktest DOT 
> >>>> com> wrote:
> >>>>
> >>>>> On 11/7/13 6:17 AM, Dan Langille wrote:
> >>>>>> On 2013-11-06 21:05, David Newman wrote:
> >>>>>>> On 11/5/13 5:53 PM, Dan Langille wrote:
> >>>>>>> You are on 9.2-release.
> >>>>>>>
> >>>>>>> Have you run freebsd-update to get the latest security patches?
> >>>>>>>
> >>>>>>> Yes
> >>>>>>>
> >>>>>>>
> >>>>>>> Did you see the post by Dean E. Weimer today?
> >>>>>>>
> >>>>>>> The most recent post from Dean to this list (at least that I have) is
> >>>>>>> from 4 November at 2030 UTC, saying essentially that a complete 
> >>>>>>> rebuild
> >>>>>>> of the OS solved his problem.
> >>>>>>>
> >>>>>>> I'm hoping not to have to boil the ocean...
> >>>>>>>
> >>>>>>>
> >>>>>>> Second: read below.
> >>>>>>>
> >>>>>>> With your help (thanks), I got the debug version built and running.
> >>>>>>> Two things:
> >>>>>>>
> >>>>>>> 1. Just like before, the binary in /usr/local/sbin/bacula-fd runs fine
> >>>>>>> when launched on its own. By "runs" I mean the director successfully
> >>>>>>> completes a backup job.
> >>>>>>>
> >>>>>>> 2. Just like before, the binary in /usr/local/sbin/bacula-fd crashed
> >>>>>>> when called from the startup script in /usr/local/etc/rc.d/bacula-fd. 
> >>>>>>> By
> >>>>>>> "crashed" I mean the client machine's fd daemon dies during a backup 
> >>>>>>> job.
> >>>>>>
> >>>>>> Please paste the output of ps auwx | grep bacula-fd for both of the
> >>>>>> above scenarios.  I expect to see something like this:
> >>>>>>
> >>>>>> # ps auwx | grep bacula-fd
> >>>>>> root     1364  0.0  0.4 10156  4192  ??  Is    1:47PM   0:15.71
> >>>>>> /usr/local/sbin/bacula-fd -u root -g wheel -v -c
> >>>>>> /usr/local/etc/bacula-fd.conf
> >>>>>
> >>>>> Here's the binary by itself:
> >>>>>
> >>>>> root  27754   0.0  0.3  18696  5548 ??  Ss    4:01PM    0:00.00
> >>>>> /usr/local/sbin/bacula-fd
> >>>>
> >>>> That’s just bacula-fd raw, no parameters.  Hmmm.
> >>>>
> >>>>>
> >>>>> and here it is called from /usr/local/etc/rc.d/bacula-fd (with debugging
> >>>>> on):
> >>>>>
> >>>>> root  28337   0.0  0.3  18696  5612 ??  Ss    4:15PM    0:00.00
> >>>>> /usr/local/sbin/bacula-fd -u root -g wheel -v -c
> >>>>> /usr/local/etc/bacula/bacula-fd.conf
> >>>>>
> >>>>> and once more, compiled with debug off:
> >>>>> root  37399   0.0  0.3  18696  5608 ??  Ss    4:38PM    0:00.00
> >>>>> /usr/local/sbin/bacula-fd -u root -g wheel -v -c
> >>>>> /usr/local/etc/bacula/bacula-fd.conf
> >>>>
> >>>> Those are identical.  Good.
> >>>>
> >>>>>
> >>>>>
> >>>>>>>
> >>>>>>> I've pasted below the crash output to STDERR. Thanks in advance for 
> >>>>>>> more
> >>>>>>> troubleshooting clues.
> >>>>>>
> >>>>>> For the below, I think you have to find btraceback and get that
> >>>>>> installed to /usr/local/sbin/btraceback
> >>>>>
> >>>>> OK, I found something: This is a problem related to bsmtp. (Again, this
> >>>>> is on an i386 machine.)
> >>>>>
> >>>>> With debugging on and btraceback in place, the debugger complains it
> >>>>> can't find bsmtp. I copy the bsmtp directory from under the
> >>>>> bacula-client port into /usr/local/sbin and a backup produces a
> >>>>> complaint about permissions.
> >>>>>
> >>>>> I do 'chmod -R 777 /usr/local/bsmtp' and -- lo and behold, the backup
> >>>>> now runs with bacula-fd called from the startup script. There's no debug
> >>>>> output because it works.
> >>>>>
> >>>>> Then I uninstall bacula-client and back out of all the debugging
> >>>>> changes, both in the bacula-client and bacula-server directories, run
> >>>>> 'make clean' in both directories, and reinstall bacula-client. I again
> >>>>> put the bsmtp directory into /usr/local/sbin and again chmod 777 it.
> >>>>>
> >>>>> Now, bacula-fd crashes same as before.
> >>>>
> >>>> Try the non-debug and debug versions like this, start them from the 
> >>>> command line.
> >>>>
> >>>> /usr/local/sbin/bacula-fd -f -u root -g wheel -c 
> >>>> /usr/local/etc/bacula/bacula-fd.conf
> >>>>
> >>>> The -f ensures bacula-fd will stay in the foreground.  Try the backup?  
> >>>> Any messages?
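
When running in the foreground like this, the shell's exit status gives a first hint as to whether the daemon exited on its own or was killed by a signal. A minimal sketch, assuming a POSIX sh such as FreeBSD's /bin/sh, where a status above 128 conventionally means 128 plus the signal number. run_and_report is a made-up helper name, not part of Bacula:

```shell
#!/bin/sh
# run_and_report is a hypothetical helper, not part of Bacula.
# It runs the given command in the foreground and prints how it ended.
run_and_report() {
    status=0
    "$@" || status=$?
    if [ "$status" -gt 128 ]; then
        # sh and bash report death-by-signal as 128 + the signal number.
        echo "killed by signal $((status - 128))"
    else
        echo "exited with status $status"
    fi
}

# Example with the command from this thread (uncomment on the client):
# run_and_report /usr/local/sbin/bacula-fd -f -u root -g wheel \
#     -c /usr/local/etc/bacula/bacula-fd.conf
```

Since bacula-fd installs its own handler (the "Kaboom!" output), it may well exit with an ordinary status instead, so treat this only as a first check.
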
> >>>>
> >>>
> >>> Yes. I've pasted the output of both here:
> >>>
> >>> http://pastebin.com/iPEqYDUb
> >>>
> >>> This time, both debug and non-debug versions failed to complete a
> >>> backup. Not sure what changed from before.
> >>
> >> For starters, you’re running in the foreground… That may affect things.  I 
> >> do not know for sure.
> >>
> >>> This is the only instance I have of bacula-fd on i386 on FreeBSD. There
> >>> are other i386 machines running bacula-fd but they're on other OSs such
> >>> as Linux and OpenBSD. The fact that the backup runs OK from the raw
> >>> binary and doesn't run from the startup script suggests this may be a
> >>> FreeBSD-specific issue.
> >>
> >> I wonder if this might be a compiler optimization issue.  That is, the 
> >> compiler is attempting
> >> to implement an optimization but optimization causes the problem in 
> >> question.  I’m out of time today, or I’d look into that now.
> > 
> > The bacula-fd binary by itself works OK. The problem occurs when it's
> > called from the startup script.
> > 
> > I don't see anything in particular in that script that would cause
> > different behavior in the binary, but in case it wasn't abundantly clear
> > by now, I'm not an expert on bacula internals.
> > 
> > Thanks in advance for more troubleshooting clues.
> > 
> > dn
> > 
> > 
> >>>
> >>> dn
> >>>
> >>>> No?  Try with -d 9
> >>>>
> >>>> /usr/local/sbin/bacula-fd -f -d 9 -u root -g wheel -c 
> >>>> /usr/local/etc/bacula/bacula-fd.conf
> >>>>
> >>>>>
> >>>>> So the situation is that backups to this machine work if the binary runs
> >>>>> without being called from the startup script *or* if the binary is
> >>>>> compiled with debugging flags.
> >>>>>
> >>>>> How to remedy?
> >>>>
> >>>> I wish I knew what was different about this system from all your others
> >>>> which work.  Is this the only i386?
> >>>>
> >>>>>
> >>>>>
> >>>>>> When you compiled, if you didn't do a make clean, it should be 
> >>>>>> somewhere
> >>>>>> under the work directory in /usr/ports/sysutils/bacula-client
> >>>>>>
> >>>>>>>
> >>>>>>> dn
> >>>>>>>
> >>>>>>>
> >>>>>>> root@o:/usr/ports/sysutils/bacula-client # /usr/local/etc/rc.d/bacula-fd start
> >>>>>>> Starting bacula_fd.
> >>>>>>> root@o:/usr/ports/sysutils/bacula-client # Bacula interrupted by signal 0: UNKNOWN SIGNAL
> >>>>>>> Kaboom! bacula-fd, o-fd got signal 0 - UNKNOWN SIGNAL. Attempting traceback.
> >>>>>>> Kaboom! exepath=/usr/local/sbin/
> >>>>>>> Calling: /usr/local/sbin/btraceback /usr/local/sbin/bacula-fd 21541 /var/db/bacula
> >>>>>>> execv: /usr/local/sbin/btraceback failed: ERR=No such file or directory
> >>>>>>> It looks like the traceback worked ...
> >>>>>>> Dumping: /var/db/bacula/o-fd.21541.bactrace
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 528 bytes at 28804618 from
> >>>>>>> bnet.c:774
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 272 bytes at 2885c2d8 from
> >>>>>>> jcr.c:358
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 148 bytes at 28832a98 from
> >>>>>>> bnet.c:767
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 4112 bytes at 28a69018 from
> >>>>>>> bnet.c:773
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 7 bytes at 2880dcd8 from
> >>>>>>> bnet.c:775
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 15 bytes at 28831508 from
> >>>>>>> bnet.c:776
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 8 bytes at 28831538 from
> >>>>>>> workq.c:162
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 16 bytes at 28831568 from
> >>>>>>> jcr.c:347
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 528 bytes at 29008218 from
> >>>>>>> jcr.c:360
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 272 bytes at 2905c158 from
> >>>>>>> jcr.c:362
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 536 bytes at 29008518 from
> >>>>>>> find.c:63
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 272 bytes at 2905c298 from
> >>>>>>> find.c:66
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 24 bytes at 29030198 from
> >>>>>>> job.c:248
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 272 bytes at 2905c3d8 from
> >>>>>>> job.c:249
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 21 bytes at 29066058 from
> >>>>>>> job.c:251
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 4 bytes at 29069038 from
> >>>>>>> tls.c:422
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 40 bytes at 29067078 from
> >>>>>>> job.c:1736
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 56 bytes at 2906a1d8 from
> >>>>>>> job.c:803
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 64 bytes at 2906a238 from
> >>>>>>> job.c:933
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 4 bytes at 29069098 from
> >>>>>>> alist.c:51
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 352 bytes at 29640318 from
> >>>>>>> job.c:968
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 4 bytes at 29069df8 from
> >>>>>>> alist.c:51
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 23 bytes at 29066088 from
> >>>>>>> dlist.c:356
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 17 bytes at 290660e8 from
> >>>>>>> dlist.c:356
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 30 bytes at 290301d8 from
> >>>>>>> dlist.c:356
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 23 bytes at 29066118 from
> >>>>>>> dlist.c:356
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 13 bytes at 29066148 from
> >>>>>>> dlist.c:356
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 18 bytes at 29066178 from
> >>>>>>> dlist.c:356
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 14 bytes at 290661a8 from
> >>>>>>> dlist.c:356
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 36 bytes at 29030258 from
> >>>>>>> dlist.c:356
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 64 bytes at 2906a298 from
> >>>>>>> job.c:916
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 4 bytes at 29069db8 from
> >>>>>>> alist.c:51
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 14 bytes at 290661d8 from
> >>>>>>> dlist.c:356
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 13 bytes at 29066208 from
> >>>>>>> dlist.c:356
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 18 bytes at 29066238 from
> >>>>>>> dlist.c:356
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 15 bytes at 29066268 from
> >>>>>>> dlist.c:356
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 148 bytes at 29004318 from
> >>>>>>> bsock.c:64
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 4112 bytes at 2978d018 from
> >>>>>>> bsock.c:73
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 528 bytes at 29009d18 from
> >>>>>>> bsock.c:74
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 15 bytes at 290662c8 from
> >>>>>>> bsock.c:159
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 24 bytes at 29030398 from
> >>>>>>> bsock.c:160
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 4 bytes at 29069d38 from
> >>>>>>> tls.c:422
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 77 bytes at 2906d1e8 from
> >>>>>>> job.c:572
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 32 bytes at 29030318 from
> >>>>>>> runscript.c:51
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 272 bytes at 2905d418 from
> >>>>>>> runscript.c:203
> >>>>>>> o-fd: smartall.c:404 Orphaned buffer: o-fd 24 bytes at 290303d8 from
> >>>>>>> bpipe.c:76
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On Nov 4, 2013, at 5:44 PM, David Newman <dnewman AT networktest DOT 
> >>>>>>> com> wrote:
> >>>>>>>
> >>>>>>> On 10/29/13 12:42 PM, Dan Langille wrote:
> >>>>>>> On 2013-10-27 19:33, David Newman wrote:
> >>>>>>> On 10/27/13 11:31 AM, Dan Langille wrote:
> >>>>>>>
> >>>>>>> On Oct 22, 2013, at 3:00 PM, David Newman wrote:
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> On 10/19/13 11:40 PM, Kern Sibbald wrote:
> >>>>>>> Hello,
> >>>>>>>
> >>>>>>> From what I can see -- first "signal 0", and second this
> >>>>>>> traceback, this looks a lot like a FreeBSD pthreads bug.
> >>>>>>>
> >>>>>>> First because there is no such thing, at least in userland,
> >>>>>>> as a signal number 0, which I saw in an earlier
> >>>>>>> email.  Second, as the traceback below
> >>>>>>> shows, Bacula is waiting on a pthread_cond_timedwait() and
> >>>>>>> while in the pthread_cond_timedwait, which is a "system"
> >>>>>>> subroutine, it emits a pthread_cond_signal(), probably no
> >>>>>>> problem, followed by a pthread_kill().  That seems odd to
> >>>>>>> me, but perhaps it is how FreeBSD does it, but the net
> >>>>>>> result is that it is killing Bacula.
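
The "no such thing as signal 0" point can be seen from the shell: POSIX defines signal number 0 as a null signal, so kill -0 only checks that the target exists and may be signalled, and nothing is delivered. A small sketch (probe is a made-up name):

```shell
#!/bin/sh
# probe is a hypothetical helper. kill -0 sends the POSIX "null signal":
# only the existence and permission checks run, nothing is delivered.
probe() {
    if kill -0 "$1" 2>/dev/null; then
        echo "alive"
    else
        echo "gone or not permitted"
    fi
}

probe "$$"    # the current shell certainly exists
```

So a handler reporting "signal 0" was presumably handed a bogus signal number rather than a real one, which would fit the reading above.
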
> >>>>>>>
> >>>>>>> Obviously, this could be a Bacula bug, but it is not occurring
> >>>>>>> elsewhere, and it looks very suspicious to me.
> >>>>>>>
> >>>>>>> You can get more information by compiling with
> >>>>>>> #define DEVELOPER 1
> >>>>>>> in <bacula>/src/version.h  and ensuring that the -g
> >>>>>>> option is on the compile and that the binaries are not
> >>>>>>> stripped (default for Bacula Makefiles, but not for the
> >>>>>>> FreeBSD ports system).
> >>>>>>>
> >>>>>>> Then if you get another traceback, it may be clearer what
> >>>>>>> is going on.  Since this is relatively serious, I would recommend
> >>>>>>> running Bacula under the debugger directly, see the manual on
> >>>>>>> the details of how, then when the debugger gets control after
> >>>>>>> the signal, manually do the "thread apply all bt" command.
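
For a non-interactive variant of the same idea, gdb can be told to run the program and print all thread backtraces once it stops. A rough sketch that only prints the command line rather than running it; gdb_bt_command is a made-up helper, the bacula-fd flags are the ones used elsewhere in this thread, and -batch, -ex and --args are standard gdb options:

```shell
#!/bin/sh
# gdb_bt_command is a hypothetical helper: it only prints a gdb command
# line (a dry run), it does not start gdb or the daemon.
# Arguments containing spaces are not re-quoted in the printed line.
gdb_bt_command() {
    prog=$1
    shift
    printf 'gdb -batch -ex run -ex "thread apply all bt" --args %s' "$prog"
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

gdb_bt_command /usr/local/sbin/bacula-fd -f -c /usr/local/etc/bacula/bacula-fd.conf
```

The backtrace is only readable if the binary was built with -g and left unstripped, as described above.
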
> >>>>>>>
> >>>>>>> FreeBSD gurus, a little help?
> >>>>>>>
> >>>>>>> That's not me.
> >>>>>>>
> >>>>>>> I don't see version.h under the bacula-client port directory.
> >>>>>>>
> >>>>>>> try this:
> >>>>>>>
> >>>>>>> make clean
> >>>>>>> make patch
> >>>>>>> find . -name version.h
> >>>>>>> ./bacula-5.2.12/src/version.h
> >>>>>>>
> >>>>>>> OK, thanks. That works.
> >>>>>>>
> >>>>>>> Kern's email gave three steps. Sorry for the baby questions, but I
> >>>>>>> don't
> >>>>>>> know how to do steps 2 or 3, either.
> >>>>>>>
> >>>>>>> On 10/19/13 11:40 PM, Kern Sibbald wrote:
> >>>>>>>
> >>>>>>> You can get more information by compiling with
> >>>>>>> #define DEVELOPER 1
> >>>>>>> in <bacula>/src/version.h
> >>>>>>>
> >>>>>>> That's step 1, which you've helped me find.
> >>>>>>>
> >>>>>>> and ensuring that the -g
> >>>>>>> option is on the compile
> >>>>>>>
> >>>>>>> That's step 2.
> >>>>>>>
> >>>>>>> I don't see a place for that option in the Makefile.
> >>>>>>>
> >>>>>>> I think that goes on:
> >>>>>>>
> >>>>>>> CPPFLAGS+=
> >>>>>>>
> >>>>>>> to become:
> >>>>>>>
> >>>>>>> -I/usr/include/readline -I${LOCALBASE}/include -g
> >>>>>>>
> >>>>>>> I think.  I have not tested that.
> >>>>>>>
> >>>>>>> Which file would this go into?
> >>>>>>>
> >>>>>>> After 'make patch', running 'grep -R LOCALBASE *' from the root of the
> >>>>>>> port returns nothing.
> >>>>>>>
> >>>>>>> Make that change in /usr/ports/sysutils/bacula-server/Makefile
> >>>>>>>
> >>>>>>> Yes, bacula-server, not a typo.  bacula-client is a slave port of
> >>>>>>> bacula-server.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> and that the binaries are not
> >>>>>>> stripped (default for Bacula Makefiles, but not for the
> >>>>>>> FreeBSD ports system).
> >>>>>>>
> >>>>>>> Looking in /usr/ports/Mk/bsd.port.mk, I think you want WITH_DEBUG 
> >>>>>>> which
> >>>>>>> I think you can add to the OPTIONS_DEFINE line.
> >>>>>>>
> >>>>>>> What's the procedure here?
> >>>>>>>
> >>>>>>> Is it (1) to uncomment WITH_DEBUG in /usr/ports/Mk/bsd.port.mk; and
> >>>>>>>
> >>>>>>> (2) to change the Makefile to OPTIONS_DEFINE= NLS OPENSSL PYTHON
> >>>>>>> WITH_DEBUG
> >>>>>>>
> >>>>>>> Make those changes to OPTIONS_DEFINE in
> >>>>>>> /usr/ports/sysutils/bacula-server/Makefile as well.
> >>>>>>>
> >>>>>>> I suggest deleting all bacula packages on this client.  Then make
> >>>>>>> clean, and make install in the bacula-client dir.
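
After the reinstall it is worth confirming that the installed binary really kept its symbols. A minimal sketch, assuming file(1) is available on the client: its ELF description ends in "stripped" or "not stripped". strip_state is a made-up helper that classifies one line of that output:

```shell
#!/bin/sh
# strip_state is a hypothetical helper. It classifies one line of file(1)
# output; the "not stripped" case must be tested before "stripped".
strip_state() {
    case "$1" in
        *"not stripped"*) echo "has symbols" ;;
        *stripped*)       echo "stripped" ;;
        *)                echo "unknown" ;;
    esac
}

# On the client, after reinstalling the port:
# strip_state "$(file /usr/local/sbin/bacula-fd)"
```

If this reports "stripped", the debug settings did not reach the install step and a traceback will have no symbol names.
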
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> ??
> >>>>>>>
> >>>>>>> thanks
> >>>>>>>
> >>>>>>> dn
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> That's step 3. Sorry, don't know how to do that either.
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> Also, I do have bacula-fd running fine on other FreeBSD 9.2 systems.
> >>>>>>> The
> >>>>>>> only delta AFAIK is that this is an i386 system and the others are
> >>>>>>> amd64.
> >>>>>>>
> >>>>>>> To review:
> >>>>>>>
> >>>>>>> 1. Backup jobs complete when manually starting bacula-fd.
> >>>>>>>
> >>>>>>> What command are you entering?
> >>>>>>>
> >>>>>>> /usr/local/sbin/bacula-fd
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> 2. Backup jobs do not complete when launching bacula-fd via the 
> >>>>>>> startup
> >>>>>>> script in /usr/local/etc/rc.d/bacula-fd.
> >>>>>>>
> >>>>>>> For example: usr/local/etc/rc.d/bacula-fd start ?
> >>>>>>>
> >>>>>>> Yes:
> >>>>>>>
> >>>>>>> /usr/local/etc/rc.d/bacula-fd start # note leading stroke
> >>>>>>>
> >>>>>>> Thanks
> >>>>>>>
> >>>>>>> dn
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> Thanks in advance for further debugging clues.
> >>>>>>>
> >>>>>>> dn
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> If any of you are FreeBSD system gurus you might compare the
> >>>>>>> last known working version of the OS with 9.2, particularly the
> >>>>>>> pthreads routines.  Perhaps they are using a signal 0 internally,
> >>>>>>> and somehow that leaked back to Bacula.
> >>>>>>>
> >>>>>>> Best regards,
> >>>>>>> Kern
> >>>>>>>
> >>>>>>> On 10/18/2013 01:29 AM, David Newman wrote:
> >>>>>>> On 10/17/13 5:33 AM, Martin Simmons wrote:
> >>>>>>> On Wed, 16 Oct 2013 12:13:26 -0700, David Newman said:
> >>>>>>> On 10/14/13 2:44 AM, Martin Simmons wrote:
> >>>>>>> On Sun, 13 Oct 2013 18:25:07 -0700, David Newman said:
> >>>>>>> On 10/9/13 4:41 PM, David Newman wrote:
> >>>>>>> FreeBSD 9.2-RELEASE, bacula-client-5.2.12_3 installed from ports
> >>>>>>>
> >>>>>>> Ever since upgrading this host to FreeBSD 9.2, bacula-fd crashes
> >>>>>>> as soon
> >>>>>>> as bacula-dir starts a backup job. The entry in /var/log/messages
> >>>>>>> is:
> >>>>>>>
> >>>>>>> Oct  9 16:25:50 o bacula-fd: Bacula interrupted by signal 0:
> >>>>>>> UNKNOWN SIGNAL
> >>>>>>>
> >>>>>>> Backups worked fine on this host running FreeBSD 9.1 and other hosts
> >>>>>>> upgraded to FreeBSD 9.2 run backups OK.
> >>>>>>>
> >>>>>>> I've done the uninstall/reinstall thing with the bacula-client
> >>>>>>> port, but
> >>>>>>> that made no difference.
> >>>>>>>
> >>>>>>> Thanks in advance for troubleshooting clues.
> >>>>>>>
> >>>>>>> dn
> >>>>>>> Is there a Wireshark decode for Bacula?
> >>>>>>>
> >>>>>>> I'm still stuck on this problem, and need more info on what's causing
> >>>>>>> that UNKNOWN SIGNAL error. Wireshark 1.8.6 just shows strings of
> >>>>>>> bytes
> >>>>>>> for the Bacula stuff.
> >>>>>>>
> >>>>>>> Thanks.
> >>>>>>>
> >>>>>>> dn
> >>>>>>> A wireshark decode won't help much here because problems like this
> >>>>>>> must be in
> >>>>>>> the fd itself.
> >>>>>>>
> >>>>>>> Try attaching gdb to the bacula-fd process and see if it catches the
> >>>>>>> mysterious signal (see
> >>>>>>> http://www.bacula.org/5.2.x-manuals/en/problems/problems/What_Do_When_Bacula.html#SECTION00640000000000000000).
> >>>>>>>
> >>>>>>>
> >>>>>>> No luck with this. Per that URL, I've put the btraceback.gdb file in
> >>>>>>> the
> >>>>>>> same directory as the bacula-fd executable on the client (in this 
> >>>>>>> case,
> >>>>>>> /usr/local/sbin) and made the .gdb file executable.
> >>>>>>>
> >>>>>>> At run time it produces this error:
> >>>>>>>
> >>>>>>> /usr/local/sbin/btraceback.gdb:1: Error in sourced command file:
> >>>>>>> No symbol table is loaded.  Use the "file" command.
> >>>>>>>
> >>>>>>> That's problem 1. Problem 2 is that the syntax given for capturing
> >>>>>>> STDERR and STDOUT -- 2>\&1 -- doesn't work on either csh (root's
> >>>>>>> default
> >>>>>>> on FreeBSD) or bash.
> >>>>>>>
> >>>>>>> Any ideas on remedying either issue?
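
On the redirection: in sh or bash the form is 2>&1 with no backslash, and it has to come after the file redirection. csh has no 2>&1 at all; there, ">& file" sends both streams to a file. A minimal sketch for sh, where capture is a made-up name and the log path is only an illustration:

```shell
#!/bin/sh
# capture is a hypothetical helper: run a command with stdout and stderr
# both written to the file named by the first argument.
capture() {
    out=$1
    shift
    status=0
    # Order matters: send stdout to the file first, then point stderr at it.
    "$@" >"$out" 2>&1 || status=$?
    return "$status"
}

# Example (uncomment on the client):
# capture /tmp/bacula-fd.log /usr/local/sbin/bacula-fd -f \
#     -c /usr/local/etc/bacula/bacula-fd.conf
```
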
> >>>>>>> It looks like you missed the part after the # in the URL -- you don't
> >>>>>>> need the
> >>>>>>> btraceback.gdb file.
> >>>>>>>
> >>>>>>> The section I meant is called "Manually Running Bacula Under The
> >>>>>>> Debugger" on
> >>>>>>> that page (you'll have to adapt it for the bacula-fd).
> >>>>>>> Sorry for missing that.
> >>>>>>>
> >>>>>>> The backup runs fine under the debugger, including the backup job
> >>>>>>> beforehand, but not with the FreeBSD startup script in
> >>>>>>> /usr/local/etc/rc.d.
> >>>>>>>
> >>>>>>> I've pasted below the debugger output and the startup script.
> >>>>>>>
> >>>>>>> Thanks in advance for further troubleshooting clues.
> >>>>>>>
> >>>>>>> dn
> >>>>>>>
> >>>>>>>
> >>>>>>> ==========
> >>>>>>>
> >>>>>>> Successful run, via /usr/local/sbin/bacula-fd run via gdb:
> >>>>>>>
> >>>>>>> (gdb) thread apply all bt
> >>>>>>> Thread 5 (Thread 28c08b00 (LWP 100213/bacula-fd)):
> >>>>>>> #0  0x282302b3 in pthread_kill () from /lib/libthr.so.3
> >>>>>>> #1  0x2822f9b2 in pthread_kill () from /lib/libthr.so.3
> >>>>>>> #2  0x282328f9 in pthread_cond_signal () from /lib/libthr.so.3
> >>>>>>> #3  0x281f5d20 in bthread_cond_timedwait_p () from
> >>>>>>> /usr/local/lib/libbac.so.5
> >>>>>>> #4  0x281ef9b0 in watchdog_thread () from /usr/local/lib/libbac.so.5
> >>>>>>> #5  0x281f7167 in lmgr_thread_launcher () from
> >>>>>>> /usr/local/lib/libbac.so.5
> >>>>>>> #6  0x28227f3a in pthread_getprio () from /lib/libthr.so.3
> >>>>>>> #7  0x00000000 in ?? ()
> >>>>>>>
> >>>>>>> Thread 3 (Thread 28805e00 (LWP 100211/bacula-fd)):
> >>>>>>> #0  0x28624323 in nanosleep () from /lib/libc.so.7
> >>>>>>> #1  0x2822ad8b in nanosleep () from /lib/libthr.so.3
> >>>>>>> #2  0x281c1a90 in bmicrosleep () from /usr/local/lib/libbac.so.5
> >>>>>>> #3  0x281f7349 in check_deadlock () from /usr/local/lib/libbac.so.5
> >>>>>>> #4  0x28227f3a in pthread_getprio () from /lib/libthr.so.3
> >>>>>>> #5  0x00000000 in ?? ()
> >>>>>>>
> >>>>>>> Thread 2 (Thread 28804300 (LWP 100133/bacula-fd)):
> >>>>>>> #0  0x28646103 in select () from /lib/libc.so.7
> >>>>>>> #1  0x2822a960 in select () from /lib/libthr.so.3
> >>>>>>> #2  0x281c45a8 in bnet_thread_server () from 
> >>>>>>> /usr/local/lib/libbac.so.5
> >>>>>>> #3  0x0804f5c6 in main ()
> >>>>>>> #0  0x282302b3 in pthread_kill () from /lib/libthr.so.3
> >>>>>>>
> >>>>>>> ==========
> >>>>>>>
> >>>>>>> FreeBSD startup script:
> >>>>>>>
> >>>>>>> #!/bin/sh
> >>>>>>> #
> >>>>>>> # $FreeBSD: sysutils/bacula-server/files/bacula-fd.in 323275 2013-07-19 09:44:58Z rm $
> >>>>>>> #
> >>>>>>> # PROVIDE: bacula_fd
> >>>>>>> # REQUIRE: DAEMON
> >>>>>>> # KEYWORD: shutdown
> >>>>>>> #
> >>>>>>> # Add the following lines to /etc/rc.conf.local or /etc/rc.conf
> >>>>>>> # to enable this service:
> >>>>>>> #
> >>>>>>> # bacula_fd_enable  (bool):  Set to NO by default.
> >>>>>>> #               Set it to YES to enable bacula_fd.
> >>>>>>> # bacula_fd_flags (params):  Set params used to start bacula_fd.
> >>>>>>> #
> >>>>>>>
> >>>>>>> . /etc/rc.subr
> >>>>>>>
> >>>>>>> name="bacula_fd"
> >>>>>>> rcvar=${name}_enable
> >>>>>>> command=/usr/local/sbin/bacula-fd
> >>>>>>>
> >>>>>>> load_rc_config $name
> >>>>>>>
> >>>>>>> : ${bacula_fd_enable="NO"}
> >>>>>>> : ${bacula_fd_flags=" -u root -g wheel -v -c /usr/local/etc/bacula/bacula-fd.conf"}
> >>>>>>> : ${bacula_fd_pidfile="/var/run/bacula-fd.9102.pid"}
> >>>>>>>
> >>>>>>> pidfile="${bacula_fd_pidfile}"
> >>>>>>>
> >>>>>>> run_rc_command "$1"
> >>>>>>>
> >>>>>>> ==========
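
A side note on the script above: the ": ${var="default"}" lines only assign when the variable is still unset after load_rc_config, which is how a bacula_fd_flags line in /etc/rc.conf would replace the built-in flags. A small sketch of that idiom in plain sh; demo_flags and show_flags are made-up names:

```shell
#!/bin/sh
# demo_flags and show_flags are made up; the idiom is the one the rc script uses.
show_flags() {
    # ${var=default} assigns the default only when var is unset.
    : ${demo_flags=" -u root -g wheel -v"}
    echo "flags:${demo_flags}"
}

unset demo_flags
show_flags            # nothing set beforehand: the default is used

demo_flags=" -f"
show_flags            # a value set earlier (as rc.conf would) wins
```

That makes it possible to give the rc-started daemon exactly the flags used on the command line, or the reverse, and compare the two cases like for like.
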
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>
> >>>
> >>
> > 
> > ------------------------------------------------------------------------------
> > November Webinars for C, C++, Fortran Developers
> > Accelerate application performance with scalable programming models. Explore
> > techniques for threading, error checking, porting, and tuning. Get the most 
> > from the latest Intel processors and coprocessors. See abstracts and 
> > register
> > http://pubads.g.doubleclick.net/gampad/clk?id=60136231&iu=/4140/ostg.clktrk
> > _______________________________________________
> > Bacula-users mailing list
> > Bacula-users AT lists.sourceforge DOT net
> > https://lists.sourceforge.net/lists/listinfo/bacula-users
> > 
