Bacula-users

[Bacula-users] hang immediately after "Start UA server"

2009-10-10 14:05:18
From: Jo Rhett <jrhett AT netconsonance DOT com>
To: bacula-users <bacula-users AT lists.sourceforge DOT net>
Date: Sat, 10 Oct 2009 11:00:40 -0700
I've just upgraded a machine from FreeBSD 6.3 to 7.2. I replaced all
the ports with new versions compiled on 7.2, and everything works
normally (just like every other server running these builds) except
bacula-dir. It hangs right after starting the UA server and before it
starts accepting network connections. From my reading of the trace it
is blocked in the _umtx_op() call. There is no core and no log message,
and the process has to be killed with kill -9.

I found one other report of this in the archives, but the poster said
that updating gettext fixed it. I recompiled gettext to be sure. I even
recompiled bacula with NLS disabled so that gettext wasn't linked in,
and the problem didn't change. I'm making no headway on this and
would appreciate some direction on other things to test or debug:

/usr/local/sbin/bacula-dir -d300 -f -v
bacula-dir: dird.c:184-0 Debug level = 300
bacula-dir: runscript.c:296-0 runscript: debug
bacula-dir: runscript.c:297-0  --> RunScript
bacula-dir: runscript.c:298-0   --> Command=/usr/local/share/bacula/make_catalog_backup bacula bacula *snip* localhost
bacula-dir: runscript.c:299-0   --> Target=
bacula-dir: runscript.c:300-0   --> RunOnSuccess=1
bacula-dir: runscript.c:301-0   --> RunOnFailure=0
bacula-dir: runscript.c:302-0   --> FailJobOnError=1
bacula-dir: runscript.c:303-0   --> RunWhen=2
bacula-dir: runscript.c:296-0 runscript: debug
bacula-dir: runscript.c:297-0  --> RunScript
bacula-dir: runscript.c:298-0   --> Command=/usr/local/share/bacula/delete_catalog_backup
bacula-dir: runscript.c:299-0   --> Target=
bacula-dir: runscript.c:300-0   --> RunOnSuccess=1
bacula-dir: runscript.c:301-0   --> RunOnFailure=0
bacula-dir: runscript.c:302-0   --> FailJobOnError=1
bacula-dir: runscript.c:303-0   --> RunWhen=1
bacula-dir: message.c:263-0 Copy message resource 2870f1b8 to 28714698
bacula-dir: bsys.c:503-0 Could not open state file. sfd=-1 size=188: ERR=No such file or directory
bacula-dir: mysql.c:101-0 db_open first time
bacula-dir: mysql.c:130-0 initdb ref=1 connected=0 db=0
bacula-dir: mysql.c:166-0 mysql_init done
bacula-dir: mysql.c:187-0 mysql_real_connect done
bacula-dir: mysql.c:189-0 db_user=bacula db_name=bacula db_password=*snip*
bacula-dir: mysql.c:215-0 opendb ref=1 connected=1 db=28708044
bacula-dir: sql_create.c:341-0 In create mediatype
bacula-dir: sql_create.c:344-0 selectmediatype: SELECT MediaTypeId,MediaType FROM MediaType WHERE MediaType='File_SVcolo'
bacula-dir: mysql.c:236-0 closedb ref=0 connected=1 db=28708044
bacula-dir: mysql.c:240-0 close db=28708044
backup0-dir: dird.c:317-0 Start UA server

FWIW, it's not the state file error. I didn't get that error until I
removed all of the working files to see whether something in the
environment was confusing it. It is the exact same process, with the
same hang in the same place, whether the state file is there or not.

Here is the ktrace (similar to strace on Linux) output near the
failure. From my reading it is hanging in the _umtx_op() call.

91887 bacula-dir GIO   fd 1 wrote 44 bytes
       "bacula-dir: mysql.c:240-0 close db=28708044
       "
91887 bacula-dir RET   write 44/0x2c
91887 bacula-dir CALL  write(0x4,0x28763000,0x5)
91887 bacula-dir GIO   fd 4 wrote 5 bytes
       0x0000 0100 0000 01                                   |.....|

91887 bacula-dir RET   write 5
91887 bacula-dir CALL  shutdown(0x4,<invalid=2>)
91887 bacula-dir RET   shutdown 0
91887 bacula-dir CALL  close(0x4)
91887 bacula-dir RET   close 0
91887 bacula-dir CALL  __sysctl(0xbfbfe88c,0x2,0x2815eea0,0xbfbfe8a4,0,0)
91887 bacula-dir RET   __sysctl 0
91887 bacula-dir CALL  sigaction(SIGHUP,0xbfbfecb4,0xbfbfec9c)
91887 bacula-dir RET   sigaction 0
91887 bacula-dir CALL  open(0x2815eee0,O_RDWR|O_CREAT,S_IRUSR|S_IWUSR)
91887 bacula-dir NAMI  "/var/db/bacula/backup0-dir.conmsg"
91887 bacula-dir RET   open 4
91887 bacula-dir CALL  lseek(0x4,0,SEEK_SET,0x2)
91887 bacula-dir RET   lseek 0
91887 bacula-dir CALL  close(0x4)
91887 bacula-dir RET   close 0
91887 bacula-dir CALL  open(0x2815eee0,O_RDWR|O_APPEND|O_CREAT,S_IRUSR|S_IWUSR|S_IRGRP|S_IWGRP|S_IROTH|S_IWOTH)
91887 bacula-dir NAMI  "/var/db/bacula/backup0-dir.conmsg"
91887 bacula-dir RET   open 4
91887 bacula-dir CALL  lseek(0x4,0,SEEK_SET,0x2)
91887 bacula-dir RET   lseek 0
91887 bacula-dir CALL  write(0x1,0x28711000,0x2a)
91887 bacula-dir GIO   fd 1 wrote 42 bytes
       "backup0-dir: dird.c:317-0 Start UA server
       "
91887 bacula-dir RET   write 42/0x2a
91887 bacula-dir CALL  _umtx_op(0xbfbfebd0,0x3,0x1,0,0)
91887 bacula-dir RET   _umtx_op 0
91887 bacula-dir CALL  sigprocmask(SIG_BLOCK,0xbfbfeb74,0x287010d8)
91887 bacula-dir RET   sigprocmask 0
91887 bacula-dir CALL  sigprocmask(SIG_SETMASK,0x287010d8,0)
91887 bacula-dir RET   sigprocmask 0
91887 bacula-dir CALL  _umtx_op(0x281daa80,0x11,0,0,0)
91887 bacula-dir RET   _umtx_op -1 errno 4 Interrupted system call
91887 bacula-dir PSIG  SIGINT SIG_DFL
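
For anyone wanting to check a longer trace the same way, here is a minimal sketch in Python that scans kdump text for system calls that never returned, or that only returned with errno 4 (EINTR) once the process was signalled. The function name find_blocked_calls and the regular expression are illustrative assumptions, not part of Bacula or ktrace. It pairs CALL and RET records by PID only, so on a multi-threaded process the pairing may be imprecise.

```python
import re

# Matches kdump records such as "91887 bacula-dir CALL  write(0x4,...)".
# Assumption: records look like the ones quoted in this post.
RECORD = re.compile(r"^\s*(\d+)\s+(\S+)\s+(CALL|RET)\s+(.*)$")


def find_blocked_calls(lines):
    """Return (pending, interrupted).

    pending:     {pid: syscall text} for CALL records that have no RET yet.
    interrupted: [(pid, syscall text)] for calls whose RET reported errno 4.
    """
    pending = {}
    interrupted = []
    for line in lines:
        m = RECORD.match(line)
        if not m:
            continue  # GIO payloads, NAMI/PSIG records, wrapped lines
        pid, _cmd, kind, rest = m.groups()
        if kind == "CALL":
            pending[pid] = rest.strip()
        else:  # RET closes the most recent CALL seen for this pid
            call = pending.pop(pid, None)
            if call is not None and " errno 4 " in f" {rest} ":
                interrupted.append((pid, call))
    return pending, interrupted


sample = """\
91887 bacula-dir CALL  _umtx_op(0xbfbfebd0,0x3,0x1,0,0)
91887 bacula-dir RET   _umtx_op 0
91887 bacula-dir CALL  sigprocmask(SIG_SETMASK,0x287010d8,0)
91887 bacula-dir RET   sigprocmask 0
91887 bacula-dir CALL  _umtx_op(0x281daa80,0x11,0,0,0)
91887 bacula-dir RET   _umtx_op -1 errno 4 Interrupted system call
91887 bacula-dir PSIG  SIGINT SIG_DFL
""".splitlines()

pending, interrupted = find_blocked_calls(sample)
print(interrupted)  # [('91887', '_umtx_op(0x281daa80,0x11,0,0,0)')]
```

On the excerpt above this reports the second _umtx_op() call, which is consistent with the process sitting in that call until the SIGINT arrived.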

Machine: Rackable 3U with single-core Athlon
CPU: AMD Opteron(tm) Processor 244 (1804.10-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0xf5a  Stepping = 10
  Features=0x78bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2>
  AMD Features=0xe0500800<SYSCALL,NX,MMX+,LM,3DNow!+,3DNow!>
real memory  = 2146828288 (2047 MB)

This exact machine and hardware have been running FreeBSD 6.x and  
Bacula for >2 years now, zero problems.

-- 
Jo Rhett
Net Consonance : consonant endings by net philanthropy, open source  
and other randomness

