Bacula-users

Re: [Bacula-users] Bacula 3.0.3 deadlock : Job is waiting for execution

2010-01-08 15:38:37
Subject: Re: [Bacula-users] Bacula 3.0.3 deadlock : Job is waiting for execution
From: Arno Lehmann <al AT its-lehmann DOT de>
To: bacula-devel <bacula-devel AT lists.sourceforge DOT net>
Date: Fri, 08 Jan 2010 21:32:18 +0100
Hello,

this is just forwarding your mail to bacula-devel, where it's more 
likely to be picked up, looked at, and perhaps integrated into the 
code base :-)

Cheers, and thanks for not only analyzing the problem, but also 
providing a possible fix!

Arno

07.01.2010 16:34, Renaud Marquet wrote:
> Hi,
> 
> I'm using bacula 3.0.3 and the director's job queue was stuck after
> running the first job. The others were waiting indefinitely for
> execution. If the director was restarted, I could run only one job, and
> so on.
> 
> Googling around, I found these two posts without satisfying answers:
> http://www.backupcentral.com/phpBB2/two-way-mirrors-of-external-mailing-lists-3/bacula-25/upgrade-to-3-0-3-job-is-waiting-for-execution-102156/
> http://www.backupcentral.com/phpBB2/two-way-mirrors-of-external-mailing-lists-3/bacula-25/job-is-waiting-for-execuition-101508/
> 
> I then looked at the code and found there is a deadlock happening in
> message handling.
> 
> The problem is located in the close_msg(JCR *) function in message.c. When
> it encounters an error while sending an e-mail, it calls the macro Jmsg1
> (line 485) to report it. This macro calls dispatch_message, which tries
> to acquire fides_mutex (line 738). Unfortunately, this mutex was already
> acquired in close_msg (line 431), so the thread blocks on a mutex it
> already holds and deadlocks (as stated in the mutex documentation for
> mutexes of the PTHREAD_MUTEX_INITIALIZER kind).
> 
> This problem was affecting me because the mail daemon was not properly
> configured on my server.
> 
> It could be worthwhile to review these parts of the code to avoid such
> a situation.
> 
> However, I wrote a quick patch for lockmgr.c which simply upgrades
> mutexes to the PTHREAD_MUTEX_ERRORCHECK_NP kind and resolves this error.
> 
> Hope this helps someone,
> Renaud
> 
> patch :
> 
> diff -rupN bacula-3.0.3.vanilla/src/lib/lockmgr.c bacula-3.0.3.patched/src/lib/lockmgr.c
> --- bacula-3.0.3.vanilla/src/lib/lockmgr.c    2009-10-18 11:10:16.000000000 +0200
> +++ bacula-3.0.3.patched/src/lib/lockmgr.c    2009-12-31 18:05:59.000000000 +0100
> @@ -616,6 +616,15 @@ void lmgr_cleanup_main()
>   */
>  int lmgr_mutex_lock(pthread_mutex_t *m, const char *file, int line)
>  {
> +   /* Patch to avoid deadlock if mutex is locked more than once */
> +   /* There's some performance hit which makes it probably not acceptable */
> +   /* for large system usage. */
> +   if(*m == PTHREAD_MUTEX_INITIALIZER) {
> +      pthread_mutexattr_t attr;
> +      pthread_mutexattr_settype( &attr, PTHREAD_MUTEX_ERRORCHECK_NP );
> +      pthread_mutex_init( m, &attr );
> +   }
> +
>     int ret;
>     lmgr_thread_t *self = lmgr_get_thread_info();
>     self->pre_P(m, file, line);
> 
> 
> 
> ------------------------------------------------------------------------------
> This SF.Net email is sponsored by the Verizon Developer Community
> Take advantage of Verizon's best-in-class app development support
> A streamlined, 14 day to market process makes app distribution fast and easy
> Join now and get one step closer to millions of Verizon customers
> http://p.sf.net/sfu/verizon-dev2dev 
> _______________________________________________
> Bacula-users mailing list
> Bacula-users AT lists.sourceforge DOT net
> https://lists.sourceforge.net/lists/listinfo/bacula-users
> 
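
For readers who want to see the behaviour the quoted patch relies on, here is a
minimal standalone sketch, not Bacula code. It assumes a POSIX threads
implementation and uses the portable name PTHREAD_MUTEX_ERRORCHECK rather than
the _NP variant from the patch. The helper name relock_errorcheck_mutex is
hypothetical. Unlike the quoted patch, the sketch calls pthread_mutexattr_init
before setting the type, which POSIX requires, and it does not compare a
pthread_mutex_t against PTHREAD_MUTEX_INITIALIZER, since that comparison may
not compile on platforms where pthread_mutex_t is a struct or union.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper (not from Bacula): lock an error-checking mutex twice
 * from the same thread and return the result of the second lock attempt. */
int relock_errorcheck_mutex(void)
{
   pthread_mutexattr_t attr;
   pthread_mutex_t m;
   int rc;
   int second;

   /* The attribute object must be initialized before its type is set. */
   rc = pthread_mutexattr_init(&attr);
   assert(rc == 0);
   rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
   assert(rc == 0);
   rc = pthread_mutex_init(&m, &attr);
   assert(rc == 0);
   pthread_mutexattr_destroy(&attr);

   /* First lock by this thread succeeds. */
   rc = pthread_mutex_lock(&m);
   assert(rc == 0);

   /* Second lock by the owning thread: a default mutex may block forever
    * here, while an error-checking mutex returns EDEADLK instead. */
   second = pthread_mutex_lock(&m);

   pthread_mutex_unlock(&m);
   pthread_mutex_destroy(&m);
   return second;
}
```

Note that with an error-checking mutex the second lock fails instead of
succeeding, so a caller that ignores the return value continues without a
second acquisition. This avoids the hang but does not remove the nested
locking in close_msg itself.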

-- 
Arno Lehmann
IT-Service Lehmann
Sandstr. 6, 49080 Osnabrück
www.its-lehmann.de