Re: [Bacula-users] bacula-dir 3.0.3 dies on second job run or manual reload.

2009-12-18 03:24:39
Subject: Re: [Bacula-users] bacula-dir 3.0.3 dies on second job run or manual reload.
From: Janusz Syrytczyk <jsyrytczyk AT uni.opole DOT pl>
To: bacula-users AT lists.sourceforge DOT net
Date: Fri, 18 Dec 2009 09:20:03 +0100
On Monday 14 December 2009 09:27:40 Bruno Friedmann wrote:
> On 12/09/2009 12:11 AM, Janusz Syrytczyk wrote:
> > Hi,
> >
> > I upgraded from 3.0.2 to 3.0.3 a while ago and I'm facing serious
> > problems with bacula-dir stability.
> >
> > Just after it starts, the Director is able to perform any request I make
> > (a backup, restore, reload etc.). But once that first task is done, the
> > Director stops listening to me - the second job does not start when
> > requested. Then bconsole hangs and I have to exit with ctrl+c, but after
> > restarting bconsole, typing status dir reports that the backup is running.
> >
> > The problem is that the backup is not running. The Director stays almost
> > completely silent. When I try to reload through bconsole, the Director
> > turns into a zombie - I cannot connect. Debugging gives only this:
> >
> > atom-dir: bnet.c:670-0 who=client host=192.168.1.150 port=36131
> >
> > Interestingly, when I leave the Director alone it works OK: it
> > schedules backups and performs them. I had previously suspected that
> > something was wrong with the scheduler, as before this troubleshooting I
> > couldn't even get the Director to schedule jobs, but for the last few
> > days it has worked fine.
> >
> > This looks like the same issue reported here, but that poster never found
> > the cause:
> >
> > http://www.mail-archive.com/bacula-users@lists.sourceforge.net/msg38279.html
> >
> > I've just moved the backups and database, recompiled Bacula, recreated the
> > database and started backups, but the same thing happens. What could this
> > be, anyone?
> 
> I don't know if this is your case.
> 
> We have the same trouble here, with the dir hanging after running the first
> job. I've restarted it with -d100 just to check what's happening.
> In the meantime, on the bacula server (which has been upgraded from
>  opensuse 11.1 to 11.2) I found that postfix was throttling (a missing
>  relay.db file in /etc/postfix: issue a postmap relay and restart
>  postfix). After that all emails are working.
> 
> As the bsmtp message commands in my dir config connect to the internal
>  postfix, bsmtp was hanging! And perhaps bacula-dir too.
> 
> I've now run three scheduled jobs, and bacula-dir has done its job.
> 
> What I suspect is: bsmtp has no timeout (if it cannot connect it
>  returns, but if it connects and nothing goes right in postfix (the
>  throttling case) it waits indefinitely, and so does the director).
> 
> I will leave this configuration running 2 to 3 days just to be sure it was
>  that.
> 
> In the meantime, if you can, check on your side whether you get some
>  trouble with bsmtp, to confirm or refute this.
> 
True, I've verified this too.

bsmtp goes zombie and bacula-dir waits on it. The solution is to use another
app for sending email or to drop email notifications altogether.
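
As a rough illustration of the "use another app" idea, here is a minimal
Python sketch of a wrapper that runs whatever mail program it is given and
kills it if it does not finish in time. Nothing here comes from Bacula itself:
the name run_with_timeout, the exit status 124 and the way the wrapper would
be hooked into the Director's mail command are all assumptions.

```python
import subprocess
import sys

# Exit status reported when the mail program is killed for running too long.
# The value 124 is an arbitrary choice for this sketch.
TIMEOUT_EXIT = 124


def run_with_timeout(cmd, timeout_s, stdin_data=b""):
    """Run cmd with stdin_data on its stdin; give up after timeout_s seconds."""
    try:
        done = subprocess.run(cmd, input=stdin_data, timeout=timeout_s)
        return done.returncode
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child and waits for it before raising,
        # so the caller is not left waiting on a stuck process.
        return TIMEOUT_EXIT


def main(argv):
    """Hypothetical CLI: <seconds> <mail program> [args...], body on stdin."""
    limit = float(argv[1])
    return run_with_timeout(argv[2:], limit, sys.stdin.buffer.read())
```

A script built around main(sys.argv) could then be named in place of the
direct bsmtp call, so that a stuck mail delivery costs at most the chosen
number of seconds instead of blocking the Director.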

I wonder if this is a candidate for a bug report?
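
For a bug report it would help to reproduce the hang without a broken postfix.
A minimal sketch, assuming the hang happens when the server accepts the
connection but never answers: a stand-in "mail server" in Python that accepts
connections and stays silent. The function name start_silent_server is made
up for this example, and whether bsmtp really blocks at this stage is
untested.

```python
import socket
import threading


def start_silent_server(host="127.0.0.1", port=0):
    """Listen on host:port, accept connections, and never send a byte.

    Returns the listening socket and the port actually bound
    (port=0 lets the OS pick a free one).
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind((host, port))
    listener.listen(5)
    held = []  # keep accepted sockets open so clients are not disconnected

    def accept_forever():
        while True:
            try:
                conn, _addr = listener.accept()
            except OSError:
                return  # listener was closed
            held.append(conn)

    # Daemon thread: it does not keep the process alive on exit.
    threading.Thread(target=accept_forever, daemon=True).start()
    return listener, listener.getsockname()[1]
```

Pointing the mail program at the returned port and watching whether it ever
gives up would show if the missing timeout is the culprit.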

Thanks,
JS
