Bacula-users

Re: [Bacula-users] SD freezing under FreeBSD 10?

2014-03-09 11:16:46
Subject: Re: [Bacula-users] SD freezing under FreeBSD 10?
From: Kern Sibbald <kern AT sibbald DOT com>
To: Robert Cousins <rec AT Rcousins DOT com>, bacula-users AT lists.sourceforge DOT net
Date: Sun, 09 Mar 2014 16:10:11 +0100
Hello,

I suspect that you may be running into a FreeBSD 10 bug where the
library or kernel sends signal 0 to Bacula.  Since signal 0 is not a
legal signal and should never happen, Bacula doesn't know what to do and
blows itself up.  My suggestions are:

1. Go back to FreeBSD 9.1 (or whatever the prior version was).

or

2. Wait until the end of the month and try the new Bacula version, which
has several workarounds implemented for perceived FreeBSD 10 bugs.  I
haven't personally tested them though.
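
For background on why signal 0 is odd: on POSIX systems it is the
"null signal".  kill(pid, 0) only checks that the target process exists
and that you may signal it; nothing is delivered, so a daemon should
never see it arrive in a handler.  Below is a minimal Python sketch of
that behaviour, assuming a POSIX system and Python 3.8 or later.  The
helper names are made up for illustration, and it does not reproduce
the suspected FreeBSD 10 problem or anything inside Bacula.

```python
import os
import signal


def is_deliverable(signum: int) -> bool:
    """Return True if signum is a real signal number on this platform.

    signal.valid_signals() lists the signal numbers the platform
    accepts; 0 is never among them.
    """
    return signum in signal.valid_signals()


def process_exists(pid: int) -> bool:
    """Probe a process with the null signal (POSIX only).

    os.kill(pid, 0) performs the existence and permission checks
    but delivers nothing to the target process.
    """
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True   # process exists, but we may not signal it
    return True


if __name__ == "__main__":
    print("signal 0 deliverable:", is_deliverable(0))
    print("SIGTERM deliverable:", is_deliverable(signal.SIGTERM))
    print("this process exists:", process_exists(os.getpid()))
```

The same null-signal probe can be used to check whether a daemon whose
PID you know is still alive.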

Best regards,
Kern

On 03/07/2014 05:09 PM, Robert Cousins wrote:
> I'm upgrading my older backup server to a new FreeBSD 10 box. (The new
> machine has lots of resources and I back up to disk.)
>
> After installing the director and storage daemons on the new server, I 
> pointed to the existing file daemons (which I also reconfigured) on the 
> clients spread around the network. I could read status and communicate 
> to/from the daemons. But when I tried to do a backup, nothing seemed to
> work. Instead, I'd get an error such as
>
> 07-Mar 07:26 Colo8 JobId 94: Fatal error: backup.c:1190 Network send 
> error to SD. ERR=Broken pipe
>
> At that point, the CPU usage of the SD would go to zero and disk 
> throughput would go to zero -- and all progress would stop. (That is, if 
> 5 machines were trying to back up and one threw a fatal error, then
> suddenly progress would stop on all 5 machines.) Webmin's display would 
> show that one client had a 'fatal error' and reported the others' status 
> as 'is running'. The various log files simply keep claiming Network 
> errors.  If I cancel the job with the fatal error, then progress starts
> again for a short while until another job gets a fatal error and the 
> system freezes again. These fatal errors can occur after just seconds or 
> after an hour or two. But they occur so often that essentially no backup 
> jobs can hope to complete.
>
> The network is known to be solid, neither overloaded nor undergoing change.
> Other network services run without problems.
>
> I have been fighting with this for several days and have succeeded in 
> getting a total of 1 backup job to run to completion -- on a newly
> built but different FreeBSD 10 server. (The problem occurs even on the 
> backup server.) I've upgraded the FDs to the latest revision. This fails 
> with FDs on FreeBSD, Linux and Windows.
>
> I can use telnet to communicate from each machine to each required port. 
> For now, the machines are running with firewalls down.
>
> The odds are 99% that I've done something stupid. Could someone please 
> point me to the proper FAQ or suggest a debugging strategy?
>
> _______________________________________________
> Bacula-users mailing list
> Bacula-users AT lists.sourceforge DOT net
> https://lists.sourceforge.net/lists/listinfo/bacula-users
>


