Bacula-users

Re: [Bacula-users] watch dog kills running jobs

2013-05-28 06:25:01
From: Konstantin Khomoutov <flatworm AT users.sourceforge DOT net>
To: Mihai Sătmărean <mihai.satmarean AT trixter DOT de>
Date: Tue, 28 May 2013 14:20:59 +0400
On Tue, 28 May 2013 10:37:33 +0200
Mihai Sătmărean <mihai.satmarean AT trixter DOT de> wrote:

> lately we try to save on tapes around 22 TB, and after 6 days and s
> 14-May 18:39 de001bs002-dir JobId 818: Error: Watchdog sending kill after 518425 secs to thread stalled reading File daemon.
> 14-May 18:39 de001bs002-dir JobId 818: Fatal error: Network error with FD during Backup: ERR=Interrupted system call
> 14-May 18:39 de001bs002-dir JobId 818: Fatal error: No Job status returned from FD.
> . .
> .
>    Elapsed time:           6 days 25 secs
> .
> .
> all jobs die after exactly the same time.
> 
> Is there a setting to increase the watch dog period, or to make it
> aware that the job is actually running? Can this be a bug?

This is a frequently asked question.  Bacula has a hardcoded limit of
6 days for a job to complete.  This is not tunable (and supposedly it
never will be): the position of the Bacula developers is that if your
job takes that long to complete, you're doing something wrong.
Consider splitting the task into several jobs.
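
As a quick sanity check, the 518425 seconds reported by the watchdog can
be converted back to days.  The snippet below is only a sketch: the
helper name split_elapsed is made up for illustration and is not part of
Bacula, and the only input taken from your log is the number of seconds.

```python
# Hypothetical helper (not a Bacula API): break a number of seconds
# into days, hours, minutes and seconds.
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def split_elapsed(total_secs):
    days, rem = divmod(total_secs, SECONDS_PER_DAY)
    hours, rem = divmod(rem, 3600)
    minutes, secs = divmod(rem, 60)
    return days, hours, minutes, secs


# Value copied from the "Watchdog sending kill after 518425 secs" log line.
days, hours, minutes, secs = split_elapsed(518425)
print(f"{days} days {hours} h {minutes} min {secs} s")
```

The result is 6 days plus 25 seconds, which agrees with the "Elapsed
time" line in your job report, so the kill came almost immediately after
the 6-day mark.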
