Bacula-users

Re: [Bacula-users] Full backup fails after a few days with "Fatal error: Network error with FD during Backup: ERR=Interrupted system call"

2011-09-26 14:21:29
Subject: Re: [Bacula-users] Full backup fails after a few days with "Fatal error: Network error with FD during Backup: ERR=Interrupted system call"
From: Marcus Hallberg <marcus AT wimlet DOT se>
To: stevecs AT chaven DOT com
Date: Mon, 26 Sep 2011 20:18:38 +0200 (CEST)
Hi!

This is the error message that I got.

 26-Sep 17:15 neo-dir JobId 45532: Error: Watchdog sending kill after 518421 secs to thread stalled reading File daemon.

I converted the seconds into days and got 6.000243056.

I think it would be too much of a coincidence for Bacula to have had a 6-day limit and for something else to kill the job within 21 seconds of that limit.
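
For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The helper name split_days is made up for illustration; the two input values are the watchdog figure from the log line above and the Full Max Run Time value from the job definition quoted further down in this thread.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def split_days(seconds):
    """Return (whole_days, leftover_seconds) for a duration in seconds.

    Hypothetical helper for illustration; not part of Bacula.
    """
    return divmod(seconds, SECONDS_PER_DAY)


watchdog_secs = 518421        # from the "Watchdog sending kill" log line
full_max_run_time = 1036800   # from the quoted job definition

days, leftover = split_days(watchdog_secs)
print(days, leftover)                      # 6 21
print(split_days(full_max_run_time))       # (12, 0)
print(round(watchdog_secs / SECONDS_PER_DAY, 9))
```

So the watchdog fired 21 seconds past the 6-day mark, while the configured Full Max Run Time corresponds to exactly 12 days.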

/marcus
--



/Marcus Hallberg
Wimlet Consulting AB
Gamla Varvsgatan 1
414 59 Göteborg
Tel: 031-3107000
Direct: 031-3107010
e-mail: marcus AT wimlet DOT se
website: www.wimlet.se

From: "Steve Costaras" <stevecs AT chaven DOT com>
To: jma AT schaubroeck DOT be, "R. Leigh Hennig" <rlh1533 AT gmail DOT com>
Cc: bacula-users AT lists.sourceforge DOT net
Sent: Monday, 26 Sep 2011 16:45:40
Subject: Re: [Bacula-users] Full backup fails after a few days with "Fatal error: Network error with FD during Backup: ERR=Interrupted system call"


I'm running 5.0.3 and don't see this 6-day limit for jobs, and I do not have Max Run Time set in the config files. Pretty much all of my full backup jobs run into the 15-30 day range due to the sheer size of the backup and the constant pausing/flushing of the spool.

I would think you're running into a different problem (going through a firewall or some other device that is timing out long-running TCP connections).



-----Original Message-----
From: Jeremy Maes [mailto:jma AT schaubroeck DOT be]
Sent: Monday, September 26, 2011 09:28 AM
To: 'R. Leigh Hennig'
Cc: bacula-users AT lists.sourceforge DOT net
Subject: Re: [Bacula-users] Full backup fails after a few days with "Fatal error: Network error with FD during Backup: ERR=Interrupted system call"

On 26/09/2011 16:01, R. Leigh Hennig wrote:
> Morning,
>
> I have a client that whenever I try to do a full backup, after 6 days,
> the backup fails with this error:
>
> Fatal error: Network error with FD during Backup: ERR=Interrupted
> system call
>
>
> In bacula-dir.conf, for that job definition, I have this:
>
> Full Max Run Time = 1036800
>
> So it should be able to run for up to 12 days, but after the 6th day,
> it's stopping. During that time it writes about 4.7 TB (with another 1
> TB to go). Running CentOS 5.5 with Bacula 5.0.2. Any thoughts?
>
>
> Thanks,
>
Bacula has a hardcoded time limit on jobs of 6 days. Kern called it an
"insanity check" as any job that runs that long isn't really something
you'd want ...

See
http://www.mail-archive.com/bacula-users AT lists.sourceforge DOT net/msg20159.html
for a discussion on the mailing list from the past, and a pointer on
where to change the time limit in the code if you wish.

Last time this was asked on the list someone pointed to a possible
configuration option to override the hardcoded limit that should've been
added by now, but given the 0 responses to that I can't say if it
actually exists.

Regards,
Jeremy

**** DISCLAIMER ****
http://www.schaubroeck.be/maildisclaimer.htm

------------------------------------------------------------------------------
All the data continuously generated in your IT infrastructure contains a
definitive record of customers, application performance, security
threats, fraudulent activity and more. Splunk takes this data and makes
sense of it. Business sense. IT sense. Common sense.
http://p.sf.net/sfu/splunk-d2dcopy1
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users