Re: [Bacula-users] seeking advice re. splitting up large backups -- dynamic filesets to prevent duplicate jobs and reduce backup time

2011-10-13 12:21:02
From: Thomas Lohman <thomasl AT mtl.mit DOT edu>
To: bacula-users AT lists.sourceforge DOT net
Date: Thu, 13 Oct 2011 12:18:58 -0400
> In an effort to work around the fact that bacula kills long-running
> jobs, I'm about to partition my backups into smaller sets. For example,
> instead of backing up:

Since we may end up with jobs that run for more than 6 days, I was
curious to see where in the code (release 5.0.3) this insanity check
happens. Starting from the error message in your previous thread, I
was able to track these checks down to the jcr_timeout_check routine
in jcr.c.

But after a brief look at the code, it looks to me like this only occurs
if the socket connection is essentially stuck and no reads or writes are
occurring over it (which is probably why Kern labeled it an insanity
check). This would explain why other folks have said that they do have
jobs that have run > 6 days. Are you actually seeing an active job get
killed (i.e. it's in the middle of writing data from the client at the
time)? Could it be that the job is in the middle of de-spooling a very
large amount of data (and/or waiting for operator intervention) when
this occurs? I could see that happening, since no traffic is flowing
over the connection to the client, but the job is still active and so
the client connection probably is as well.
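
To make that reading concrete, here is a minimal sketch of how a
stalled-socket check of this kind could be structured. It is not the
Bacula code: SocketTimer, begin_io, end_io and is_stalled are
hypothetical names, and the per-call timer is my assumption about the
design based on the brief look described above.

```cpp
#include <ctime>

// Hypothetical stand-in for a socket's timer state; NOT Bacula's BSOCK.
struct SocketTimer {
    std::time_t timer_start = 0;             // 0 means no read/write is pending
    std::time_t timeout = 6 * 24 * 60 * 60;  // allowed blocking time, in seconds
};

// Assumed to be called when a blocking read or write starts.
inline void begin_io(SocketTimer &s, std::time_t now) { s.timer_start = now; }

// Assumed to be called when the read or write returns; clears the timer.
inline void end_io(SocketTimer &s) { s.timer_start = 0; }

// Watchdog test: true only while a single I/O call has been blocked for
// longer than the timeout. Total job runtime never enters into it.
inline bool is_stalled(const SocketTimer &s, std::time_t now) {
    return s.timer_start != 0 && (now - s.timer_start) > s.timeout;
}
```

Under this model a job that keeps moving data resets its timer on every
call and is never flagged, however long it runs, while a connection that
sits blocked in one call (for example while the other side de-spools)
keeps the same start time and eventually trips the check.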

In any event, if you have access to the source code (5.0.3, which is
what I'm looking at) and are comfortable making changes to it, then I
believe all you need to do is change the 6-day value at line 75 in
lib/bsock.c and at line 687 in lib/bnet.c to something longer. This may
be simpler than re-working your entire backup scheme to avoid the issue.
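
For picking a replacement value, the arithmetic is just days converted
to seconds. The snippet below is only an illustration, assuming the
timeout is expressed in seconds; timeout_seconds and kSecondsPerDay are
hypothetical names, and the exact expression on those two source lines
is not reproduced here.

```cpp
#include <cstdint>

// Seconds in one day.
constexpr std::int64_t kSecondsPerDay = 24 * 60 * 60;

// Hypothetical helper: convert a timeout given in days to seconds.
constexpr std::int64_t timeout_seconds(std::int64_t days) {
    return days * kSecondsPerDay;
}

// The current 6-day limit, checked at compile time.
static_assert(timeout_seconds(6) == 518400, "6 days in seconds");
// A 30-day limit would be 2592000 seconds.
static_assert(timeout_seconds(30) == 2592000, "30 days in seconds");
```

Whatever value you choose, use the same one in both files so the two
places agree.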


--tom



