Re: [Bacula-users] how to debug a job

2015-01-21 19:43:35
Subject: Re: [Bacula-users] how to debug a job
From: Bill Arlofski <waa-bacula AT revpol DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Wed, 21 Jan 2015 19:41:10 -0500
On 01/21/2015 05:13 PM, Dimitri Maziuk wrote:
> (Take 2)
> 
> I've a client with ~316GB to back up. Currently the backup's been
> running for 5 days and wrote 33GB to the spool file. Previous runs
> failed with
> 
>> User specified Job spool size reached: JobSpoolSize=49,807,365,050 
>> MaxJobSpoolSize=49,807,360,000
>> Writing spooled data to Volume. Despooling 49,807,365,050 bytes ...
>> Error: Watchdog sending kill after 518401 secs to thread stalled reading 
>> File daemon.

Hi Dimitri

Bacula has a hard-coded 6-day limit on a job's run time. 518401 seconds =
6.00001157 days, so it appears that the watchdog killed the job because it
hit that limit.
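
The arithmetic is easy to check. A minimal Python sketch, using only the
number from the watchdog error above (the function and variable names are
illustrative, not anything from Bacula):

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def seconds_to_days(seconds):
    """Convert a duration in seconds to fractional days."""
    return seconds / SECONDS_PER_DAY


watchdog_secs = 518401  # value taken from the watchdog error message

print(f"{watchdog_secs} s = {seconds_to_days(watchdog_secs):.8f} days")
print(f"over six days by {watchdog_secs - 6 * SECONDS_PER_DAY} s")
```

In other words, the kill came one second after the six-day mark.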


> Why is it taking 5 days to write 33GB?

That is a good question. :)
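
For a sense of scale, here is a rough Python sketch of the average rate those
numbers imply. It assumes "33GB" means 33 * 10**9 bytes and that "5 days" is
exact, neither of which is stated above, so treat the result as an estimate:

```python
SECONDS_PER_DAY = 86400


def average_rate(bytes_written, elapsed_days):
    """Average throughput in bytes per second."""
    return bytes_written / (elapsed_days * SECONDS_PER_DAY)


# Assumptions: decimal gigabytes, exactly five days elapsed.
rate = average_rate(33 * 10**9, 5)
print(f"average rate: {rate / 1000:.1f} kB/s")

# How long the full ~316GB would take if the rate stayed the same.
days_for_full = 316 * 10**9 / rate / SECONDS_PER_DAY
print(f"316 GB at that rate: {days_for_full:.1f} days")
```

That works out to roughly 76 kB/s, or about 48 days for the whole client,
which is far past the 6-day limit.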

If it spooled and then despooled (wrote) 49,807,365,050 bytes to a volume,
then most things are working...

Does it ask you for a new volume?


> How do I find out what's taking so long? What's the debug level I should
> give to bacula-fd? Where do debug messages go? Anyone knows?


In bconsole, while the job is running, run:

* stat dir

then

* stat storage

Is the job "BLOCKED", waiting for a volume?
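
If you would rather script that check, the following Python sketch is one way
to do it. It assumes bconsole is on your PATH, can reach the Director, and
accepts commands on standard input. The helper names and the sample text are
made up for illustration and are not real Bacula output:

```python
import subprocess


def blocked_lines(status_text):
    """Return the lines of status output that mention BLOCKED."""
    return [line.strip() for line in status_text.splitlines()
            if "BLOCKED" in line]


def bconsole_status(target):
    """Pipe 'status <target>' to bconsole and return what it prints.

    Assumption: bconsole is installed and configured on this machine.
    """
    result = subprocess.run(
        ["bconsole"],
        input=f"status {target}\nquit\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Hypothetical sample text, only to show what the filter does.
sample = "Running Jobs:\n   Device is BLOCKED waiting for a volume\nDone\n"
for line in blocked_lines(sample):
    print(line)
```

On a live system you would call blocked_lines(bconsole_status("dir")) and
blocked_lines(bconsole_status("storage")). If more than one storage is
defined, bconsole may prompt you to pick one, and this sketch does not handle
that.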

I suspect (I am guessing :) that Bacula sent an "operator" email asking for a
new volume after the 49GB was despooled to the volume, but that either bsmtp
was unable to contact your configured MTA or the addresses in your
Messages { } stanzas are misconfigured. In that case you would never have
known that Bacula needed a new volume, and the job sat waiting until the
watchdog killed it.

Check /var/lib/bacula/log (the default location on Gentoo) to see if there is
more information in the job's log.

If nothing is obvious to you, paste the output for that job to the list and
someone should be able to help point out the problem.


Additionally, bsmtp is a one-shot mail delivery program. That means that if it
cannot contact the email server configured with the -h switch, the message
will never be delivered.

I always recommend installing a local MTA (like postfix, which is rock-solid
and takes 5 minutes to install and configure for this purpose) on your Bacula
server(s).  Then configure your Messages { } stanzas to deliver email to the
postfix server running on localhost.  This way, all Bacula emails are
eventually delivered even if there is an intermittent loss of connectivity to
your email server.
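
As a quick sanity check that something is listening where bsmtp will try to
connect, a small Python sketch like this can help. The host and port defaults
(localhost, 25) are assumptions, so adjust them to whatever you pass to -h. It
only tests that a TCP connection can be opened, not that mail actually gets
delivered:

```python
import socket


def mta_reachable(host="localhost", port=25, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unresolvable host, and so on.
        return False


print("local MTA reachable:", mta_reachable())
```

If this prints False for the server named in your mailcommand and
operatorcommand, bsmtp would have had nowhere to hand the message off.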


Bill


-- 
Bill Arlofski
Reverse Polarity, LLC
http://www.revpol.com/
-- Not responsible for anything below this line --
