Bacula-users

Re: [Bacula-users] Backup of all jobs fail if host unavailable

2008-09-02 16:10:09
From: "Botha, Jacques (FNB)" <JacquesB AT fnb.co DOT za>
To: "Dan Langille" <dan AT langille DOT org>
Date: Tue, 2 Sep 2008 22:07:35 +0200
On Tue, 2008-09-02 at 16:01 -0400, Dan Langille wrote:
> Botha, Jacques (FNB) wrote:
> > On Tue, 2008-09-02 at 15:46 -0400, Dan Langille wrote:
> >> Botha, Jacques (FNB) wrote:
> >>> Hi 
> >>>
> >>> All my backups are scheduled for the same time, then queue with the same
> >>> priority, and run one at a time as the previous job finishes.
> >>>
> >>> Today I've got a machine that is unavailable due to a hardware fault.
> >>> Naturally the backup for this machine failed, but so did every backup
> >>> that was queued behind it for all the other machines!
> >>>
> >>> Please help !
> >>>
> >>> I'm running bacula 2.4.2 on CentOS 5.
> >> Perhaps if you supplied the failure messages...
> >>
> > 
> > 
> > Sure
> > 
> > 
> > 2008-09-02 20:15:19 Bacula_Director JobId 175: Fatal error: Max wait time
> > exceeded. Job canceled.
> > 2008-09-02 20:15:19 Bacula_Director JobId 176: Fatal error: Max wait time
> > exceeded. Job canceled.
> > 2008-09-02 20:15:19 Bacula_Director JobId 177: Fatal error: Max wait time
> > exceeded. Job canceled.
> > 
> > And so forth until the last job.
> > 
> > 
> > 
> > Some more config information which might be useful:
> > 
> > Maximum Concurrent Jobs = 1
> > 
> > Each job has Max Wait Time = 10 minutes defined.
> > 
> > 
> > So my understanding is that the unavailable machine would have blocked
> > all other backups for 10 minutes until it timed out, but then they
> > should have continued, not been cancelled as well.
> > 
> > Where am I going wrong ?
> 
> Max Wait Time is perhaps not what you want.  Remove it or reconsider its
> use.
> 

According to the Bacula Manual: 

Max Wait Time = <time> The time specifies the maximum allowed
time that a job may block waiting for a resource (such as waiting
for a tape to be mounted, or waiting for the storage or file daemons
to perform their duties), counted from when the job starts (not
necessarily the same as when the job was scheduled).

So the unavailable machine could block other jobs for 10 minutes.  Why
did the other jobs time out as well?  They had not been started yet,
only scheduled.
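
To make the question concrete, here is a minimal Python sketch of the two
ways the timer could be counted. It is only a model, not Bacula code:
`simulate`, `timer_from_queue`, and the 20:05:19 queue time are hypothetical
names and values, chosen so that the first job's deadline lands on the
20:15:19 seen in the log, and healthy jobs are assumed to take no time at
all. If the timer were counted from the moment a job is queued, every job
behind the unreachable one would hit its deadline at the same instant, which
would match the identical timestamps in the log. I do not know whether 2.4.2
actually behaves this way.

```python
from datetime import datetime, timedelta


def simulate(queued_at, max_wait, n_jobs, timer_from_queue):
    """Model n_jobs queued at the same time and run one at a time.

    Job 1 targets an unreachable client and blocks until its wait timer
    expires. All other jobs are assumed healthy and instantaneous.
    Returns a list of (job_number, outcome, time) tuples.
    """
    results = []
    slot_free = queued_at  # when the single concurrency slot is next free
    for job in range(1, n_jobs + 1):
        start = slot_free
        # Hypothesis switch: does the wait timer begin at queue time or
        # only once the job actually gets the slot?
        timer_start = queued_at if timer_from_queue else start
        deadline = timer_start + max_wait
        if start >= deadline:
            # The timer expired while the job was still waiting in the queue.
            results.append((job, "canceled", start))
        elif job == 1:
            # Unreachable client: the job holds the slot until its deadline.
            results.append((job, "canceled", deadline))
            slot_free = deadline
        else:
            # Healthy job: runs at once and frees the slot immediately.
            results.append((job, "ran", start))
    return results


if __name__ == "__main__":
    queued = datetime(2008, 9, 2, 20, 5, 19)  # assumed queue time
    for from_queue in (True, False):
        print("timer counted from queue time:", from_queue)
        for job, outcome, when in simulate(
            queued, timedelta(minutes=10), 3, from_queue
        ):
            print(f"  job {job}: {outcome} at {when:%H:%M:%S}")
```

In this model, counting from queue time cancels all three jobs at the same
moment, while counting from the actual start cancels only the first job and
lets the others run, which is what I expected from the manual's wording.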

If Max Wait Time is not what I am after, could you please point me in
the right direction?





