Bacula-users

Re: [Bacula-users] file dir computer restarts resulting in mismatched file count error on tape

2008-11-06 06:03:33
From: Arno Lehmann <al AT its-lehmann DOT de>
To: bacula-users AT lists.sourceforge DOT net
Date: Thu, 06 Nov 2008 11:59:26 +0100

Hi,

05.11.2008 21:48, Bob Hetzel wrote:
> I think I may have found a bug or perhaps at least a design limitation 
> in Bacula.  I'm running version 2.4.3.  I use spooling and I back up to 
> tape.  I seem to get periodic errors logged about a file count mismatch 
> resulting in a tape getting marked as status "Error".
> 
> I think I was able to determine the cause of one of these... the backup 
> client running Windows Vista was restarted in the middle of the backup. 
>   The next time that tape is used, the file count mismatch is noted and 
> the tape is marked in error.  Additionally, bacula apparently filled up 
> the tape before the client was restarted, so I don't know for sure if 
> that had something to do with the problem.
> 
> This raises some questions...
> 
> 1) When a client goes away in the middle of a backup, bacula should 
> handle that properly, but it appears to be missing a part of what it 
> would normally do when a backup completes successfully.

Bacula does handle that properly, but it can take a *long* time for 
Bacula to notice the client is gone and, accordingly, cancel the job 
in question.

Once a job is properly canceled, i.e. the SD has stopped working on it 
and notified the DIR, the catalog will reflect the actual contents of 
the tape.

Unfortunately, there seem to be cases where the SD doesn't notice 
that the FD is gone, and thus waits virtually forever for it. I have 
never been able to reproduce that, IIRC.

> 2) In theory, if it can't do a whole backup the files that it does get 
> onto the tape should be recoverable too but I've not checked if it 
> handles it such that they are.

Should work... You'll have to initiate the restore by giving the job 
id of the failed job, though. Automatic selection ignores failed jobs 
(which is reasonable, IMO).
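
As a rough illustration of restoring by explicit job id, here is a small Python helper that builds the console command text. It assumes the jobid= keyword of the console's restore command; the helper name restore_command is made up for this example, and the exact syntax should be checked against the manual for your version.

```python
def restore_command(job_id: int) -> str:
    """Build the text of a console restore command for one explicit JobId.

    Hypothetical helper: it only formats the command string, assuming the
    'jobid=' keyword of the restore command. It does not talk to the Director.
    """
    if job_id <= 0:
        raise ValueError("JobId must be a positive integer")
    return f"restore jobid={job_id}"


if __name__ == "__main__":
    # 123 is a placeholder; use the JobId of the failed job from your catalog.
    print(restore_command(123))
```

The resulting line would be typed into the console; file selection then proceeds as with any other restore.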

> 
> 3) When a tape file count mismatch is found, can't it just correct the 
> mismatch, send an e-mail and move on, w/o marking the tape as status 
> Error when the tape is actually fine?

No, because there might be many reasons for the file count mismatch. 
In most cases it's just an off-by-one situation, but imagine the tape 
in question was used by another Bacula instance or even a different 
program meanwhile... the tape contents wouldn't be what is noted in 
the catalog, and that's a situation where you probably don't want to 
append to or restore from the tape...
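
The reasoning can be put as a toy model in Python. This is only a sketch of the argument, not Bacula's actual code; the function name and the status strings used as plain values are assumptions for illustration.

```python
def volume_status_after_check(catalog_files: int, tape_files: int,
                              current_status: str = "Append") -> str:
    """Toy model of the file count check described above.

    The two counts alone cannot tell a harmless off-by-one (e.g. an
    interrupted job) apart from a tape that other software wrote to,
    so any mismatch is treated the same way: the volume goes to "Error".
    """
    if catalog_files == tape_files:
        # Catalog and tape agree: keep whatever status the volume had.
        return current_status
    # Any difference, small or large, means the catalog can't be trusted.
    return "Error"


if __name__ == "__main__":
    print(volume_status_after_check(42, 42))   # counts agree
    print(volume_status_after_check(42, 43))   # off by one
    print(volume_status_after_check(42, 500))  # tape rewritten elsewhere?
```

The off-by-one case and the badly mismatched case come out the same, which is the point: telling them apart needs a human looking at the tape, not just the numbers.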

Arno

-- 
Arno Lehmann
IT-Service Lehmann
Sandstr. 6, 49080 Osnabrück
www.its-lehmann.de
