I just want to say that I have seen this recently too. I have a set of
LTO-3 tapes that I am now relabeling as I use them because the labels
have become unreadable. All I can add right now is that that particular
server is running Bacula v5.0.3 on Ubuntu.
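
The idea quoted below, unloading every drive before either daemon starts, could be sketched roughly like this. It is only a sketch under assumptions: it parses the text that "mtx status" prints and builds one "mtx unload" command per loaded drive, without running anything. The changer path /dev/sg3, the function name plan_unloads, and the sample status text are all made up for illustration. A drive that reports an unknown source slot would not match the pattern and would need handling by hand.

```python
import re

# Matches a loaded drive line in `mtx status` output, e.g.
# "Data Transfer Element 0:Full (Storage Element 3 Loaded):VolumeTag = A00003L4"
LOADED_DRIVE = re.compile(
    r"Data Transfer Element (\d+):Full \(Storage Element (\d+) Loaded\)"
)


def plan_unloads(status_text, changer="/dev/sg3"):
    """Return one `mtx unload` argv per loaded drive found in status_text.

    `changer` is a hypothetical device path; substitute the real one.
    """
    commands = []
    for line in status_text.splitlines():
        match = LOADED_DRIVE.search(line)
        if match:
            drive, slot = match.group(1), match.group(2)
            # mtx syntax: mtx -f <changer> unload <slot> <drive>
            commands.append(["mtx", "-f", changer, "unload", slot, drive])
    return commands


# Illustrative status text, not captured from a real library.
SAMPLE_STATUS = """\
  Storage Changer /dev/sg3:2 Drives, 4 Slots ( 0 Import/Export )
Data Transfer Element 0:Full (Storage Element 3 Loaded):VolumeTag = A00003L4
Data Transfer Element 1:Empty
      Storage Element 1:Full :VolumeTag=A00001L4
      Storage Element 2:Full :VolumeTag=A00002L4
      Storage Element 3:Empty
      Storage Element 4:Empty
"""

if __name__ == "__main__":
    for argv in plan_unloads(SAMPLE_STATUS):
        print(" ".join(argv))
```

An init script could feed the real "mtx status" output into something like this and run the resulting commands before starting bacula-sd or bacula-dir, so that each daemon begins with empty drives and has to read a label on the first load.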
On 1/24/2012 6:25 PM, mark.bergman AT uphs.upenn DOT edu wrote:
> In the message dated: Tue, 24 Jan 2012 14:30:44 PST,
> The pithy ruminations from Steve Ellis on
> <Re: [Bacula-users] critical error -- tape labels get corrupted, previous
> backups unreadable> were:
> => On 1/24/12 2:22 PM, mark.bergman AT uphs.upenn DOT edu wrote:
> => > In the message dated: Tue, 24 Jan 2012 19:09:15 GMT,
> => > The pithy ruminations from Martin Simmons on
> => >
> => >
> => > Thanks for replying.
> => >
> => >
> => > <Re: [Bacula-users] critical error -- tape labels get corrupted,
> => > previous backups unreadable> were:
> => > => >>>>> On Mon, 23 Jan 2012 18:47:31 -0500, mark bergman said:
> => > => >
> => > => > I'm experiencing a critical problem where tape labels on volumes
> => > => > with data get corrupted, leaving all data on the tape
> => > => > inaccessible to bacula.
> => > => >
> => > => > I'm running bacula 5.2.2 built from source, under Linux (CentOS
> => > => > 5.7 x86_64).
> => > => >
> => > => > This problem has happened with approximately 15 tapes over
> => > => > approximately 6 months, mostly new LTO-4 media, but some LTO-3
> => > => > media that's being reused. The problem is sporadic, appearing in
> => > => > approximately 1 out of 60 tapes per week.
> => > => >
> => > => > I do not think the issue is related to the physical media or the
> => > => > tape drives. One tape was last written successfully when in
> => > => > drive 0, then appears corrupt when a later job tries to use it in
> => > => > drive 1. Another tape was last written successfully when in
> => > => > drive 1, then appears corrupt when a later job tries to use it in
> => > => > drive 0.
> => > =>
> => > => Why do you think it isn't a hardware problem?
> => > =>
> => >
> => > I don't think it's a hardware problem because:
> => >
> => > the vast majority of tape access (read or write) doesn't result
> => > in corrupted labels
> => >
> => > there aren't SCSI, tape, or bacula errors reported during backups
> => > (within Bacula, the OS, or the tape library console)
> => >
> => > the tapes are readable--though the data is not usable by bacula
> => >
> => > the problem occurs on tapes that have been written and read in
> => > both drives (this doesn't rule out some common element in the
> => > tape library)
> => >
> => Perhaps someone else already suggested this and I missed it--this looks
> => like somehow the tapes were rewound behind bacula's back--could that
> => explain the behavior you are seeing?
>
> Thanks for suggesting this. I appreciate the feedback.
>
> Yeah, it would explain the symptom, but if I understand it correctly,
> this would require:
>
> bacula loads a tape with a valid label
>
> writes N backup jobs to the tape
>
> "something" rewinds the tape
>
> bacula writes to the beginning of the tape, corrupting the label (but
> believing the job to be successful)
>
> bacula unloads the tape
>
> at some later point, bacula loads the tape for another
> job and cannot read the label
>
> It is difficult to think of a scenario where "something rewinds" but
> does not unload the tape.
>
> We don't have any software other than bacula that reads/writes from tape.
>
> Attempts to access the tape drives (not the autochanger) manually with 'mt'
> while bacula-sd is running are blocked as bacula-sd has a lock on the tape
> devices.
>
> It is possible to use "mtx" to unload tapes from the drives while bacula is
> running, and I believe that unloading an LTO tape implies that it is rewound.
>
> However, I can't think of any scenario where a tape is unloaded without
> updating the "in changer" flag in the database, and where "update slots" is
> not called after the tape is unloaded, and where bacula tries to append to the
> same tape, and where the tape is loaded without triggering an attempt to read
> the label, and the 'append' therefore overwrites the beginning of the
> tape...but maybe that's possible.
>
> I may just change the bacula-dir and bacula-sd init scripts to call
> mtx-changer and unload all drives before starting either daemon. This would
> help ensure consistency, regardless of which daemon starts first or is later
> restarted.
>
> Thanks,
>
> Mark
>
> => -se
> =>
>
> ------------------------------------------------------------------------------
> Keep Your Developer Skills Current with LearnDevNow!
> The most comprehensive online learning library for Microsoft developers
> is just $99.99! Visual Studio, SharePoint, SQL - plus HTML5, CSS3, MVC3,
> Metro Style Apps, more. Free future releases when you subscribe now!
> http://p.sf.net/sfu/learndevnow-d2d
> _______________________________________________
> Bacula-users mailing list
> Bacula-users AT lists.sourceforge DOT net
> https://lists.sourceforge.net/lists/listinfo/bacula-users