Re: [Bacula-users] Two EOF

2009-05-08 17:28:32
Subject: Re: [Bacula-users] Two EOF
From: Martin Simmons <martin AT lispworks DOT com>
To: cc AT mail.3d DOT hu
Date: Fri, 8 May 2009 22:19:59 +0100
>>>>> On Thu, 07 May 2009 02:55:32 +0200, SZÉKELYI Szabolcs  said:
> 
> Martin Simmons wrote:
> >>>>>> On Sat, 02 May 2009 01:24:57 +0200, SZÉKELYI Szabolcs said:
> >> Martin Simmons wrote:
> >>>>>>>> On Wed, 29 Apr 2009 16:45:17 +0200, SZÉKELYI Szabolcs said:
> >>>> Hi,
> >>>>
> >>>> We configured our tape storage with
> >>>>
> >>>> Two EOF = Yes
> >>>> BSF at EOM = Yes
> >>>>
> >>>> and found that this works fine as long as the tape is not ejected and
> >>>> reloaded. After that, Bacula complains about number of files on the
> >>>> volume not matching that recorded in the catalog.
> >>> What is the number of files on the volume and the catalog in the error
> >>> message?
> >> Bacula says it can find only 686 files on the tape while there should be
> >> 687. Here's the exact log from Bacula:
> >>
> >> ---8<------------------------------------------------
> >> 10-Apr 22:55 bacula-sd JobId 93: 3307 Issuing autochanger
> >>   "unload slot 25, drive 1" command.
> >> 10-Apr 22:57 bacula-dir JobId 93: Using Device "bacula-LTO3-0-Device"
> >> 10-Apr 22:57 bacula-sd JobId 93: 3301 Issuing autochanger
> >>   "loaded? drive 1" command.
> >> 10-Apr 22:57 bacula-sd JobId 93: 3302 Autochanger
> >>   "loaded? drive 1", result: nothing loaded.
> >> 10-Apr 22:57 bacula-sd JobId 93: 3304 Issuing autochanger
> >>   "load slot 21, drive 1" command.
> >> 10-Apr 22:58 bacula-sd JobId 93: 3305 Autochanger
> >>   "load slot 21, drive 1", status is OK.
> >> 10-Apr 22:58 bacula-sd JobId 93: Volume "000021" previously written,
> >>   moving to end of data.
> >> 10-Apr 22:59 bacula-sd JobId 93: Error: Bacula cannot write on tape
> >>   Volume "000021" because:
> >> The number of files mismatch! Volume=686 Catalog=687
> >> 10-Apr 22:59 bacula-sd JobId 93: Marking Volume "000021"
> >>   in Error in Catalog.
> >> 10-Apr 22:59 bacula-sd JobId 93: 3307 Issuing autochanger
> >>   "unload slot 21, drive 1" command.
> >> ---8<------------------------------------------------
> > 
> > It looks like some data has been lost from the tape, maybe because it was
> > overwritten.  That could happen if the Two EOF setting is wrong.
> 
> So bscan and the normal backup process report different numbers of files
> with the same configuration for the same tape. This sounds like a bug in
> one of them.

Not necessarily.  If the configuration is broken, then they might differ as
well.


> >>> Which operating system and version of Bacula?
> >> The OS is CentOS 5.3.
> >>
> >> Since we were unable to find official packages for CentOS, our
> >> installation of Bacula is a simple rebuild of
> >> bacula-2.4.4-1.fc10.src.rpm found in the Fedora source repository.
> > 
> > Why are you using Two EOF with this device?  The recommended setup for Linux
> > systems is
> > 
> > Two EOF = No
> > BSF at EOM = No
> 
> We have an outstanding problem with the low-level tape device that we
> tried to track down. The symptom is:
> 
> 27-Apr 23:19 bacula-sd JobId 473: Error:
>   Unable to position to end of data on device "bacula-LTO3-1-Device"
>     (/dev/nst0): ERR=dev.c:895 ioctl
>       MTEOM error on "bacula-LTO3-1-Device" (/dev/nst0).
>         ERR=Input/output error.
> 
> and the related syslog entries:
> 
> Apr 27 23:19:07 backup.lvs.iif.hu kernel: st0: Current:
>   sense key: Medium Error
> Apr 27 23:19:07 backup.lvs.iif.hu kernel:     Add. Sense:
>   Recorded entity not found
> Apr 27 23:19:07 backup.lvs.iif.hu kernel:
> Apr 27 23:19:07 backup.lvs.iif.hu kernel: Info fld=0x7fff75
> Apr 27 23:19:07 backup.lvs.iif.hu kernel: st0: Current:
>   sense key: Medium Error
> Apr 27 23:19:07 backup.lvs.iif.hu kernel:     Add. Sense:
>   Medium format corrupted
> 
> We managed to reproduce the error using mt. However, when mt stopped, we
> couldn't tell whether it had stopped because of the error above or because
> it had really reached the end of the written data. So we introduced Two EOF
> to distinguish the two cases: if there are two successive EOF marks on the
> tape at the current position, we are at the end of the data; otherwise
> we ran into the error (again).
> 
> Maybe this error is similar to the one reported recently to this list by
> yvan. What do you think?

If it happens with mt, then it sounds like a problem with the driver or
firmware.  Is the driver set up correctly?  (See
http://www.bacula.org/manuals/en/problems/problems/Testing_Your_Tape_Drive.html#SECTION00435000000000000000)

Have you tried setting two-fms in stoptions?

Have you tried

Hardware End of Medium = No?

You could also try

Two EOF = Yes
BSF at EOM = No

but that would be very strange.
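
For reference, all of these directives go in the Device resource of
bacula-sd.conf. Below is a minimal sketch, not a tested configuration. The
device name and /dev/nst0 are taken from your log; the Media Type value and
the other general directives are assumptions and should be replaced with
whatever your existing resource already uses.

```
Device {
  # Name and Archive Device as they appear in the error message above.
  Name = "bacula-LTO3-1-Device"
  Archive Device = /dev/nst0

  # Assumption: must match the Media Type in the Director's Storage resource.
  Media Type = LTO-3

  # Assumed general settings for an autochanger tape drive.
  Autochanger = yes
  Removable Media = yes
  Random Access = no
  Automatic Mount = yes
  Always Open = yes

  # Recommended setup for Linux systems.
  Two EOF = No
  BSF at EOM = No

  # Worth trying while MTEOM fails on this drive.
  Hardware End of Medium = No

  # Last-resort alternative to the two lines above (use one pair only):
  # Two EOF = Yes
  # BSF at EOM = No
}
```

Restart bacula-sd after changing the resource, and run btape's test command
against a scratch tape before writing production volumes with it.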

__Martin
