Bacula-users

Re: [Bacula-users] Re-read of last block OK, but block numbers differ

2008-11-27 14:59:00
Subject: Re: [Bacula-users] Re-read of last block OK, but block numbers differ
From: Martin Simmons <martin AT lispworks DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Thu, 27 Nov 2008 19:53:50 GMT
>>>>> On Thu, 27 Nov 2008 17:19:21 +0000, Allan Black said:
> 
> Hi, all,
> 
> Can anyone help me analyse the problem here? On more than
> one occasion, a DDS3 drive has produced this, when it reaches
> the end of a tape:
> 
> 23-Nov 22:03 gershwin-dir JobId 506: Using Device "DDS3-0"
> 23-Nov 22:06 gershwin-sd JobId 506: End of Volume "MainCatalog-004" at 57:4963 on device "DDS3-0" (/dev/rmt/1cbn). Write of 64512 bytes got 0.
> 23-Nov 22:06 gershwin-sd JobId 506: Error: Re-read of last block OK, but block numbers differ. Last block=4963 Current block=4963.
> 23-Nov 22:06 gershwin-sd JobId 506: End of medium on Volume "MainCatalog-004" Bytes=28,148,843,520 Blocks=436,334 at 23-Nov-2008 22:06.
> 
> Having checked the SD source, I believe what is happening is
> that the SD tried to write block 4963 to the tape, but got EOM
> (End Of Medium) back from the drive. It then re-read the last
> block successfully, but ....
> 
> Because of the EOM, the SD expected that it had failed to write
> block 4963. When it re-read the last block on the tape, it expected
> the block number (which is in the block header) to be 4962, hence
> the error message. However, I think that the drive successfully
> wrote the block, but also returned a SCSI sense indicating that the
> tape was now full.
> 
> I put that tape back in the drive and read it with bls -k, getting
> this:
> 
> [...]
> Block: 4961 size=64512
> Block: 4962 size=64512
> Block: 4963 size=64512
> 26-Nov 13:44 bls JobId 0: End of file 58 on device "dds3" (/dev/rmt/1cbn), Volume "MainCatalog-004"
> 
> Which makes me think that block 4963 has indeed been written to the
> tape.
> 
> However: I also think that that catalog backup has actually been
> corrupted, because I think the SD would write the data from block
> 4963 at the beginning of the new tape, which means if I restored
> that particular backup, I would find a block of data repeated.
> 
> [Since this was "only" a catalog backup, it was no problem to run it
> again manually, so I do actually have a good catalog backup!]
> 
> I do not believe this is a Solaris or SCSI problem; DLT drives (and
> autochangers) work perfectly on the same card (although on a different
> segment of the bus). I suspect (yuck) I may have to experiment with
> the configuration switches on the drive.
> 
> Has anyone come across a similar situation before and, if so, is
> able to point me in the correct direction to debug it? In particular,
> does anyone have any experience of how a SCSI drive is supposed to
> behave at EOM?
> 
> Bacula 2.4.3
> Solaris 10 x86
> HP C1557A DDS-3 autoloader

I would expect the backup to be OK, as long as the catalog says that block 4963
is on the second tape.

Have you tried the btape fill test?  That should check Bacula's logic for this
case.

__Martin
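
The end-of-tape scenario Allan describes can be modelled with a small toy
simulation. This is only a sketch under stated assumptions, not Bacula's
storage daemon code: ToyDrive, classify_reread and run are hypothetical names,
and the drive behaviour modelled here (committing a block while also reporting
end of medium) is Allan's hypothesis, not a confirmed fact about the HP C1557A.

```python
class ToyDrive:
    """Hypothetical tape drive that holds `capacity` blocks before signalling EOM."""

    def __init__(self, capacity, commit_on_eom):
        self.capacity = capacity
        # Assumption: some drives write the block AND report end of medium.
        self.commit_on_eom = commit_on_eom
        self.blocks = []

    def write(self, block_number):
        """Return True on success, False when the drive signals end of medium."""
        if len(self.blocks) < self.capacity:
            self.blocks.append(block_number)
            return True
        if self.commit_on_eom:
            self.blocks.append(block_number)
        return False

    def reread_last(self):
        """Return the block number found in the last block on the tape."""
        return self.blocks[-1]


def classify_reread(failed_block, reread_block):
    """Compare the re-read block number with the block whose write hit EOM."""
    if reread_block == failed_block - 1:
        return "not written"
    if reread_block == failed_block:
        return "written despite EOM"
    return "unexpected"


def run(commit_on_eom):
    """Write blocks until EOM, then re-read the last block on the tape."""
    drive = ToyDrive(capacity=4963, commit_on_eom=commit_on_eom)
    block = 0
    while drive.write(block):
        block += 1
    return block, classify_reread(block, drive.reread_last())


if __name__ == "__main__":
    for commit in (False, True):
        print(commit, run(commit))
```

With commit_on_eom=True the re-read returns the very block whose write
appeared to fail, which is the situation the bls -k listing suggests. Whether
a restore then sees that block twice depends on which volume the catalog
records it against, which is the point made in the reply above.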
