Re: [Bacula-users] spool disk filesystem, checksums

2014-12-05 08:30:39
Subject: Re: [Bacula-users] spool disk filesystem, checksums
From: Josh Fisher <jfisher AT pvct DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Fri, 05 Dec 2014 08:25:07 -0500

On 12/5/2014 8:11 AM, Daniel Pocock wrote:
> On 05/12/14 13:57, Josh Fisher wrote:
>> On 12/5/2014 1:03 AM, Daniel Pocock wrote:
>>> On 05/12/14 00:43, Cejka Rudolf wrote:
>>>> Daniel Pocock wrote (2014/12/04):
>>>>> On 04/12/14 18:35, Kern Sibbald wrote:
>>>>>> On 12/03/2014 08:49 PM, Daniel Pocock wrote:
>>>>>>> Does Bacula checksum content on the spool disk before sending it to tape?
>>>>>>>
>>>>>>> To be more explicit, if there is a single bit error on the spool disk,
>>>>>>> will it be noticed before going onto tape or would it only be noticed in
>>>>>>> future when a file is taken off the tape?
>>>>>> Unless you are running ZFS for the spool disk, the error will only be
>>>>>> noticed when the data is read from the tape.
>>>>>>
>>>>> In that case, it sounds like a good idea to use ZFS or Btrfs with
>>>>> checksums enabled
>>>> Hard drives use error correction/detection codes, so a single-bit error
>>>> without any error indication is unlikely, especially on spool disks,
>>>> where data is read back shortly after it is written.
>>>>
>>> Unfortunately, that is completely untrue.  Disks and IO subsystems do
>>> not provide any guarantees that they will return the exact data that was
>>> written.  That is why modern filesystems have checksums.
>>>
>>> The actual corruption of data often occurs in the writing phase, so
>>> whether you read back the sector in 5 minutes or 5 weeks, it will always
>>> come back with some bit changed.  I've seen this more than once
>>> unfortunately.
>> A write error is not detected without a read. There is no way to know,
>> even at the hardware level, whether a particular area of the disk has
>> the correct magnetization / charge without a subsequent read. Then there
>> are RAM buffers and controllers in between the FS and the disk platter /
>> MLC cell. When the FS detects a checksum error, it really has no way to
>> know whether it was due to an incorrect area of disk or an incorrect bit
>> of RAM, but it knows that it didn't read back what should have been
>> written. Without the hardware error detection, the FS may detect false
>> positives, while without the FS checksum there is no way to detect false
>> negatives. Both hardware level and FS level error detection are
>> required. That is why I think the ZFS claim of "not needing any special
>> hardware" is a bit misleading, or at least depends on the definition of
>> "special hardware".
>
> Should this go in the bug tracker then?  A feature request for Bacula to
> assume the spool disk filesystem may not be using checksums and
> therefore Bacula should checksum content on the spool disk itself when
> handing it off to tape?

Well, that would be a feature request, rather than a bug, but I don't 
think that is needed. The only way to be sure a tape is correct is to 
read it. Bacula already has verify jobs that read the tape and check it 
against the client's file checksums.
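
For illustration only, the idea raised in the quoted feature request (checksum
the data when it is spooled, then re-check it before handing it to tape) can be
sketched in Python. This is not Bacula code and not how the storage daemon
works; the names spool_write, spool_read_verified and flip_bit are hypothetical
and exist only for this example.

```python
import hashlib
import os
import tempfile


def spool_write(path, data):
    """Write data to a spool file and return the SHA-256 digest of that data."""
    with open(path, "wb") as f:
        f.write(data)
    # The digest is taken from the in-memory buffer, so it reflects what
    # *should* have been written, not what actually landed on the disk.
    return hashlib.sha256(data).hexdigest()


def spool_read_verified(path, expected_digest):
    """Read the spool file back and raise if it no longer matches the digest."""
    with open(path, "rb") as f:
        data = f.read()
    if hashlib.sha256(data).hexdigest() != expected_digest:
        raise IOError("spool checksum mismatch: " + path)
    return data


def flip_bit(path, offset=0):
    """Simulate a single-bit error in the spool file (demo helper only)."""
    with open(path, "r+b") as f:
        f.seek(offset)
        byte = f.read(1)
        f.seek(offset)
        f.write(bytes([byte[0] ^ 0x01]))


if __name__ == "__main__":
    spool_dir = tempfile.mkdtemp()
    spool_file = os.path.join(spool_dir, "spool.bin")

    digest = spool_write(spool_file, b"some backup data")
    spool_read_verified(spool_file, digest)  # passes: data is unchanged

    flip_bit(spool_file, offset=5)
    try:
        spool_read_verified(spool_file, digest)
    except IOError as err:
        print("corruption detected before despooling:", err)
```

As noted in the quoted discussion, a mismatch here only says that the data
read back differs from what should have been written; it cannot tell a bad
disk sector from a bad bit of RAM. A read that follows shortly after the write
may also be served from the OS page cache rather than from the disk, so a
check like this does not replace reading the tape back.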

