Bacula-users

Re: [Bacula-users] spool disk filesystem, checksums

2014-12-05 08:13:43
Subject: Re: [Bacula-users] spool disk filesystem, checksums
From: Daniel Pocock <daniel AT pocock DOT pro>
To: Josh Fisher <jfisher AT pvct DOT com>, bacula-users AT lists.sourceforge DOT net
Date: Fri, 05 Dec 2014 14:11:28 +0100
On 05/12/14 13:57, Josh Fisher wrote:
> On 12/5/2014 1:03 AM, Daniel Pocock wrote:
>> On 05/12/14 00:43, Cejka Rudolf wrote:
>>> Daniel Pocock wrote (2014/12/04):
>>>> On 04/12/14 18:35, Kern Sibbald wrote:
>>>>> On 12/03/2014 08:49 PM, Daniel Pocock wrote:
>>>>>> Does Bacula checksum content on the spool disk before sending it to tape?
>>>>>>
>>>>>> To be more explicit, if there is a single bit error on the spool disk,
>>>>>> will it be noticed before going onto tape or would it only be noticed in
>>>>>> future when a file is taken off the tape?
>>>>> Unless you are running ZFS for the spool disk, the error will only be
>>>>> noticed when the data is read from the tape.
>>>>>
>>>> In that case, it sounds like a good idea to use ZFS or Btrfs with
>>>> checksums enabled.
>>> Hard drives use error correction/detection codes, so a single-bit error
>>> without any error indication is unlikely, especially on spool disks,
>>> where data is read back shortly after it is written.
>>>
>> Unfortunately, that is completely untrue.  Disks and IO subsystems do
>> not provide any guarantees that they will return the exact data that was
>> written.  That is why modern filesystems have checksums.
>>
>> The actual corruption of data often occurs in the writing phase, so
>> whether you read back the sector in 5 minutes or 5 weeks, it will always
>> come back with some bit changed.  I've seen this more than once
>> unfortunately.
> A write error is not detected without a read. There is no way to know, 
> even at the hardware level, whether a particular area of the disk has 
> the correct magnetization / charge without a subsequent read. Then there 
> are RAM buffers and controllers in between the FS and the disk platter / 
> MLC cell. When the FS detects a checksum error, it really has no way to 
> know whether it was due to an incorrect area of disk or an incorrect bit 
> of RAM, but it knows that it didn't read back what should have been 
> written. Without the hardware error detection, the FS may detect false 
> positives, while without the FS checksum there is no way to detect false 
> negatives. Both hardware level and FS level error detection are 
> required. That is why I think the ZFS claim of "not needing any special 
> hardware" is a bit misleading, or at least depends on the definition of 
> "special hardware".


Should this go in the bug tracker then, as a feature request?  Bacula
would assume the spool disk filesystem may not be using checksums, and
would therefore checksum content itself as it writes it to the spool disk
and verify that checksum when handing the content off to tape.
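
To make the idea concrete, here is a minimal sketch in Python of checksumming
blocks as they are written to a spool file and verifying them when they are
read back for despooling. It is not Bacula code and does not reflect how the
storage daemon actually spools; the names spool_write, despool_verified and
SpoolCorruptionError are hypothetical, and SHA-256 is just one possible
choice of checksum.

```python
import hashlib


class SpoolCorruptionError(Exception):
    """Raised when a block read back from the spool differs from what was written."""


def spool_write(path, blocks):
    """Write blocks to the spool file at `path`.

    Returns a list of (length, sha256 digest) pairs, one per block,
    computed from the in-memory data before it reaches the disk.
    """
    digests = []
    with open(path, "wb") as spool:
        for block in blocks:
            spool.write(block)
            digests.append((len(block), hashlib.sha256(block).digest()))
    return digests


def despool_verified(path, digests):
    """Yield blocks read back from the spool file, checking each one.

    Raises SpoolCorruptionError before a bad block is yielded, so a
    caller writing to tape never receives it.
    """
    with open(path, "rb") as spool:
        for index, (length, expected) in enumerate(digests):
            block = spool.read(length)
            if hashlib.sha256(block).digest() != expected:
                raise SpoolCorruptionError(
                    "spool block %d failed checksum" % index
                )
            yield block
```

In this sketch the digests are held in RAM between spooling and despooling,
so it catches corruption introduced anywhere between the write call and the
later read, but, as noted above, it cannot tell whether the disk, the
controller or a buffer was at fault.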


