Re: [Bacula-users] spool disk filesystem, checksums

2014-12-05 08:03:50
Subject: Re: [Bacula-users] spool disk filesystem, checksums
From: Josh Fisher <jfisher AT pvct DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Fri, 05 Dec 2014 07:57:52 -0500
On 12/5/2014 1:03 AM, Daniel Pocock wrote:
>
> On 05/12/14 00:43, Cejka Rudolf wrote:
>> Daniel Pocock wrote (2014/12/04):
>>> On 04/12/14 18:35, Kern Sibbald wrote:
>>>> On 12/03/2014 08:49 PM, Daniel Pocock wrote:
>>>>> Does Bacula checksum content on the spool disk before sending it to tape?
>>>>>
>>>>> To be more explicit, if there is a single bit error on the spool disk,
>>>>> will it be noticed before going onto tape or would it only be noticed in
>>>>> future when a file is taken off the tape?
>>>> Unless you are running ZFS for the spool disk, the error will only be
>>>> noticed when the data is read from the tape.
>>>>
>>> In that case, it sounds like a good idea to use ZFS or Btrfs with
>>> checksums enabled
>> Hard drives use error correction/detection codes, so a single-bit error
>> without any error indication is unlikely, especially in the case of spool
>> disks, where data is read shortly after it is written.
>>
> Unfortunately, that is completely untrue.  Disks and IO subsystems do
> not provide any guarantees that they will return the exact data that was
> written.  That is why modern filesystems have checksums.
>
> The actual corruption of data often occurs in the writing phase, so
> whether you read back the sector in 5 minutes or 5 weeks, it will always
> come back with some bit changed.  I've seen this more than once
> unfortunately.

A write error cannot be detected without a read. Even at the hardware
level, there is no way to know whether a particular area of the disk has
the correct magnetization / charge until it is read back. There are also
RAM buffers and controllers between the FS and the disk platter / MLC
cell. When the FS detects a checksum error, it has no way to know whether
the cause was a bad area of disk or a bad bit of RAM; it only knows that
it did not read back what should have been written. Without hardware
error detection, the FS may report false positives (checksum failures on
data that is actually correct on disk), while without the FS checksum
there is no way to catch false negatives (errors the hardware did not
flag). Both hardware-level and FS-level error detection are required.
That is why I think the ZFS claim of "not needing any special hardware"
is a bit misleading, or at least depends on the definition of "special
hardware".
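
To illustrate the "no detection without a read" point, here is a minimal
Python sketch of checksum-on-write, verify-on-read for a spool file. This
is not how Bacula handles its spool; the function names (spool_write,
spool_verify, file_digest) and the choice of SHA-256 are my own
assumptions for illustration. Note also that a read issued shortly after
the write may be served from the OS page cache rather than the platter,
so a check like this does not by itself prove the on-disk copy is good.

```python
import hashlib
import os

CHUNK = 1 << 20  # read in 1 MiB pieces so large spool files are not loaded whole


def file_digest(path):
    """Return the SHA-256 hex digest of the file's current contents."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(CHUNK), b""):
            h.update(chunk)
    return h.hexdigest()


def spool_write(path, data):
    """Write data to a (hypothetical) spool file.

    The returned digest is computed from the in-memory buffer, i.e. from
    what we intended to write, not from what ended up on disk.
    """
    expected = hashlib.sha256(data).hexdigest()
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data to the device
    return expected


def spool_verify(path, expected):
    """Re-read the spool file and compare with the digest taken at write time."""
    return file_digest(path) == expected
```

A caller would keep the digest returned by spool_write() and call
spool_verify() just before sending the data to tape. A mismatch says only
that what was read differs from what was meant to be written; it does not
say whether the disk, the controller or RAM was at fault.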

>
> This looks at some of the issues:
>
> https://blogs.oracle.com/bonwick/entry/zfs_end_to_end_data
>
> _______________________________________________
> Bacula-users mailing list
> Bacula-users AT lists.sourceforge DOT net
> https://lists.sourceforge.net/lists/listinfo/bacula-users

