Bacula-users

Re: [Bacula-users] spool disk filesystem, checksums

2014-12-06 03:34:44
From: Kern Sibbald <kern AT sibbald DOT com>
To: Daniel Pocock <daniel AT pocock DOT pro>, Josh Fisher <jfisher AT pvct DOT com>, bacula-users AT lists.sourceforge DOT net
Date: Sat, 06 Dec 2014 09:29:23 +0100
On 12/05/2014 02:11 PM, Daniel Pocock wrote:
> On 05/12/14 13:57, Josh Fisher wrote:
>> On 12/5/2014 1:03 AM, Daniel Pocock wrote:
>>> On 05/12/14 00:43, Cejka Rudolf wrote:
>>>> Daniel Pocock wrote (2014/12/04):
>>>>> On 04/12/14 18:35, Kern Sibbald wrote:
>>>>>> On 12/03/2014 08:49 PM, Daniel Pocock wrote:
>>>>>>> Does Bacula checksum content on the spool disk before sending it to 
>>>>>>> tape?
>>>>>>>
>>>>>>> To be more explicit, if there is a single bit error on the spool disk,
>>>>>>> will it be noticed before going onto tape or would it only be noticed in
>>>>>>> future when a file is taken off the tape?
>>>>>> Unless you are running ZFS for the spool disk, the error will only be
>>>>>> noticed when the data is read from the tape.
>>>>>>
>>>>> In that case, it sounds like a good idea to use ZFS or Btrfs with
>>>>> checksums enabled
>>>> Hard drives use error correction/detection codes, so a single-bit error
>>>> without any error indication is unlikely, especially in the case of spool
>>>> disks, where data is read shortly after it is written.
>>>>
>>> Unfortunately, that is completely untrue.  Disks and IO subsystems do
>>> not provide any guarantees that they will return the exact data that was
>>> written.  That is why modern filesystems have checksums.
>>>
>>> The actual corruption of data often occurs in the writing phase, so
>>> whether you read back the sector in 5 minutes or 5 weeks, it will always
>>> come back with some bit changed.  I've seen this more than once
>>> unfortunately.
>> A write error is not detected without a read. There is no way to know, 
>> even at the hardware level, whether a particular area of the disk has 
>> the correct magnetization / charge without a subsequent read. Then there 
>> are RAM buffers and controllers in between the FS and the disk platter / 
>> MLC cell. When the FS detects a checksum error, it really has no way to 
>> know whether it was due to an incorrect area of disk or an incorrect bit 
>> of RAM, but it knows that it didn't read back what should have been 
>> written. Without the hardware error detection, the FS may detect false 
>> positives, while without the FS checksum there is no way to detect false 
>> negatives. Both hardware level and FS level error detection are 
>> required. That is why I think the ZFS claim of "not needing any special 
>> hardware" is a bit misleading, or at least depends on the definition of 
>> "special hardware".
>
> Should this go in the bug tracker then?  A feature request for Bacula to
> assume the spool disk filesystem may not be using checksums and
> therefore Bacula should checksum content on the spool disk itself when
> handing it off to tape?

If you look at the statistics for where the SD has performance problems,
you will find that it spends a very large amount of time in the crc32
algorithm creating block checksums.  Were it an SHA1 or SHA-256
algorithm, it would be even more expensive.  Thus I am not too
enthusiastic about adding checksums to the spool writing/reading
code.  If you need or want this kind of checking, it seems to me much
simpler to use a checksumming filesystem such as ZFS or btrfs.  The
performance hit may well be much less than if the code were done in
Bacula, because the kernel is more multithreaded, and more importantly,
it will allow Bacula developers to concentrate on adding new features.

Best regards,
Kern
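
[Editor's sketch, not part of Kern's message and not Bacula code.] To make the
trade-off discussed above concrete, here is a minimal Python illustration of
what application-level checksumming of spool blocks could look like, plus a
rough way to compare the cost of crc32 against SHA-1 and SHA-256 on your own
hardware. The block layout (a length and CRC32 header in front of each payload)
and the names frame_block, read_block, flip_bit and seconds_per_pass are all
hypothetical, chosen for illustration only; Bacula's real spool format is
different.

```python
import hashlib
import struct
import time
import zlib

# Hypothetical header layout: payload length, then CRC32 of the payload.
HEADER = struct.Struct("<II")


def frame_block(payload: bytes) -> bytes:
    """Prefix a payload with its length and CRC32 before spooling it."""
    return HEADER.pack(len(payload), zlib.crc32(payload)) + payload


def read_block(framed: bytes) -> bytes:
    """Return the payload, or raise ValueError if it fails its CRC32 check."""
    length, expected = HEADER.unpack_from(framed)
    payload = framed[HEADER.size:HEADER.size + length]
    if len(payload) != length or zlib.crc32(payload) != expected:
        raise ValueError("spool block failed checksum verification")
    return payload


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Simulate a single-bit error at the given bit offset."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


def seconds_per_pass(digest, data: bytes, repeats: int = 5) -> float:
    """Average wall-clock time for one call of digest(data)."""
    start = time.perf_counter()
    for _ in range(repeats):
        digest(data)
    return (time.perf_counter() - start) / repeats


if __name__ == "__main__":
    # Timings depend entirely on the machine, so no expected figures are given.
    sample = bytes(16 * 1024 * 1024)
    candidates = [
        ("crc32", zlib.crc32),
        ("sha1", lambda d: hashlib.sha1(d).digest()),
        ("sha256", lambda d: hashlib.sha256(d).digest()),
    ]
    for name, fn in candidates:
        print(f"{name}: {seconds_per_pass(fn, sample):.4f} s per 16 MiB")
```

With this layout, a bit flipped in the payload while it sits on the spool disk
is caught by read_block before the data would be handed on to tape. The cost is
one extra checksum pass on write and one on read, which is the overhead Kern is
reluctant to add to the SD.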



_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users