BackupPC-users

Re: [BackupPC-users] Backing up a BackupPC server

2009-06-04 05:42:58
Subject: Re: [BackupPC-users] Backing up a BackupPC server
From: Tino Schwarze <backuppc.lists AT tisc DOT de>
To: backuppc-users AT lists.sourceforge DOT net
Date: Thu, 4 Jun 2009 11:35:46 +0200

Hi there,

(I already felt like I was going to look dumb or anxious by writing what
I wrote...)

On Wed, Jun 03, 2009 at 01:09:38PM -0400, Jeffrey J. Kosowsky wrote:
> Tino Schwarze wrote at about 18:39:26 +0200 on Wednesday, June 3, 2009:
>  > > > I recently heard about lessfs, which runs on top of FUSE to provide
>  > > > a file system that does block-level de-duplication.  See:
>  > > > 
>  > > >     http://www.lessfs.com
>  > > >     https://sourceforge.net/project/showfiles.php?group_id=257120
>  > > >     http://tokyocabinet.sourceforge.net/index.html
>  > > > 
>  > > > The actual storage is several very large (sparse?) files on any
>  > > > file system(s) of your choice.  It should provide all the benefits
>  > > > you expect: no issues of local limitations on hardlink counts,
>  > > > meta-data etc, and the database files can be copied or rsynced.
>  > > > I'm corresponding with the author to see if some additional useful
>  > > > features could be added.
>  > 
>  > Well, we've already got MD4 checksums of file blocks. And if I
>  > understand everything correctly, we DO GET collisions, therefore the
>  > hash chains.
> 
> First, the hash chains are based on *partial* file *md5* (not md4)
> sums.
> 
> Second, the collisions only occur because the hash is only done on the first
> and eighth (or last for small files) 128K block. So, obviously you will
> have collisions for large files that have the same first and eighth
> block. 

That was the first flaw in my thinking... So I would have to scan my
pool and compare the first and eighth 128k blocks (e.g. 0-128k and
1M-1M+128k, or is it 896k-1M?) for matches? Maybe I'll try that, out of
sheer curiosity (if I find the time to script it).
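
A scan like that could be sketched roughly as follows. This is only a
minimal sketch under stated assumptions, not BackupPC's own digest
routine: it assumes the "eighth block" means bytes 896k-1M, it takes the
last 128k for files under 1M, it reads files as plain uncompressed data
(a compressed pool would have to be decompressed first), and the names
partial_key and find_partial_matches are made up for illustration.

```python
import hashlib
import os
from collections import defaultdict

BLOCK = 128 * 1024  # 128 KiB


def partial_key(path):
    """MD5 over the first block plus the eighth (or last) 128 KiB block.

    Assumption: the eighth block is bytes 896k-1M; files smaller than
    1 MiB use their last 128 KiB instead.
    """
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        first = f.read(BLOCK)
        if size >= 8 * BLOCK:
            f.seek(7 * BLOCK)
        else:
            f.seek(max(0, size - BLOCK))
        second = f.read(BLOCK)
    return hashlib.md5(first + second).hexdigest()


def find_partial_matches(root):
    """Return lists of paths under root that share the same partial key."""
    groups = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            groups[partial_key(path)].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Each returned group holds files whose sampled blocks agree; whether
their full contents differ would still need a byte-by-byte comparison
(e.g. with filecmp.cmp(a, b, shallow=False)).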

>  > Of course, this is for 256k blocks, IIRC. And "only" 128 bit hashes.
>  > But I don't like the idea of relying on probabilities. I've got enough
>  > uncertainties by flaky hardware, bugs etc.
> 
> We rely on probabilities in all aspects of life. Nothing is certain.

I know that. Sometimes I'm paranoid - I just like to get rid of
probabilities (=uncertainties) where possible. 

> It all depends on the probability... I would much prefer to take the
> risk of a mathematically known infinitesimal probability (of the order
> of md5 hash collisions) than what most people in life take for granted
> as "absolute" fact. At least with a mathematically modeled system you
> know the risk which is more than most of us know about most other
> elements of our systems.
> 
>  > I won't trust such a file system for backup data.
> 
> Making blanket statements like that shows a lack of understanding of
> probability vs. certainty in the world. 

Well, I just said, *I* won't trust such a file system. It's just a
gut feeling. Something which isn't logical or anything.

> If for example, the probability of a collision is many orders of
> magnitude less than the probability of you losing all your backups
> then I wouldn't worry about it. It all depends on the probability...

The bad thing about probabilities is that they don't tell you anything
about what will happen, just about what might happen. Even if the
probability is very, very, very, very small, it doesn't mean it will
not instantly happen the next second. It's just very unlikely.
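
To put a rough number on "very unlikely", the usual birthday-bound
approximation can be evaluated directly. This is a sketch that assumes
an ideal, uniformly distributed 128-bit hash and purely accidental
collisions (it says nothing about deliberately crafted MD5 collisions);
the function name and the block count in the usage note are
illustrative, not taken from any real pool.

```python
import math


def collision_probability(n_items, hash_bits=128):
    """Approximate chance that any two of n_items random hashes collide.

    Birthday bound: p ~ 1 - exp(-n * (n - 1) / 2**(hash_bits + 1)).
    """
    exponent = -n_items * (n_items - 1) / 2.0 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)
```

For example, collision_probability(10_000_000) gives a value on the
order of 1e-25 for ten million distinct blocks.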

Tino.

-- 
"What we nourish flourishes." - "Was wir nähren erblüht."

www.lichtkreis-chemnitz.de
www.craniosacralzentrum.de
