BackupPC-users

Re: [BackupPC-users] Backing up a BackupPC server

2009-06-03 13:18:38
Subject: Re: [BackupPC-users] Backing up a BackupPC server
From: "Jeffrey J. Kosowsky" <backuppc AT kosowsky DOT org>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Wed, 03 Jun 2009 13:09:38 -0400
Tino Schwarze wrote at about 18:39:26 +0200 on Wednesday, June 3, 2009:
 > > > I recently heard about lessfs, which runs on top of FUSE to provide
 > > > a file system that does block-level de-duplication.  See:
 > > > 
 > > >     http://www.lessfs.com
 > > >     https://sourceforge.net/project/showfiles.php?group_id=257120
 > > >     http://tokyocabinet.sourceforge.net/index.html
 > > > 
 > > > The actual storage is several very large (sparse?) files on any
 > > > file system(s) of your choice.  It should provide all the benefits
 > > > you expect: no issues of local limitations on hardlink counts,
 > > > meta-data etc, and the database files can be copied or rsynced.
 > > > I'm corresponding with the author to see if some additional useful
 > > > features could be added.
 > 
 > Well, we've already got MD4 checksums of file blocks. And if I
 > understand everything correctly, we DO GET collisions, therefore the
 > hash chains.

First, the hash chains are based on *partial* file *md5* (not md4)
sums.

Second, the collisions occur only because the hash is computed over just
the first and eighth (or last, for small files) 128K blocks. So obviously
you will get collisions between large files that share the same first and
eighth blocks. These are not true md5 collisions -- i.e., two different
blocks that have the same md5 sum. In fact, I would bet almost anything
that no user of BackupPC has run into a case where two members of a hash
chain have *any* differences in their first and eighth (or last, for
files <1MB) 128K blocks -- i.e., I challenge anybody to find a true md5
collision in BackupPC data that was not artificially constructed.
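
To illustrate why a partial-file digest produces chain collisions without
any true md5 collision, here is a minimal Python sketch. It is not
BackupPC's actual code: the function name partial_md5, the in-memory
bytes input, and the exact handling of files shorter than two blocks are
assumptions made for the example. It only follows the description above
(first 128K block plus the eighth, or the last for files under 1MB).

```python
import hashlib

BLOCK = 128 * 1024  # 128K block size, per the description above


def partial_md5(data: bytes) -> str:
    """Hypothetical partial-file digest: first block plus eighth (or last) block."""
    h = hashlib.md5()
    h.update(data[:BLOCK])                      # first 128K block
    if len(data) >= 8 * BLOCK:                  # file of 1MB or more
        h.update(data[7 * BLOCK:8 * BLOCK])     # eighth 128K block
    elif len(data) > BLOCK:                     # smaller file: use the last 128K
        h.update(data[-BLOCK:])
    return h.hexdigest()


# Two 2MB files that differ only at offset 1.5MB, outside both hashed blocks.
file_a = bytes(2 * 1024 * 1024)
tmp = bytearray(file_a)
tmp[1536 * 1024] = 1
file_b = bytes(tmp)

# A third file that differs inside the first block.
tmp = bytearray(file_a)
tmp[0] = 1
file_c = bytes(tmp)

same_partial = partial_md5(file_a) == partial_md5(file_b)
same_full = hashlib.md5(file_a).hexdigest() == hashlib.md5(file_b).hexdigest()
differs_in_hashed_block = partial_md5(file_a) != partial_md5(file_c)
print(same_partial, same_full, differs_in_hashed_block)
```

Here file_a and file_b land in the same chain because their hashed blocks
are identical, even though their full md5 sums differ; file_c differs in
a hashed block and so gets a different partial digest.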

 > Of course, this is for 256k blocks, IIRC. And "only" 128 bit hashes.
 > But I don't like the idea of relying on probabilities. I've got enough
 > uncertainties by flaky hardware, bugs etc.

We rely on probabilities in all aspects of life. Nothing is certain.
It all depends on the probability... I would much rather accept a
mathematically known, infinitesimal risk (on the order of the
probability of an md5 hash collision) than rely on what most people
take for granted as "absolute" fact. At least with a mathematically
modeled system you know the risk, which is more than most of us know
about most other elements of our systems.
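
As a rough worked example of what "mathematically known" means here, the
sketch below uses the standard birthday approximation for a 128-bit hash.
It assumes an idealized hash with uniformly random outputs and so says
nothing about deliberately constructed md5 collisions; the function name
and the figure of one billion blocks are arbitrary choices for
illustration, not numbers from any real pool.

```python
import math


def collision_probability(n_items: int, hash_bits: int = 128) -> float:
    """Birthday approximation: p ~= 1 - exp(-n(n-1) / 2^(bits+1))."""
    exponent = n_items * (n_items - 1) / 2 ** (hash_bits + 1)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-exponent)


# Illustrative only: one billion distinct blocks under an ideal 128-bit hash.
p = collision_probability(10 ** 9)
print(f"{p:.3e}")
```

Under these assumptions the result is on the order of 1e-21, which can
then be compared against whatever failure rates you assume for disks,
memory, or software.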

 > I won't trust such a file system for backup data.

Making blanket statements like that shows a lack of understanding of
probability vs. certainty in the world. If, for example, the
probability of a collision is many orders of magnitude less than the
probability of you losing all your backups, then I wouldn't worry about
it. It all depends on the probability...

 > 
 > Tino.
 > 
 > -- 
 > "What we nourish flourishes." - "Was wir nähren erblüht."
 > 
 > www.lichtkreis-chemnitz.de
 > www.craniosacralzentrum.de
 > 
 > ------------------------------------------------------------------------------
 > OpenSolaris 2009.06 is a cutting edge operating system for enterprises 
 > looking to deploy the next generation of Solaris that includes the latest 
 > innovations from Sun and the OpenSource community. Download a copy and 
 > enjoy capabilities such as Networking, Storage and Virtualization. 
 > Go to: http://p.sf.net/sfu/opensolaris-get
 > _______________________________________________
 > BackupPC-users mailing list
 > BackupPC-users AT lists.sourceforge DOT net
 > List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
 > Wiki:    http://backuppc.wiki.sourceforge.net
 > Project: http://backuppc.sourceforge.net/
