Extensive top-post trail deleted.
On 10/05/2011 02:39 AM, Daniel Sparrman wrote:
As with the hash conflict, the DD uses SHA-1 with a variable block
length for deduplication. Theoretically, there is a 1 in 2^160 chance it
will happen. Doesn't seem to be that bad, but your first hash
collision is randomly more likely to happen than that number
suggests.
I agree with your technical analysis, and I feel your disquiet. Waay
back in the '80s, I brought an (8mm :) tape to a meeting with a dept
official to say "One chance in a billion means to me that there are
five broken files on this tape". The topic then was "should we make
copies of these?"
But I feel that you express these numbers in a vacuum which misleads.
The appropriate judgement has to be, not "Is an error possible?", but
"How risky is this?"; and that risk has to be compared to the other
risks you're taking.
I feel that you are focused on the unpredictably large impact of a
collision. "All my backups are gone!" is emotionally accessible to
any of us, and makes me shudder. But that scenario is not a plausible
result of a hash collision. Not that the reality is peachy: "Some
difficult-to-identify set of my files are now corrupt" is quite bad
enough, thank you.
A 1/10^30 risk just doesn't have the same emotional availability. But
the homeopathic chances of it happening ought to temper the
resistance.
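
For anyone who wants to put rough numbers on this, here is a minimal
sketch of the usual birthday-bound approximation for a 160-bit hash
such as SHA-1. It is not anything the DD itself computes; the function
names and the 8 KiB average block size are my own assumptions for
illustration, and the real variable block sizes will differ.

```python
import math

HASH_BITS = 160  # SHA-1 digest size in bits


def collision_probability(num_blocks: int, hash_bits: int = HASH_BITS) -> float:
    """Birthday-bound approximation for at least one collision.

    p ~= 1 - exp(-n * (n - 1) / 2^(hash_bits + 1))
    Assumes hash outputs are uniformly distributed (an idealization).
    """
    exponent = -num_blocks * (num_blocks - 1) / 2.0 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is vanishingly small.
    return -math.expm1(exponent)


def blocks_stored(capacity_bytes: int, avg_block_bytes: int) -> int:
    """Number of unique blocks, given a hypothetical average block size."""
    return capacity_bytes // avg_block_bytes


if __name__ == "__main__":
    avg_block = 8 * 1024  # assumed average block size, not a vendor figure
    for tib in (1, 8, 100, 10_000):
        n = blocks_stored(tib * 2**40, avg_block)
        p = collision_probability(n)
        print(f"{tib:>6} TiB unique data, {n:.3e} blocks: p ~ {p:.3e}")
```

Under those assumptions, 8 TiB of unique data is 2^30 blocks, and the
estimate comes out below 10^-30. The point of the quoted warning also
shows up here: the probability grows with the square of the block
count, so it is much larger than 1 in 2^160, yet still tiny next to
the other risks in play.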
I would invoke the analogy of driving your car across the country
vs. taking an airplane; many are paralyzed by the risks of air travel,
when the actuaries will tell you with great precision that you've a
better chance of dying in the drive _to the airport_ than once you've
taken off. Similarly, I'd guess that more DD failures have happened
due to physical violence than due to hash collisions.
- Allen S. Rout