Re: DAT hardware or software compression
2002-11-22 15:10:22
On Friday 22 November 2002 16:28, Gene Heskett wrote:
> Which should be reason enough to see if the use of bzip2 could be
> incorporated into amanda. AIUI, bz2 can re-synch, losing only the
> actual file that the error effected.
Without even looking at the matter I can tell you that this is incorrect.
A bzipped tar file is simply a compressed stream of data, and bzip2 has no
understanding of the structure of a tar file. So even if it could re-synch,
it couldn't do so to a granularity of one file in a tar file, and hence
couldn't "lose only the actual file that the error affected".
But enough talk - how about some action? I took a bzipped tar file and made a
random single bit change in it. Such a corrupted bzipped tar behaves just
like a corrupted gzipped tar file - ticks along fine until you hit the error
and then you get tar errors like:
tar: Skipping to next header
tar: Archive contains obsolescent base-64 headers
followed by bzip errors like:
bzip2: Data integrity error when decompressing.
Input file = (stdin), output file = (stdout)
bzip2 tells you that:
You can use the `bzip2recover' program to attempt to recover
data from undamaged sections of corrupted files.
so I did that - it produces a number of individual .bz2 files. Each of those
can be bunzipped (except for the one which contained the error - you get the
same error message, and the suggestion to use bzip2recover) and you can then
use cat to glue them back together and throw them through tar like:
cat rec00001file.tar rec00002file.tar . . . . recnnnnfile.tar | tar tf -
which has about the same effect as simply running tar tjf against the original
corrupt file - tar is happy until it reaches the missing corrupt section but
the gap is too much for it to bridge.
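For anyone who wants to repeat the experiment without sacrificing a real
backup, here is a minimal Python sketch. It is not the procedure used above
(that was done with the bzip2 and tar command-line tools); it builds a small
archive in memory and flips one bit in the middle of the compressed data. The
helper names make_tar_bz2, flip_bit and decompresses_cleanly, and the file
names and sizes, are made up for illustration.

```python
import bz2
import io
import random
import tarfile


def make_tar_bz2(n_files=5, size=20000, seed=0):
    """Build a small bzip2-compressed tar archive in memory (dummy data)."""
    rng = random.Random(seed)
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for i in range(n_files):
            payload = bytes(rng.randrange(256) for _ in range(size))
            info = tarfile.TarInfo(name=f"file{i}.dat")
            info.size = len(payload)
            tar.addfile(info, io.BytesIO(payload))
    return bz2.compress(buf.getvalue())


def flip_bit(data, bit_index):
    """Return a copy of data with exactly one bit inverted."""
    out = bytearray(data)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)


def decompresses_cleanly(data):
    """True if bz2 can decompress data without reporting an error."""
    try:
        bz2.decompress(data)
        return True
    except (OSError, ValueError):
        # OSError: invalid data / CRC mismatch; ValueError: stream ended early
        return False


archive = make_tar_bz2()
# Flip a single bit roughly in the middle of the compressed stream.
damaged = flip_bit(archive, (len(archive) // 2) * 8)

print("pristine archive ok:", decompresses_cleanly(archive))
print("damaged archive ok: ", decompresses_cleanly(damaged))
```

The pristine archive should decompress and the damaged one should be
rejected, which is the same situation as the "Data integrity error" above:
bzip2 notices the damage but has no way of repairing it.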
> Food for thought, since bz2 can recover a dropped bit or several...
But for our purposes, what it does is unfortunately about as much use as a
rubber hacksaw blade. So if you're itching to spend some time hacking on
amanda, adding bzip2 support is not going to help with bit errors on tapes.
But there again, this is why you use enough tapes to always have a couple of
level 0s to hand - to be sure, to be sure.
The only way to recover from single bit errors is to use an error-correcting
code (ECC), which is what decent tape drives do at a hardware level anyway.
bzip2 or a similar compression program could of course add ECC, but this
would make the compressed file bigger, because TANSTAAFL. As the purpose of a
compression program is to make files smaller, their authors haven't felt the
need to add this facility.
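To make the size trade-off concrete, here is a toy sketch of a Hamming(7,4)
code in Python. This is a textbook code chosen purely for illustration, not
the scheme any particular tape drive uses, and the function names are made
up. It stores every 4 data bits as 7 bits, and in exchange can repair any
single flipped bit within those 7.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits (0..15) as a 7-bit codeword (list of bits)."""
    d1, d2, d3, d4 = ((nibble >> i) & 1 for i in (3, 2, 1, 0))
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]  # positions 1..7


def hamming74_decode(bits):
    """Correct up to one flipped bit and return the original 4 data bits."""
    b = list(bits)
    s1 = b[0] ^ b[2] ^ b[4] ^ b[6]
    s2 = b[1] ^ b[2] ^ b[5] ^ b[6]
    s4 = b[3] ^ b[4] ^ b[5] ^ b[6]
    error_pos = s1 + 2 * s2 + 4 * s4  # 0 means no error detected
    if error_pos:
        b[error_pos - 1] ^= 1  # repair the damaged bit
    d1, d2, d3, d4 = b[2], b[4], b[5], b[6]
    return (d1 << 3) | (d2 << 2) | (d3 << 1) | d4


# Every 4-bit value survives a single bit flip in any of the 7 positions.
for value in range(16):
    for pos in range(7):
        word = hamming74_encode(value)
        word[pos] ^= 1
        assert hamming74_decode(word) == value

print("overhead: 7 bits stored per 4 bits of data")
```

The protection costs 3 extra bits for every 4 bits of data, which is exactly
the kind of growth a compression program is trying to avoid.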
Sorry if I've burst anyone's bubble . . .
Kindest regards,
Niall O Broin