On Fri March 28 2003 12:46, Mike Simpson wrote:
>Hi --
>
>> Any tips or tricks or other thoughts? Is this the Linux
>> dump/restore problem I've seen talked about on the mailing list?
>> I don't understand how the gzip file could be corrupted by a
>> problem internal to the dump/restore cycle.
>
>Answering my own question after a week of testing ... I think I've
>discovered a bug in Amanda 2.4.4. This is what I've deciphered:
>
>(1) Restores of backup sets that compressed to < 1 GB worked fine.
>    Backup sets that, when compressed, were > 1 GB blew up every
>    time with gzip corruption error messages. This was consistent
>    across OSes (Solaris 8, RedHat 7.x), filesystem types (ufs, vxfs,
>    ext2/3), and backup modes (DUMP, GNUTAR).
>
>(2) The gzip corruption message always occurred at the same spot,
> i.e.
>
> gzip: stdin: invalid compressed data--format violated
> Error 32 (Broken pipe) offset 1073741824+131072, wrote 0
>
> which is 1024^3 bytes + 128k. I note that in my Amanda
> configuration, I had "chunksize" defined to "1 gbyte" and
> "blocksize" set to "128 kbytes" (the chunksize was just for
> convenience, the blocksize seems to maximize my write
> performance).
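That arithmetic checks out: the offset gzip reported is exactly the configured chunksize plus one blocksize. A quick sanity check, using the values quoted from the config above:

```python
# Offset reported by gzip: 1073741824+131072.
# Compare it against the Amanda settings described above.
chunksize = 1024 ** 3    # "chunksize 1 gbyte"    -> bytes
blocksize = 128 * 1024   # "blocksize 128 kbytes" -> bytes

print(chunksize)              # first number in the gzip error
print(blocksize)              # second number in the gzip error
print(chunksize + blocksize)  # byte at which the restore pipe broke
```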
>
>(3) I used "dd" to retrieve one of the compressed images that was
>    failing. At the 1 GB mark in the file, the more-or-less
>    random bytes of the compressed stream were interrupted by exactly
>    32 KB of zeroed bytes. I note that 32 KB is Amanda's default
>    blocksize.
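For anyone wanting to reproduce the check in (3) without eyeballing a hex dump, here's a small sketch. The function is hypothetical (not part of Amanda); point it at whatever file you pulled back with dd, with the offset and length matching the chunk boundary and the 32 KB default blocksize described above:

```python
def zero_run_at(path, offset=1024 ** 3, length=32 * 1024):
    """Return True if `length` bytes at `offset` in `path` are all zero."""
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read(length)
    # Short read (file ends early) or any nonzero byte means no zero run.
    return len(data) == length and data == b"\x00" * length
```

On a corrupted image as described above, this should come back True at the 1 GB mark and False elsewhere in the compressed stream.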
>
>(4) For last night's backups, I set "chunksize" to an arbitrarily
> high number, to prevent chunking, which works fine in my setup
> because I use one very large ext3 partition for all of my
> Amanda holding disk, which nullifies concerns about filesystem
> size and max file size. The restores I've done this morning have
> all worked fine, including the ones that had previously shown the
> corruption.
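For reference, the workaround in (4) amounts to something along these lines in amanda.conf. The holding-disk path is of course site-specific, and the exact syntax should be double-checked against the 2.4.4 docs:

```
holdingdisk hd1 {
    directory "/amanda/holding"   # one large ext3 partition
    chunksize 100 gbyte           # arbitrarily high: effectively no chunking
}
```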
Well, after making a blithering idiot out of myself with the last two
replies (I've been doing too much work in hex lately), this does
sound as if you have nailed it. I've no idea how big your tapes
are, but if they handled a huge chunksize OK, then a retry at 2
GB might be in order to confirm this, or maybe even half a gig,
which should give a confirming result pretty quickly.
I don't recall running into that here, as the huge majority of my
stuff is broken into subdirs that rarely exceed 800 megs. I also
didn't use an even chunk size; mine is set nominally to 1/4 of a
DDS2 tape, or 900-something megs.
Interesting. Sounds like Jean-Louis or JRJ might want to look into
this one. Like you, I know just enough C to be dangerous; I'd
druther code in assembly, on a smaller machine...
Are you using the last snapshot from Jean-Louis's site at umontreal?
If not, maybe this has already been fixed. The latest one is dated
20030318. (Or was an hour ago :) I just checked the ChangeLog,
but didn't spot any references to something like this from now back
to about the middle of November last.
>I'm not enough of a C coder to come up with a real patch to fix
> this. I'm hoping the above gives enough clues to let someone who
> _is_ a real C coder do so.
>
>If this should be posted to the amanda-hackers list, please feel
> free to do so, or let me know and I'll do it. Also, if any other
> information would be helpful, just ask.
>
>Thanks,
>
>-mgs
--
Cheers, Gene
AMD K6-III@500mhz 320M
Athlon1600XP@1400mhz 512M
99.25% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attornies please note, additions to this message
by Gene Heskett are:
Copyright 2003 by Maurice Eugene Heskett, all rights reserved.