Hi Eric,
On Fri, 2005-02-18 at 16:30, Eric Siegerman wrote:
> On Fri, Feb 18, 2005 at 11:36:46AM +0000, Thomas Charles Robinson wrote:
> > [an excellently clear, concise, and complete [1] problem report
> > -- thank you! -- which included the following:]
> >
> > gzip: stdin: invalid compressed data--crc error
>
> All of tar's varied complaints appear to stem from corrupt input,
> which in turn is adequately explained by this message.
>
> Thus, either gzip or hardware looks like the culprit. RAM is a
> good place to look, especially considering that the data being
> backed up all resides on the Amanda server; you're giving that
> box quite a workout. The disk and its bus (SCSI, IDE, etc.) are
> possibilities too, but less likely IMO -- I'd expect the kernel
> to detect and report the I/O errors in that case.
>
This sounds feasible, and I'm going to investigate the disks etc. for I/O
errors now. I've actually been suspecting a memory problem; I don't think
the RAM has built-in error correction, so that could be contributing. One
thing to note: I have since successfully verified a dump file that
previously gave errors, which suggests the problem is intermittent.
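As a quick hardware sanity check (just a sketch, nothing Amanda-specific:
it generates its own scratch file, so substitute a real holding-disk image
in practice), reading the same file repeatedly and comparing checksums can
expose flaky RAM or a flaky disk path:

```shell
# Read the same data several times; on healthy hardware every read
# produces the same checksum. (Scratch file used here for illustration.)
FILE=$(mktemp)
head -c 1048576 /dev/urandom > "$FILE"
DISTINCT=$(for i in 1 2 3 4 5; do md5sum < "$FILE"; done | sort -u | wc -l)
echo "distinct checksums: $DISTINCT"   # more than 1 means unstable reads
rm -f "$FILE"
```

If more than one distinct checksum turns up, the corruption is happening
on the read path, before gzip or tar ever see the data.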
> Not to completely rule out problems with Amanda itself -- I've
> learned never to rule *anything* out where computers are
> concerned (or humans for that matter :-/) -- but it seems
> unlikely.
>
> As for gtar, 1.13.25 is well regarded on this list. 'Nuff said,
> until its input is known to be good. (After all, even if,
> hypothetically, tar were producing complete junk, gzip should be
> able to compress and decompress that junk without reporting CRC
> errors :-)
>
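True. A quick way to convince oneself of that point (a sketch only; it
uses a scratch file of random bytes, not a real dump):

```shell
# gzip should round-trip even arbitrary junk without a CRC error;
# a CRC failure therefore points at gzip itself or the hardware.
JUNK=$(mktemp)
head -c 1048576 /dev/urandom > "$JUNK"
gzip -c "$JUNK" | gzip -dc | cmp - "$JUNK" && echo "round trip OK"
rm -f "$JUNK"
```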
> > gzip-1.3.3-9
>
> ... is a beta. It might be worthwhile to try the latest released
> version, 1.2.4. From the web page, it looks as though that
> version can't handle files over 2 GB, so you'll have to split up
> any larger DLEs. Or just disable them for the duration of the
> test -- no loss; it's not as if you have usable backups of them
> now :-(
Beta! Ah, I didn't notice that! Has anyone else been using this version
successfully? I was under the (perhaps naive) impression that gzip 1.3
was acceptable for Amanda.
>
> Another useful test would be to temporarily disable software
> compression completely. That should fairly quickly tell you
> whether the corruption is occurring during gzipping (whether gzip
> itself or hardware is the ultimate source of the problem).
>
I'll test this out and post when I have some results.
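For reference, this is roughly the test I have in mind (a sketch only; the
dumptype name is made up): point the affected DLEs at a dumptype with
software compression disabled, so gzip is out of the loop entirely:

```
# amanda.conf sketch -- "nocomp-test" is a name I've invented
define dumptype nocomp-test {
    global
    compress none
}
```

If the dumps verify cleanly with `compress none`, the corruption is being
introduced during gzipping; if they still fail, gzip is off the hook.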
> > Lastly, I am currently using an nfs share for the holding disk but this
> > was NOT being used previously and I was still getting the corruption
> > mentioned.
>
> Hmm, did you ever run with local holding disk, while explicitly
> testing holding-disk files as you're doing now? I.e. was there
> ever a point where neither NFS nor the tape drive was in the
> loop? I'm wondering about the possibility that two independent
> sources of data corruption -- NFS and the tape subsystem -- might
> be confounding your attempts to isolate "the" problem.
I was running all the manual checks before I started using the NFS
volume, so there was a period with neither NFS nor the tape drive in the
loop. Although NFS may still be a factor, I'm prepared to keep using it
at this stage, until I've looked at archives without compression and
ruled out I/O and disk issues.
Regards,
Tom.
>
> --
>
> | | /\
> |-_|/ > Eric Siegerman, Toronto, Ont. erics AT telepres DOT com
> | | /
> The animal that coils in a circle is the serpent; that's why so
> many cults and myths of the serpent exist, because it's hard to
> represent the return of the sun by the coiling of a hippopotamus.
> - Umberto Eco, "Foucault's Pendulum"
Art is a lie which makes us realize the truth. -- Picasso