On Fri, Feb 18, 2005 at 04:49:32PM +0000, Thomas Charles Robinson wrote:
> An interesting point is that after a second run of my test 'some' of the
> dump-files verified as good. This indicates an intermittent problem.
> Would bad memory give this type of behaviour?
Oh yeah! That sure smells like a hardware problem of some
sort...
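
One way to put a number on "intermittent" is to verify the same unchanged file several times and count the outcomes. The following is only a sketch: verify_repeatedly is a made-up helper name, and it assumes a plain gzip file (I believe Amanda's own dump images carry a header that would have to be stripped before gzip -t can read them). Repeated reads may also be served from the page cache rather than the disk, so a clean run proves little.

```shell
#!/bin/sh
# Hypothetical helper: run "gzip -t" on the same file several times
# and report how many passes succeeded and how many failed.
verify_repeatedly() {
    file=$1
    runs=$2
    ok=0
    bad=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # gzip -t only tests integrity; it writes no output file.
        if gzip -t "$file" 2>/dev/null; then
            ok=$((ok + 1))
        else
            bad=$((bad + 1))
        fi
        i=$((i + 1))
    done
    echo "$ok ok, $bad bad"
}

# Usage (the path is a placeholder):
#   verify_repeatedly /path/to/some-dump.gz 10
```

A mix of "ok" and "bad" on a file that nobody has modified in between points at the hardware rather than at the file's contents.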
BTW, I might have been wrong earlier about one thing, and
misleading about another:
- I said that the kernel would detect SCSI- or IDE-bus errors;
on second thought, I'm not so sure. It depends on the bus
and its age. Any semi-recent SCSI revision has parity
checking; though I know a lot less about IDE, I believe that
semi-recent versions of that do CRC checking. But old IDE's
don't have any bus-error detection mechanism at all, and in
truly ancient SCSI's it's optional. If a bus doesn't have
error detection, errors might well manifest as data
corruption instead of as kernel log messages :-/
- If you do indeed have a hardware problem, removing gzip from
the loop *might* remove just enough load from the machine to
stop the hardware from malfunctioning; so if the problem goes
away when you disable software compression, that *suggests* a
gzip problem, but doesn't *confirm* it. Of course you could
always run a few independent, long-running gzip's at the same time
as amdump to restore the system load -- you know, something
like:
gzip </dev/sda >/dev/null &
as many times as Amanda now runs simultaneous gzip's.
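
The same idea as a small loop, in case typing the command N times gets tedious. This is a sketch only: start_gzip_load and GZIP_LOAD_PIDS are names invented here, and the count and source device are assumptions you would adjust to match your own Amanda configuration. The source is only ever read, never written.

```shell
#!/bin/sh
# Hypothetical helper: start N background gzip processes that read
# from SRC and discard their output, purely to generate load.
# The PIDs are left in GZIP_LOAD_PIDS so they can be stopped later.
start_gzip_load() {
    n=$1
    src=$2
    GZIP_LOAD_PIDS=""
    i=0
    while [ "$i" -lt "$n" ]; do
        gzip <"$src" >/dev/null &
        GZIP_LOAD_PIDS="$GZIP_LOAD_PIDS $!"
        i=$((i + 1))
    done
}

# Usage (count and device are examples; reading a raw device
# usually needs root):
#   start_gzip_load 4 /dev/sda
#   ... run amdump with software compression disabled ...
#   kill $GZIP_LOAD_PIDS
```

If the corruption comes back under this artificial load with Amanda's own compression turned off, that argues for hardware and against gzip.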
--
| | /\
|-_|/ > Eric Siegerman, Toronto, Ont. erics AT telepres DOT com
| | /
The animal that coils in a circle is the serpent; that's why so
many cults and myths of the serpent exist, because it's hard to
represent the return of the sun by the coiling of a hippopotamus.
- Umberto Eco, "Foucault's Pendulum"