Re: Multi-Gb dumps using tar + software compression (gzip)?

2004-10-20 15:22:48
Subject: Re: Multi-Gb dumps using tar + software compression (gzip)?
From: Eric Siegerman <erics AT telepres DOT com>
To: Amanda Mailing List <amanda-users AT amanda DOT org>
Date: Wed, 20 Oct 2004 12:52:12 -0400
On Wed, Oct 20, 2004 at 01:18:45PM +0200, Toralf Lund wrote:
> Other possible error sources that I think I have eliminated:
> [ 0. gzip ]  
>   1. tar version issues [...]
>   2. Network transfer issues [...]
>   3. Problems with a specific amanda version [...]
>   4. Problems with a special disk [...]

Of course it might well be hardware, as Paul suggested; but in
case it isn't, have you tried removing some of these pieces
from the pipeline entirely, e.g.:
  - create a multi-GB file on the client, gzip it, and see if it
    gunzips OK

  - then ftp the .gz to the server and see if it gunzips OK
    there too

  - then ftp the uncompressed version to the server, and both
    gzip and gunzip it there

  - or use netcat instead of ftp so that you can put the various
    gzips and gunzips in a pipeline with the network transfer,
    thus more closely mimicking what Amanda does.  (Of course
    this won't make any difference -- but the whole point is to
    question assumptions like the one that begins this sentence!)

  - run gtar manually with the same options as Amanda would run
    it with, and see if you can untar the results

  - write a gtar wrapper that computes the MD5 of the tarball on
    its way through -- something like this (untested) script, the
    interesting parts of which are the use of tee(1) and a FIFO:
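        # FIFO$$ and sum$$ are per-process scratch names
        mkfifo /securedirectory/FIFO$$

        # record the arguments first, then let md5sum append the checksum
        echo "$*" >/securedirectory/sum$$
        md5sum </securedirectory/FIFO$$ >>/securedirectory/sum$$ &

        # tee copies the tarball into the FIFO and passes it on to stdout
        real-gtar ${1+"$@"} | tee /securedirectory/FIFO$$
        wait    # let md5sum finish writing before cleaning up
        # caveat: real-gtar's exit status is lost (the script returns rm's)
        rm /securedirectory/FIFO$$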
        
    Run Amanda with that wrapper installed on the client in place
    of the real gtar, with compression turned *off* for the DLE
    in question; then compare the MD5 of the tarball on tape with
    that computed by the tar wrapper.  (As someone (Paul?)
    alluded to, compression tends to make small errors noticeable
    because it magnifies them; this is a more dependable way to
    catch them, while removing one of the prime suspects -- the
    compression itself -- from the loop.)

  - etc, etc, etc.
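
For the first of those tests (gzip a file, then check that it
gunzips OK), here is a rough sketch of what I have in mind. It
compares an MD5 taken before compression with one taken after
decompression, rather than just trusting gunzip's exit status.
SIZE_MB, WORKDIR and roundtrip_check are names made up for this
example, and the small default size is only there to make a quick
trial cheap; set SIZE_MB to a few thousand to get into multi-GB
territory.

```shell
#!/bin/sh
# Round-trip check: create a file, gzip it, gunzip it, compare checksums.
# SIZE_MB, WORKDIR and roundtrip_check are illustrative names only;
# nothing here is defined by Amanda.
SIZE_MB=${SIZE_MB:-4}
WORKDIR=${WORKDIR:-$(mktemp -d)}

roundtrip_check() {
    src=$WORKDIR/testfile

    # Random data, so gzip cannot shrink the file to almost nothing.
    dd if=/dev/urandom of="$src" bs=1048576 count="$SIZE_MB" 2>/dev/null || return 2

    # Checksum of the original, read from stdin so the output format
    # matches the pipeline form used below.
    before=$(md5sum <"$src")

    gzip -c "$src" >"$src.gz" || return 2

    # gzip -t verifies the compressed stream against its own CRC.
    gzip -t "$src.gz" || return 1

    # Checksum of the decompressed stream.
    after=$(gunzip -c "$src.gz" | md5sum)

    [ "$before" = "$after" ]
}

if roundtrip_check; then
    RESULT=ok
else
    RESULT=MISMATCH
fi
echo "gzip round trip: $RESULT"
```

The scratch files are left behind in WORKDIR on purpose, so the
same testfile.gz can then be copied to the server and the
"gunzip -c ... | md5sum" step repeated there for the second test.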

--

|  | /\
|-_|/  >   Eric Siegerman, Toronto, Ont.        erics AT telepres DOT com
|  |  /
The animal that coils in a circle is the serpent; that's why so
many cults and myths of the serpent exist, because it's hard to
represent the return of the sun by the coiling of a hippopotamus.
        - Umberto Eco, "Foucault's Pendulum"