Amanda-Users

Re: amrestore problem, headers ok but no data

2005-01-07 14:14:56
Subject: Re: amrestore problem, headers ok but no data
From: Eric Siegerman <erics AT telepres DOT com>
To: amanda-users AT amanda DOT org
Date: Fri, 7 Jan 2005 14:01:26 -0500
On Fri, Jan 07, 2005 at 11:17:21AM -0500, Brian Cuttler wrote:
> Following Gene's model, I set the default block size on the tape
> devices (sgi command # mt -f /dev/rmt/tps1d4nrns devblksz 32768)
> and also switched from the variable length to the fixed length tape
> device, used amlabel to relabel the tape (not what Gene indicated).
> 
> Oddly trying to dd if=/dev/rmt/tps... read no data
> 
> samar 85# mt -f /dev/rmt/tps1d4nrns rewind
> samar 86# dd if=/dev/rmt/tps1d4nrns of=scratch
> Read error: Invalid argument
> 0+0 records in
> 0+0 records out

These two things might well be related.  That dd command, without
a "bs=" argument, is trying to read 512-byte blocks.  But the
physical blocks on the tape are 32 KB -- your adjustments have
seen to that.  It would be appropriate for the read() call to
fail in that situation, as indeed it did.  On Solaris (whose man
pages I have access to at the moment), the error status would be
ENOMEM; perhaps on your system it's EINVAL == "Invalid argument"
instead.  (The place to look that up would likely be in the man
page for the tape driver -- st(7) is where I found the Solaris
version.)
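
For illustration, the same read with an explicit block size might look like the sketch below. The helper name read_tape_blocks is made up for this example; the device path is the one from Brian's message, and 32k is assumed to match the devblksz he set.

```shell
# Hypothetical helper: copy from a device (or file) using 32 KB reads,
# so that each read() is big enough to hold one physical tape block.
read_tape_blocks() {
    src=$1
    dest=$2
    dd if="$src" of="$dest" bs=32k
}

# Intended use on the tape drive (not run here):
#   mt -f /dev/rmt/tps1d4nrns rewind
#   read_tape_blocks /dev/rmt/tps1d4nrns scratch
```

If that read succeeds where the bare dd failed, the "Invalid argument" was most likely just the block-size mismatch.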

> However, I ran amdump last night. Still having problems with TAR DLE
> though oddly I was able to see that a DUMP DLE attempted to write.

I'm lost.  "Attempted to write" what?  To tape during amdump, or
to disk during amrestore?  If the former, do you mean to say that
the tar DLE *didn't* attempt it?

> I was able to retrieve the file, using both amrestore and Eric's
> suggestion of manually issuing the dd command to get the file from
> tape. I was able to open the dump file (DLE for /usr1) and saw that
> the file "kmitra" was present. This I thought to be good news since
> the only top level file on the partition is kmitra/ (note directory
> slash). Unfortunately xfsdump reported the file as a regular file
> and not a directory and I was unable to proceed from there.

Something else you could try: "amrestore -r" one of those DLEs,
and "dd" it from tape as I described before.  Then "cmp" the two
files.  They should be identical of course.  That'll tell you
whether there are problems with amrestore.
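
A rough sketch of that comparison, with the tape-side commands left as comments since they depend on the local setup. The function name and every file name here are placeholders, not anything Amanda itself produces.

```shell
# Hypothetical helper: report whether two image files are byte-for-byte equal.
# cmp -s prints nothing and signals the result through its exit status.
compare_images() {
    if cmp -s "$1" "$2"; then
        echo "identical"
    else
        echo "different"
    fi
}

# Intended use (placeholders: TAPE, N = tape file number of the DLE,
# host, disk, and both image file names):
#   amrestore -r "$TAPE" host disk     # raw image, Amanda header included
#   mt -f "$TAPE" rewind && mt -f "$TAPE" fsf N
#   dd if="$TAPE" of=dd-image bs=32k
#   compare_images <file written by amrestore> dd-image
```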

To see whether amdump's back end (taper) is putting the data on
tape correctly, try this:
  - run amdump *with no tape in the drive*; it'll run in degraded
    mode and leave all the dumps on the holding disk
  - make copies of the dumps in holding disk
  - amflush them (the originals, that is) to tape
  - "amrestore -r" them, and/or "dd" them from tape
  - compare what came off the tape with the holding-disk copies
    you made before the amflush (use "dd" to strip off and
    discard the first 32 KB of each file, as I described
    previously, because there *will* be differences between the
    files' Amanda headers; but the remainders of the two files
    should be identical)
    
If the holding-disk files are split into multiple chunks, you'll
have to do some "dd" magic to reassemble them; don't forget to
discard the first 32 KB of *every* chunk.
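
The header-stripping and reassembly steps above can be sketched roughly as follows. Both function names and all the file names in the usage comments are hypothetical, and the sketch assumes the header occupies exactly the first 32 KB of each file, as described above.

```shell
# Print a file's contents minus its first 32 KB (the Amanda header).
strip_header() {
    dd if="$1" bs=32k skip=1 2>/dev/null
}

# Strip the header from every chunk and concatenate the payloads.
# Chunks must be passed in order; nothing here works the order out.
reassemble_chunks() {
    for chunk in "$@"; do
        strip_header "$chunk"
    done
}

# Intended use (all file names are placeholders):
#   reassemble_chunks copy.chunk0 copy.chunk1 > holding.payload
#   strip_header tape-image > tape.payload
#   cmp holding.payload tape.payload && echo "payloads match"
```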

To see if amdump's front end (dumper et al) is getting the data
onto holding disk correctly in the first place, try to restore
from the holding disk copy, which hasn't made the journey
to-and-from the tape.  (Strip off the header and, if necessary,
reassemble as described above.)
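
Before stripping the header it can be worth reading it. A minimal sketch, assuming the header is plain text padded with NUL bytes out to 32 KB (the function name is made up for this example):

```shell
# Hypothetical helper: print the text of the 32 KB Amanda header at the
# start of a holding-disk file or raw image, dropping any NUL padding.
show_header() {
    dd if="$1" bs=32k count=1 2>/dev/null | tr -d '\000'
}

# Intended use (the file name is a placeholder):
#   show_header holding-disk-copy
```

As far as I recall, Amanda records the dump program and a suggested restore pipeline in that header, which is a useful cross-check on whether a given DLE really is a tar or an xfsdump image.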

> I've tried to retrieve several of the TAR DLE but have been unsuccessful
> with either method.

Sorry, but I gotta ask:  with or without "bs=32k"?

Hmm, it seems some of my recipes above are premature.  Oh well,
I'll leave them in anyway, since they might be useful further
down the road.

It sounds to me as though it's time to:
  - send you off looking at the debug files on the clients (in
    /tmp/amanda unless you've configured them otherwise); I'm not
    sure just what you'd be looking *for*; just anything
    unusual...

  - ask you to show us the following from a run that demonstrates
    the problem:
      - the email report
      - log.YYYYMMDD and amdump.N files
      - description of the results of restore attempts for a
        couple of representative DLEs

I can't recall:  how much testing have you done on the tape
subsystem itself, without Amanda in the loop to confuse things?

--

|  | /\
|-_|/  >   Eric Siegerman, Toronto, Ont.        erics AT telepres DOT com
|  |  /
The animal that coils in a circle is the serpent; that's why so
many cults and myths of the serpent exist, because it's hard to
represent the return of the sun by the coiling of a hippopotamus.
        - Umberto Eco, "Foucault's Pendulum"