ADSM-L

Re: Difference between move data and backup stgpool

2004-04-07 21:32:29
Subject: Re: Difference between move data and backup stgpool
From: "Tantlevskiy,Sergey,GLENDALE,GLOBE Center AMS" <Sergey.Tantlevskiy AT US.NESTLE DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Wed, 7 Apr 2004 18:31:24 -0700
Andy,

BACKUP STGP does not check the headers inside an 'object' on a tape, so if your
file is damaged somewhere in the middle, but both the start and the end of the
file are good, then BACKUP STGPOOL will not complain, and you will never know
that both the primary and copy versions are bad until you run AUDIT VOLUME.

At least, this is how it worked in all versions prior to 5.2.x. Maybe
they do check it now. BUT in this case, the only thing that matters is which
server version was running when BACKUP STGP was done.
I am not sure about MOVE DATA. You probably ran it with reconstruct=no
(the default), so my guess is that MOVE DATA processing was enhanced in 5.2.x
and that it now checks every header on its way from the start to the end of
the file. As I understand it, a header is written at the beginning of every
256KB block.
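
To make the guess above concrete, here is a minimal Python sketch of the
difference between checking only the first and last block of an object and
checking every block header. It is not TSM's actual on-tape format: the
8-byte header layout (magic plus an incrementing sequence number), the
function names, and the "ends only" check are all assumptions made for
illustration. Only the 256KB block size and the expected magic 53454652
come from this thread.

```python
import struct

BLOCK_SIZE = 256 * 1024        # block size as described in the message above
MAGIC = 0x53454652             # "expected magic" from the ANR1331E messages
HEADER = struct.Struct(">II")  # hypothetical layout: magic, sequence number


def build_object(n_blocks):
    """Build a synthetic object: every block is a header plus zero padding."""
    blocks = []
    for seq in range(1, n_blocks + 1):
        header = HEADER.pack(MAGIC, seq)
        blocks.append(header + bytes(BLOCK_SIZE - HEADER.size))
    return b"".join(blocks)


def corrupt_block(data, index):
    """Return a copy of data with the header of one block overwritten."""
    buf = bytearray(data)
    offset = index * BLOCK_SIZE
    buf[offset:offset + HEADER.size] = b"\x28\x7f\x39\xd0\xff\xff\xff\xff"
    return bytes(buf)


def bad_blocks(data, only_ends=False):
    """Return the indexes of blocks whose header fails the check.

    only_ends=True models the guessed pre-5.2 behaviour (look at the first
    and last block only); the default walks every block header.
    """
    n_blocks = len(data) // BLOCK_SIZE
    indexes = list(range(n_blocks))
    if only_ends and n_blocks:
        indexes = sorted({0, n_blocks - 1})
    bad = []
    for i in indexes:
        magic, seq = HEADER.unpack_from(data, i * BLOCK_SIZE)
        if magic != MAGIC or seq != i + 1:
            bad.append(i)
    return bad
```

With a four-block object whose third block is damaged, the ends-only check
reports nothing, while the full walk flags the damaged block. That is the
kind of gap that would let a bad object be copied without complaint.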

It would be interesting to know what IBM will tell you about this.

Sergey


-----Original Message-----
From: Andy Carlson [mailto:andyc AT ANDYC.CARENET DOT ORG]
Sent: Wednesday, April 07, 2004 04:13 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Difference between move data and backup stgpool


I have run into a sticky situation, and am wondering if I am on the
right track, and how I can figure out how pervasive this error is.

We recently installed 10 IBM 3592 tape drives to use as our new
copypool.  My choice was to copy the new data from disk to 3592, and
after some time copy the onsite tapes (STK9840) to the new 3592 drives.
I did this.  Now today, I had an error on one of the 9840 tapes.  The
error snippet is:

ANR1330E The server has detected possible corruption in an object being
restored or moved. The actual values for the incorrect frame are: magic
287F39D0 hdr version 33927 hdr length  1157 sequence number -69763148 data
length 4833CA95 server id 855943307 segment id ,((,0/((/(.0+,/-(*, crc 8CC3C886.
ANR1331E Invalid frame detected.  Expected magic 53454652 sequence
number 00000001 server id 00000000 segment id 0000000041896783.
ANR1165E Error detected for file in storage pool BACKCART: Node
SUPPLY-CONTRACTS, Type Backup, File space /u01, fsId 11, File name
/app/oracle/product/8.1.7/bin/ oracle.
ANR0548W Retrieve or restore failed for session 150690 for node
SUPPLY-CONTRACTS (AIX) processing file space /u01 11 for file
/app/oracle/product/8.1.7/bin/ oracle
stored as Backup - data integrity error detected.

So, I did a move data, which got all but the 2 files off the 100415
tape.  I did a restore volume 100415 preview=yes, which showed that tape
110173 had to be brought back from the vault to restore those two files.
Well, much to my surprise, I got the same error from the offsite volume:

ANR1330E The server has detected possible corruption in an object being
restored or moved. The actual values for the incorrect frame are: magic
287F39D0 hdr version 33927 hdr length  1157 sequence number -69763148 data
length 4833CA95 server id 855943307 segment id ,((,0/((/(.0+,/-(*, crc
8CC3C886.
ANR1331E Invalid frame detected.  Expected magic 53454652 sequence number
00000001 server id 00000000 segment id 0000000041896783.
ANR1330E The server has detected possible corruption in an object being
restored or moved. The actual values for the incorrect frame are: magic
24284321 hdr version 16775 hdr length  2833 sequence number 1184106498 data
length E20C8431 server id 683837521 segment id ('*',0(()..+-,/((*( crc
01E3C68D.
ANR1331E Invalid frame detected.  Expected magic 53454652 sequence number
00000001 server id 00000000 segment id 0000000041897176.
ANR1235I Restore process 1795 ended for volumes in storage pool BACKCART.

So, now that you have read all through this - is there some basic
difference between move data/restore and backup stgpool?  It seems like
the bad data on 100415 was copied to 110173 when I did backup stgpool.
Is there any way to find out how pervasive this problem might be?
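
One rough way to size the problem from messages already logged is to tally
the ANR1165E lines. The Python sketch below assumes the server messages have
been saved to a text file (for example, from activity log output) and that
they look like the excerpts in this message. The function names are
hypothetical. It only counts errors the server has already reported, so it
cannot find damage that no operation has read yet.

```python
import re
from collections import Counter

# Matches the ANR1165E message as it appears in the log excerpts above.
ANR1165E = re.compile(
    r"ANR1165E Error detected for file in storage pool (?P<pool>\S+): "
    r"Node (?P<node>[^,]+), Type (?P<type>[^,]+), "
    r"File space (?P<filespace>[^,]+), fsId (?P<fsid>\d+), "
    r"File name (?P<name>.+?)\.(?=\s*(?:ANR\d{4}[A-Z]|$))"
)


def damaged_files(log_text):
    """Return one dict per ANR1165E message found in log_text."""
    # Long messages are wrapped across lines, so collapse whitespace first.
    flat = " ".join(log_text.split())
    return [m.groupdict() for m in ANR1165E.finditer(flat)]


def count_by_node(log_text):
    """Count reported damaged files per node."""
    return Counter(f["node"] for f in damaged_files(log_text))
```

Run against the first snippet above, this yields a single entry for node
SUPPLY-CONTRACTS in storage pool BACKCART.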

The particulars, if they matter:
AIX 5.1ML4
TSM Server 5.2.1.1
100415 is an STK9840 tape cartridge
110173 is an IBM3592 tape cartridge
Client is an AIX box, with a slightly backlevel v4 client on it.

Thanks for any input.  I am going to call support tomorrow, but thought
I would poll the experts first.

Andy Carlson
