ADSM-L

Re: Restore Error - Help!

2004-01-19 09:55:24
Subject: Re: Restore Error - Help!
From: Tom Melton <Tom_Melton AT EMORYHEALTHCARE DOT ORG>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Mon, 19 Jan 2004 09:53:48 -0500
Good news: my problem is fixed.  After the trouble with ADSM/TSM restores, I 
tried to work around it by ftp'ing the data to another machine.  After the 
ftp, the checksums on the files did not match.  I tried rcp and, again, the 
checksums did not match.
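
For anyone who wants to repeat the checksum test, here is a minimal sketch in 
Python of the kind of comparison described above.  The function names, the 
choice of SHA-256, and the chunk size are my own assumptions for illustration; 
the original test may well have used a different tool such as cksum or sum.  
In practice you would run file_digest on each machine and compare the two 
strings by hand; copies_match only covers the case where both files are 
visible from one host.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read until read() returns an empty bytes object (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(source_path, copy_path, algorithm="sha256"):
    """Return True when both files produce the same digest."""
    return file_digest(source_path, algorithm) == file_digest(copy_path, algorithm)
```

A mismatch after a plain ftp or rcp copy points away from the backup software 
and toward the transport underneath it, which is what narrowed things down here.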

So, armed with that information, I contacted IBM, and the problem was with the 
drivers for the gigabit ethernet cards.  The cards in question are 10/100/1000 
Base-TX PCI-X Adapter (14106902) and the driver at fault was 
devices.pci.14106902.rte at the 5.1.0.0 level.  IBM cited two APARs, IY35540 at 
the 5.1.0.1 level and IY38866 at the 5.1.0.3 level.  IBM specifically said the 
5.1.0.3 level was needed to correct the problem.

As this was a production system, and rebooting to pick up the driver and kernel 
changes was not a good option, the workaround was to set two attributes on the 
card to "no": chksum_offload and large_send.

Turning off these two attributes made ftp, rcp, and ADSM/TSM work correctly 
again.
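
As a rough sketch of how the attribute change might be scripted, the Python 
below builds the AIX lsattr and chdev command lines for the two attributes.  
The device name ent0 is an assumption (substitute your own adapter), the 
helper names are hypothetical, and this is not necessarily how the change was 
made on the system described here.  Note also that chdev may refuse to change 
an adapter whose interface is in use, so treat this as a starting point only.

```python
import subprocess

# Attribute values taken from the workaround described above.
WORKAROUND_ATTRIBUTES = {"chksum_offload": "no", "large_send": "no"}


def build_lsattr_command(device):
    """Command that lists the effective attributes of an AIX device."""
    return ["lsattr", "-E", "-l", device]


def build_chdev_command(device, attributes=None):
    """Command that sets each attribute on the device with chdev -a name=value."""
    if attributes is None:
        attributes = WORKAROUND_ATTRIBUTES
    command = ["chdev", "-l", device]
    for name, value in attributes.items():
        command += ["-a", f"{name}={value}"]
    return command


def apply_workaround(device, dry_run=True):
    """Print the chdev command, and run it only when dry_run is False."""
    command = build_chdev_command(device)
    print(" ".join(command))
    if not dry_run:
        subprocess.run(command, check=True)
    return command
```

Calling apply_workaround("ent0") with the default dry_run=True only prints the 
command, so it can be reviewed before anything is changed on the adapter.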

Just wanted to pass along the details in case anyone ever searches for a 
similar problem.

-Tom 



>>Tom Melton sent on 01/15/04 at 15:58

Details (I know - backlevel) -

TSM client 4.2.3 32bit
Client OS AIX 5.1  5100-03

TSM server 4.1.4
Server OS AIX 4.3.2

Tape subsystem - SCSI attached 3590 256 track

An Oracle database file was archived last night and, according to the log, it 
was archived fine.  When I attempt to retrieve the file today, the TSM client 
waits for the tape mount, the tape gets mounted, and then the retrieve 
progresses to a point and stops.  The session then times out due to 
commtimeout at 1200 seconds.

The dsmerror.log is full of the following:

01/15/04   11:55:37 The 103068111th code was found to be out of sequence.
The code (3432) was greater than (2259), the next available slot in the string 
table.
01/15/04   11:55:37 The 103068112th code was found to be out of sequence.
The code (3914) was greater than (2260), the next available slot in the string 
table.
01/15/04   11:55:37 The 103068114th code was found to be out of sequence.
The code (2472) was greater than (2262), the next available slot in the string 
table.
01/15/04   11:55:37 The 103068118th code was found to be out of sequence.
The code (3407) was greater than (2266), the next available slot in the string 
table.
01/15/04   11:55:37 The 103068120th code was found to be out of sequence.
The code (3463) was greater than (2268), the next available slot in the string 
table.
01/15/04   11:55:37 The 103068123th code was found to be out of sequence.
The code (2906) was greater than (2271), the next available slot in the string 
table.

I have retrieved other files off the tape.  The file that cannot be retrieved 
was compressed at the client.  The file that was retrieved successfully was NOT 
compressed at the client (grew during archive - then retried).

Any idea what this actually means?  I searched the archives at adsm.org, and 
saw several other posts, but no real explanation.  Any help would be 
appreciated.  

Thanks..

-Tom
