I have good news: my problem is fixed. After having the issues with ADSM/TSM
restores, I tried to work around them by ftp'ing the data to another machine.
After the ftp, the checksums on the files did not match. I tried rcp, and
again the checksums did not match.
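
For anyone wanting to repeat this kind of check, here is a minimal sketch of
comparing file checksums. It is not the tool used above (the post does not say
which checksum utility was run); the function names and the choice of SHA-256
are assumptions made for illustration. Run file_digest on each machine and
compare the hex strings, or use same_content when both copies are reachable
from one host.

```python
import hashlib


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def same_content(path_a, path_b, algorithm="sha256"):
    """True if both files hash to the same digest."""
    return file_digest(path_a, algorithm) == file_digest(path_b, algorithm)
```

A mismatch after a plain ftp or rcp copy, as seen here, points at the transfer
path rather than at TSM itself.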
Armed with that information, I contacted IBM, who traced the problem to the
drivers for the gigabit ethernet cards. The cards in question are 10/100/1000
Base-TX PCI-X Adapters (14106902), and the driver at fault was
devices.pci.14106902.rte at the 5.1.0.0 level. IBM cited two APARs, IY35540 at
the 5.1.0.1 level and IY38866 at the 5.1.0.3 level. IBM specifically said the
5.1.0.3 level was needed to correct the problem.
As this was a production system, and rebooting to pick up the driver and kernel
changes was not a good option, the workaround was to set two attributes on the
card to "no": chksum_offload and large_send.
Turning off these two attributes made ftp, rcp and ADSM/TSM work correctly
again.
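
As a rough illustration of the workaround, the sketch below only builds and
prints the AIX chdev command line for the two attributes. The adapter name
ent0 and the helper name are hypothetical, and the post does not say exactly
how the attributes were applied or whether the interface had to be quiesced
first, so treat this as a starting point rather than the procedure IBM gave.

```python
def build_chdev_command(adapter, attributes):
    """Build an argv list for AIX `chdev -l <adapter> -a name=value ...`."""
    cmd = ["chdev", "-l", adapter]
    for name, value in attributes.items():
        cmd += ["-a", f"{name}={value}"]
    return cmd


# The two attributes named in the workaround above.
WORKAROUND = {"chksum_offload": "no", "large_send": "no"}

if __name__ == "__main__":
    # "ent0" is an assumed adapter name; substitute the real device.
    print(" ".join(build_chdev_command("ent0", WORKAROUND)))
    # prints: chdev -l ent0 -a chksum_offload=no -a large_send=no
```

The current values can be checked beforehand with lsattr -El on the same
adapter.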
Just wanted to pass along the details in case anyone ever searches for a
similar problem.
-Tom
>>Tom Melton sent on 01/15/04 at 15:58
Details (I know - backlevel) -
TSM client 4.2.3 32bit
Client OS AIX 5.1 5100-03
TSM server 4.1.4
Server OS AIX 4.3.2
Tape subsystem - SCSI attached 3590 256 track
An Oracle database file was archived last night, and according to the log, it
was archived fine. When I attempt to retrieve the file today, the TSM client
waits for the tape mount, the tape gets mounted, and then the retrieve
progresses to a point and stops. The session then times out due to commtimeout
at 1200 seconds.
The dsmerror.log is full of the following:
01/15/04 11:55:37 The 103068111th code was found to be out of sequence.
The code (3432) was greater than (2259), the next available slot in the string
table.
01/15/04 11:55:37 The 103068112th code was found to be out of sequence.
The code (3914) was greater than (2260), the next available slot in the string
table.
01/15/04 11:55:37 The 103068114th code was found to be out of sequence.
The code (2472) was greater than (2262), the next available slot in the string
table.
01/15/04 11:55:37 The 103068118th code was found to be out of sequence.
The code (3407) was greater than (2266), the next available slot in the string
table.
01/15/04 11:55:37 The 103068120th code was found to be out of sequence.
The code (3463) was greater than (2268), the next available slot in the string
table.
01/15/04 11:55:37 The 103068123th code was found to be out of sequence.
The code (2906) was greater than (2271), the next available slot in the string
table.
I have retrieved other files off the tape. The file that cannot be retrieved
was compressed at the client. The file that was retrieved successfully was NOT
compressed at the client (grew during archive - then retried).
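
The wording of the log messages (numbered codes, a string table, a "next
available slot") is consistent with an LZW-style decompressor rejecting its
input. That reading is an assumption, not something confirmed in this thread.
Under that assumption, the minimal decoder sketched below shows how such a
check arises: a valid code can never exceed the next free table slot, so a
larger value means the compressed stream was damaged. The function name and
the 256-entry initial table are illustrative, not TSM internals.

```python
def lzw_decode(codes):
    """Decode a list of LZW codes (256 single-byte entries, no reset codes)."""
    table = {i: bytes([i]) for i in range(256)}
    next_slot = 256
    out = bytearray()
    prev = None
    for position, code in enumerate(codes, start=1):
        # The first code must be a literal; later codes may equal next_slot.
        limit = next_slot if prev is not None else next_slot - 1
        if code > limit:
            raise ValueError(
                f"code number {position} is out of sequence: the code "
                f"({code}) was greater than ({next_slot}), the next "
                f"available slot in the string table"
            )
        if code in table:
            entry = table[code]
        else:
            # code == next_slot: the entry is being defined by this very step.
            entry = prev + prev[:1]
        out += entry
        if prev is not None:
            table[next_slot] = prev + entry[:1]
            next_slot += 1
        prev = entry
    return bytes(out)
```

Here lzw_decode([65, 66, 256, 258]) yields b"ABABABA", while replacing the
third code with 3432 raises the error, because only slots up to 257 can exist
at that point. It would also fit the observation that only the
client-compressed file fails to retrieve.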
Any idea what this actually means? I searched the archives at adsm.org, and
saw several other posts, but no real explanation. Any help would be
appreciated.
Thanks..
-Tom