ADSM-L

AIX sites: make sure you have this AIX APAR

2005-06-27 15:13:21
Subject: AIX sites: make sure you have this AIX APAR
From: Richard Sims <rbs AT BU DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Mon, 27 Jun 2005 15:12:21 -0400

A big FYI for AIX 5.x sites...

I spent most of the weekend pursuing a problem in our TSM system
where HSM operations could not proceed. The whole AIX 5.2 system got
so screwed up in the kernel that 'ls -lR /opt' and 'ls -lR /usr'
would hang, and nfsd was accumulating hundreds of unserviceable
threads. After some reboots I was able to isolate the problem to a
"seed" area: directory /etc/adsm/SpaceMan/candidatesPool/, where an
'ls -l' on it would hang (loop, actually), which 'truss' showed to be
in a statx(). There were no errors in the AIX Error Log, the console,
or anywhere else. At Init 1, we rebuilt the directory and HSM could
proceed.
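
For anyone hunting for a similar "seed" directory, here is a minimal
sketch of one way to do it without tying up a terminal: run 'ls -l' on
each candidate with a time limit and note which ones never return. This
is not the procedure used above. The function names, the 10-second
limit, and the idea of probing a list of candidates are all assumptions
made for illustration.

```python
import subprocess

def listing_finishes(path, timeout_s=10.0, cmd=("ls", "-l")):
    """Return True if `cmd path` exits within timeout_s seconds, else False.

    A True result only means the command returned; it may still have failed.
    """
    proc = subprocess.Popen(
        [*cmd, path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        proc.wait(timeout=timeout_s)
        return True
    except subprocess.TimeoutExpired:
        # Best-effort kill; do not wait again, since a process stuck in the
        # kernel may never go away.
        proc.kill()
        return False

def find_hanging(paths, timeout_s=10.0):
    """Return the subset of paths whose listing did not finish in time."""
    return [p for p in paths if not listing_finishes(p, timeout_s)]
```

Calling find_hanging(["/opt", "/usr", "/etc/adsm/SpaceMan/candidatesPool"])
would return whichever of those paths failed to list within the limit.
On a system in the state described above, the probe processes themselves
may be left behind, so treat this as a diagnostic aid only.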

Today, finally, AIX logged errors in the Error Log:
JFS_META_CORRUPTION and JFS_FSCK_REQUIRED (on /). That gave me a
solid keyword and led me to APAR IY66404 for AIX 5.2, which is quite
new. (It has siblings for AIX 5.1 and 5.3.) Make sure you have this
HIPER APAR applied to avoid this very nasty scenario, which results
from an AIX I/O serialization defect. Without it, you could end up
with some unpleasant data loss.
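
To check whether the APAR is already on a system, AIX provides
'instfix -ik <APAR>'. The sketch below wraps that call. It assumes that
a fully installed APAR is reported with a line of the form "All filesets
for IY66404 were found." and treats any other message as "not
installed"; that message format and the helper names are assumptions, so
verify against the output on your own system. IY66404 applies to AIX
5.2 only; the sibling APAR numbers for 5.1 and 5.3 are not given here.

```python
import subprocess

def apar_reported_installed(instfix_output):
    """Return True if the text looks like instfix's 'all filesets found' message.

    Assumption: a fully installed APAR produces a line beginning
    'All filesets for <APAR>' and containing 'were found'; other messages
    (for example 'Not all filesets ...') are treated as not installed.
    """
    for line in instfix_output.splitlines():
        text = line.strip()
        if text.startswith("All filesets for") and "were found" in text:
            return True
    return False

def check_apar(apar="IY66404"):
    """Run 'instfix -ik <apar>'; return True/False, or None if instfix is absent."""
    try:
        result = subprocess.run(
            ["instfix", "-ik", apar],
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        return None  # not an AIX system, or instfix is not on PATH
    return apar_reported_installed(result.stdout + result.stderr)
```

check_apar() returns True only when the expected message appears, so an
unrecognized message errs on the side of reporting the fix as missing.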

   Richard Sims
