If you are an HSM user with a server below level 2.1.0.13, beware of
upgrading your server to higher levels: a severe memory leak defect has
been present in levels 13-15 and has yet to be fixed. I had not seen any
warning of this on ADSM-L, and encountered it after upgrading
maintenance to level 15, hoping (in vain) to resolve longstanding HSM
operational problems. The symptoms are these server messages:
ANR9999D pkshmem.c(753): Warning: attempt to free out of
range aligned block: Heap=3: Address=70100000,
Low=60000000, High=6003FFFF.
ANR0529W Transaction failed for session 2110 for node ____ (AIX) -
insufficient memory.
In addition, during backups of HSM file systems the client suffers a
"Segmentation fault". The server has to be halted and restarted to
regain the leaked memory.
I found APARs IC19333, IX72593, IX72583 addressing the issue:
ABSTRACT:
IX72583: ANR9999D <PKSHMEM.C>(753): WARNING: ATTEMPT TO FREE OUT OF RANGE
ALIGNED BLOCK: HEAP=3: ADDRESS=AAAA, LOW=NNNN, HIGH=HHHH
ORIGINATING DETAILS:
Environment: ADSM for AIX servers, versions 2.1.0.13 and
2.1.0.15.
Scenario: Backup or archive of HSM migrated files.
Problem: When backing up or archiving HSM migrated files, the
following message appears on the ADSM server (there may be
multiple messages issued):
ANR9999D <pkshmem.c>(753): Warning: attempt to free out of
range aligned block: Heap=3: Address=70040000,
Low=60000000, High=6003FFFF.
(Note that the values for "Address", "Low", and "High", as well
as the number in parentheses to the right of "<pkshmem.c>" may
differ, depending on the server version and platform.)
This will cause a memory leak of 256 KB for every occurrence
of the ANR9999D message. For example, if you see 20 occurrences
of the message, this will result in a leak of 5 MB.
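
To gauge how much memory a running server has lost, the arithmetic above can be applied to a saved copy of the activity log. The following is only a rough sketch: the 256 KB figure is taken from the APAR text, while the function name, the regular expression, and the sample text are my own illustrative assumptions, not anything supplied with ADSM.

```python
import re

# Assumption: 256 KB is leaked per message, as stated in the APAR text.
LEAK_PER_MESSAGE_KB = 256

# Matches the warning even when the log wraps it across lines. The angle
# brackets around the module name and the line number may vary by level.
_WARNING = re.compile(
    r"ANR9999D\s+<?pkshmem\.c>?\(\d+\):\s+Warning:\s+"
    r"attempt\s+to\s+free\s+out\s+of\s+range",
    re.IGNORECASE,
)


def estimate_leak_kb(log_text: str) -> int:
    """Return the estimated leak in KB for the given activity log text."""
    return len(_WARNING.findall(log_text)) * LEAK_PER_MESSAGE_KB


# Hypothetical sample: one occurrence of the message, repeated 20 times.
sample = (
    "ANR9999D pkshmem.c(753): Warning: attempt to free out of\n"
    "range aligned block: Heap=3: Address=70100000,\n"
    "Low=60000000, High=6003FFFF.\n"
)
print(estimate_leak_kb(sample * 20) / 1024, "MB")
```

With 20 occurrences this reports 5 MB, matching the APAR's example.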
The problem occurs because ADSM is trying to free a buffer
to a heap that is different than the one from which it was
allocated. Module sstrans.c allocates buffers from the
XLargeHeap by calling pkAllocXLargeBuffer(). When it is
ready to free one of those buffers, it should call
pkFreeXLargeBuffer() in order to free it to the XLargeHeap.
Instead, it is calling pkFreeLargeBuffer() and trying to
free it to the LargeHeap. ADSM recognizes this condition and
issues the above ANR9999D message.
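
The mismatch described above can be modelled in a few lines. This is a toy sketch in Python, not ADSM code: the Heap class, the block sizes, and the XLargeHeap address range are assumptions chosen to resemble the values in the message, and the real pkAllocXLargeBuffer() and pkFreeLargeBuffer() routines are only imitated by the alloc() and free() methods.

```python
class Heap:
    """Toy fixed-block heap covering the address range [low, high]."""

    def __init__(self, name, low, high, block_size):
        self.name, self.low, self.high = name, low, high
        self.block_size = block_size
        self.free_blocks = list(range(low, high + 1, block_size))
        self.in_use = set()

    def alloc(self):
        """Hand out the next free block and record it as in use."""
        addr = self.free_blocks.pop(0)
        self.in_use.add(addr)
        return addr

    def free(self, addr):
        """Return a block to this heap; reject addresses outside its range."""
        if not (self.low <= addr <= self.high):
            print(f"Warning: attempt to free out of range aligned block: "
                  f"Address={addr:08X}, Low={self.low:08X}, "
                  f"High={self.high:08X}")
            return False  # nothing is released, so the block stays leaked
        self.in_use.discard(addr)
        self.free_blocks.append(addr)
        return True


# Assumed layouts: the LargeHeap range is copied from the message; the
# XLargeHeap range and both block sizes are hypothetical.
large = Heap("LargeHeap", 0x60000000, 0x6003FFFF, 0x10000)
xlarge = Heap("XLargeHeap", 0x70000000, 0x703FFFFF, 0x40000)

buf = xlarge.alloc()   # stands in for pkAllocXLargeBuffer()
large.free(buf)        # the bug: freed to the wrong heap, warning issued
leaked_kb = len(xlarge.in_use) * xlarge.block_size // 1024
print("Leaked:", leaked_kb, "KB")

xlarge.free(buf)       # the intended call: free to the XLargeHeap
```

In this model each misdirected free leaves one 256 KB block marked in use on the XLargeHeap, which is consistent with the per-message leak the APAR reports.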
I think we'd all like to see more thorough code review and testing in
ADSM development, given the serious defects we've been seeing getting
out into the field.
Richard Sims, Boston University OIT