ADSM-L

URGENT: ADSM Version 3 Reclamation Problem (IX74458)

1998-01-27 17:56:20
From: Michael Kaczmarski <kacz AT US.IBM DOT COM>
Date: Tue, 27 Jan 1998 17:56:20 -0500

On January 22, 1998, ADSM development discovered an error in the ADSM
Version 3 server reclamation processing for aggregated client files.
Under rare circumstances (less than 2% of all aggregates reconstructed
using default settings) the reclamation function will modify the last
buffer of data when reconstructing an aggregate file.  As a result, the
last file (or set of files if they are very small) in an aggregate may
not contain valid data.

A new construct was added in ADSM version 3 to improve performance and
lower overhead by automatically packaging small client files into larger
objects for management on the ADSM server.  The larger objects are
called "aggregates".  Reconstruction processing occurs during server
reclamation processing to remove vacant space from aggregates.  Vacant
space is introduced when one or more client files in an aggregate are
expired by storage management policy (e.g. versioning and/or retention).
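
As a rough illustration only, the following Python sketch models an aggregate as a byte string plus a table of member files, and shows what removing vacant space amounts to. The layout, the Member record, and the reconstruct function are hypothetical; they are not the ADSM server's internal format or code.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Member:
    name: str
    offset: int    # start of the file's data inside the aggregate
    length: int    # number of bytes belonging to the file
    expired: bool  # True once storage policy has expired this file


def reconstruct(data: bytes, members: List[Member]) -> Tuple[bytes, List[Member]]:
    """Copy only unexpired files into a new, gap-free aggregate (toy model)."""
    out = bytearray()
    kept = []
    for m in members:
        if m.expired:
            continue  # vacant space: simply not copied
        new_offset = len(out)
        out += data[m.offset:m.offset + m.length]
        kept.append(Member(m.name, new_offset, m.length, False))
    return bytes(out), kept


# Hypothetical aggregate holding three small client files; b.txt has expired.
data = b"AAAABBBBBBCC"
members = [
    Member("a.txt", 0, 4, False),
    Member("b.txt", 4, 6, True),
    Member("c.txt", 10, 2, False),
]
new_data, new_members = reconstruct(data, members)
print(new_data)
print([(m.name, m.offset, m.length) for m in new_members])
```

In this toy model the last surviving file is the one whose bytes are copied last, which is the part of an aggregate that the error described above can damage.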


THE FOLLOWING ARE *NOT* AFFECTED BY THIS PROBLEM:

   - Customers using Version 1 or Version 2 ADSM Servers
   - ADSM client space management data on any version of the ADSM server
   - Data backed up or archived to a version 2 server
       prior to upgrade to version 3
   - Data that has never been reclaimed by a version 3 server


Updated servers to correct this problem are available via anonymous ftp
to index.storsys.ibm.com:

   AIX:        /adsm/fixes/v3r1/aixsrv/IX74458.bff
        (use SMIT to install the image)
   Windows NT: /adsm/fixes/v3r1/ntsrv/ix74458.exe
               /adsm/fixes/v3r1/ntsrv/ix74458.txt
               /adsm/fixes/v3r1/ntsrv/unpack.bat
   MVS:        A ZAP that can be applied to the MVS Version 3 server to
               correct the problem is located in
               /adsm/fixes/v3r1/mvssrv/IX74458.ZAP


The updated server will not reconstruct aggregates during reclamation.
Reclamation processing WILL copy aggregates to new media to free
up volumes, but empty space will still be left in the aggregates.

IBM recommends that you obtain and install the corrected version of the
server as soon as possible.  If this cannot be done for some reason,
we recommend that you set your reclamation threshold to 100% to prevent
automatic reclamation, and use the MOVE DATA command to manually reclaim
media, if needed.
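
The interim workaround can be sketched as a short list of administrative commands. The Python helper below only builds the command text; the pool name TAPEPOOL and the volume names are hypothetical placeholders, and the exact command syntax should be checked against the Administrator's Reference for your server level before anything is issued.

```python
from typing import List


def workaround_commands(pool: str, volumes: List[str]) -> List[str]:
    """Build admin commands for the interim workaround (illustrative only)."""
    # A reclamation threshold of 100 prevents automatic reclamation.
    cmds = [f"UPDATE STGPOOL {pool} RECLAIM=100"]
    # MOVE DATA empties a chosen volume manually when media must be freed.
    cmds += [f"MOVE DATA {vol}" for vol in volumes]
    return cmds


# Hypothetical pool and volume names, for illustration only.
commands = workaround_commands("TAPEPOOL", ["VOL001", "VOL002"])
for line in commands:
    print(line)
```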


To Assist in Recovery from this Error:

In the next few weeks IBM will deliver an updated server with utility
commands that can be used to locate and deal with client files that have
been affected by this problem.  The functions will identify files that
may have been affected by this error so that they can be examined or
removed from the ADSM server.  These utility commands will be included
in all future version 3 server deliverables. Detailed documentation
will describe the use of the tools.


Common Questions:

1) What are the technical details of the problem ?

   During reclamation, file aggregates may be "reconstructed" so
   that space, which is vacant due to expiration, is "squeezed" out of the
   aggregate.  On rare occasions, a sequence of buffer copies leading up
   to the final buffer incorrectly calculates the source location for
   the data copy on the last buffer.  As a result, information is lost.

2) How prevalent is this problem ?

   Detailed calculations have shown that the error occurs in fewer
   than 2 of every 100 aggregate reconstructions when default settings
   are used.  Customers who use the USELARGEBUFFERS NO option in their
   server options file will be affected to a much higher degree.  Since
   this is NOT a default setting, and can actually decrease performance,
   we doubt that many have chosen this option.

3) Will reclamation continue to work with the updated/ZAPed server ?

   Yes, media will be emptied when the reclamation threshold is met,
   but file aggregates will not be reconstructed.  This may leave empty
   space in the aggregates.

4) When will full reclamation work again ?

   When the updated server utilities are delivered to identify and deal
   with files that have been affected, you will be able to resume normal
   reclamation/reconstruction processing after finishing your analysis and
   cleaning up the utility entries in the database.

5) Will the AUDIT VOLUME command detect this problem ?

   No. Very few bytes at the end of an aggregated client file are affected
   when the error occurs.  The byte range affected is not checked by an
   AUDIT VOLUME command.

6) Can you absolutely determine which files have been affected ?

   We can determine with certainty which aggregates have been
   reconstructed.  We can determine if the aggregate was affected by
   the last reconstruction that occurred.  We cannot determine how many
   times an aggregate has been reconstructed.  The utilities that we
   develop will have options to deal with the possibility that multiple
   reconstruction operations may have affected aggregates.
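
To show why repeated reconstruction matters (questions 2 and 6), the Python sketch below works through the arithmetic under two loudly stated assumptions that do not come from IBM's calculations: each reconstruction independently carries the same chance of error, and that chance is taken at the 2% upper bound quoted for default settings. The function name is hypothetical.

```python
def chance_affected(p_per_reconstruction: float, reconstructions: int) -> float:
    """Chance that at least one of n independent reconstructions hit the error."""
    # Complement of "every reconstruction was unaffected".
    return 1.0 - (1.0 - p_per_reconstruction) ** reconstructions


# Assumed upper bound of 2% per reconstruction (default settings).
for n in (1, 2, 5, 10):
    print(n, round(chance_affected(0.02, n), 4))
```

Under these assumptions the exposure of an aggregate grows with each reconstruction, and since the number of reconstructions cannot be determined, this is only a way to reason about bounds, not a measurement.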


The ADSM development team apologizes for the inconvenience and effort
that this problem may cause.  We will make every effort to assist in
your recovery from this situation.

Mike Kaczmarski
IBM Corporation
ADSM Development
kacz AT us.ibm DOT com