ADSM-L

3.1.2.30 detailed problem description

1999-07-21 19:42:12
Subject: 3.1.2.30 detailed problem description
From: Tom Brooks <tbrooks AT VNET.IBM DOT COM>
Date: Wed, 21 Jul 1999 16:42:12 PDT
The following further describes the EXPIRATION problem at ADSM Server Level
3.1.2.30. From this you should be able to determine how much your
installation is or is not affected by the problem.
*****************************************************
A problem has been identified with ADSM server expiration
that impacts customers using the following server
levels:

3.1.2.23
3.1.2.24
3.1.2.30

Please note that 3.1.2.23 and 3.1.2.24 were fixtest
levels that had to be explicitly retrieved off
index.storsys.ibm.com.

The 3.1.2.30 level is the latest service level for
the server available as of June 30, 1999.

The problem, documented by APAR PQ29033, is caused by a
change made to expiration processing for improved
performance.  An unintended side effect of the change is
that data that should be managed by the RETAIN ONLY
parameter in the copy group may instead be managed by the
RETAIN EXTRA parameter. This can cause incorrect deletion of
file versions from the server.

The expiration processing error occurs only under a specific
combination of conditions:

1) No active versions of a file exist on the server.
2) All of the inactive versions of the file have been inactive
for at least the number of days equal to the RETAIN EXTRA
parameter value.
3) Expiration is not run frequently.
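
As a rough aid for judging exposure, the first two conditions can be restated as a small check. This is only a sketch: the function name and its inputs are hypothetical, the per-version inactive ages would have to come from your own records, and condition 3 (how often expiration runs) is a property of the server schedule that is not modeled here.

```python
def may_be_affected(has_active_version, days_inactive, retextra_days):
    """Restate conditions 1 and 2 for a single file (hypothetical helper).

    has_active_version: True if the server still holds an active version.
    days_inactive: number of days each inactive version has been inactive.
    retextra_days: the RETAIN EXTRA value, in days.
    """
    if has_active_version or not days_inactive:
        return False  # condition 1 not met, or nothing left to expire
    # Condition 2: every inactive version has reached RETAIN EXTRA.
    return all(days >= retextra_days for days in days_inactive)


# Two inactive versions, both inactive for 5 days, RETAIN EXTRA of 5 days.
print(may_be_affected(False, [5, 5], 5))
```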

To help prevent the error from affecting the data on a server,
you can do the following:

1) Prevent expiration processing from running.
Set the server option "EXPINTERVAL" to zero (0).  This
disables automatic expiration.  This requires the server to be
halted and restarted in order to acknowledge the new
expiration interval.

Also, any scheduled server administrative commands that run expiration
should be disabled.

2) Set the "RETAIN EXTRA" value to "NOLIMIT" in the needed
copygroups and reactivate the server policy sets.  This will
prevent expiration from incorrectly deleting files until the
fixed server is available.  This would also allow limited
expiration to be performed.  Specifically, expiration would
still be done for backup files that exceed the "VERSIONS EXIST"
policy attribute.  To make this change, issue the following commands:

"update copygroup xxx  yyy  zzz RETEXTRA=nolimit"  where xxx is the
domain name, yyy is the policy set name, and zzz is the management class name.
"activate policyset xxx yyy" where xxx is the domain name
and yyy is the policy set name referenced in the "update copygroup"
command above.
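
If several copy groups need the change, the command text can be generated rather than typed by hand. The sketch below only builds strings in the form quoted above; the helper name and the domain, policy set, and management class names are hypothetical, and the output still has to be reviewed and issued from an administrative session.

```python
def retextra_workaround(copygroups):
    """Build the workaround commands for (domain, policy_set, mgmt_class) tuples."""
    commands = []
    policy_sets = []  # each policy set only needs to be activated once
    for domain, policy_set, mgmt_class in copygroups:
        commands.append(
            f"update copygroup {domain} {policy_set} {mgmt_class} RETEXTRA=nolimit"
        )
        if (domain, policy_set) not in policy_sets:
            policy_sets.append((domain, policy_set))
    # Activate after all updates so the changes take effect together.
    commands += [f"activate policyset {d} {p}" for d, p in policy_sets]
    return commands


# Hypothetical names, for illustration only.
cmds = retextra_workaround([("DOM1", "PS1", "MC_A"), ("DOM1", "PS1", "MC_B")])
for line in cmds:
    print(line)
```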

******************************************************
The following example illustrates the problem.
Assume the following situation:
source directory:  "/data"
files in directory: "file1"
Policy attributes for test:
 --> Versions exist:  4 versions
 --> Versions deleted:  2 versions
 --> Retain Extra: 5 days
 --> Retain Only: No limit
 Expiration  processing is run irregularly and not every day

(** Please note in the example below that the date
format mm/dd/yy is used)

The following sequence of events occurs:

1) On 07/20/99, the client performs an incremental backup
of the directory (command: incremental /data/)

This results in two entries stored on the server,
one for the directory and one for the file.
Please note that the entry for "file1" is
the active version of the file at this point.
The timestamp for "file1" is 07/20/99 12:00:00.

2) The client updates "file1"

3) The client again performs an incremental backup
of the directory (command: incremental /data/)

This results in one entry being sent to the
server.  In this case, the version of "file1" that was already
on the server becomes inactive,  and becomes the "oldest"
inactive copy.  The latest version of "file1" is
now the active copy of the file with a
timestamp of 07/20/99 13:00:00.



4) The client deletes "file1" from the directory.

5) For the third time, the client again performs an incremental backup
of the directory (command: incremental /data/)

Because "file1" has been deleted from the client filesystem,
the server marks the latest version of "file1" as inactive.
The server now has two inactive versions of "file1".
This happens on 07/20/99 sometime after 13:00:00.

6) Five days pass. Expiration processing is not run during this time.

7) Expiration is run on 07/25/99 at 14:00:00.  With the current
error in the server's expiration processing, the server incorrectly
deletes both inactive versions of "file1".

The server should recognize that the version with the timestamp
07/20/99 13:00:00 should be retained because it is the last
copy of the file on the server (RETAIN ONLY parameter set to NOLIMIT).
However, because of the error in the server, the server
groups the inactive files together without
evaluating whether one or more files may need to be managed
by using the RETAIN ONLY parameter.  The server incorrectly
evaluates whether to delete the inactive files based only
on the value of the RETAIN EXTRA parameter.
Because both versions of "file1" have been inactive for
5 days (RETAIN EXTRA parameter), the server deletes them both.
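
The difference between the intended and the faulty decision can be sketched as a toy model of this example. It is not the server's actual algorithm: the function names are hypothetical, and the 13:30 deactivation time of the second version is assumed, since the example only says "sometime after 13:00:00".

```python
from datetime import datetime, timedelta

RETAIN_EXTRA = timedelta(days=5)  # "Retain Extra: 5 days"
RETAIN_ONLY = None                # None stands for "No limit"

# Inactive versions of "file1": (backup timestamp, time it became inactive).
versions = [
    (datetime(1999, 7, 20, 12, 0), datetime(1999, 7, 20, 13, 0)),
    (datetime(1999, 7, 20, 13, 0), datetime(1999, 7, 20, 13, 30)),  # assumed
]


def past_retain_extra(version, now):
    """True if the version has been inactive for at least RETAIN EXTRA."""
    return now - version[1] >= RETAIN_EXTRA


def expire_intended(versions, now):
    """Judge the newest version by RETAIN ONLY and the rest by RETAIN EXTRA."""
    if not versions:
        return []
    ordered = sorted(versions)
    newest, older = ordered[-1], ordered[:-1]
    kept = [v for v in older if not past_retain_extra(v, now)]
    if RETAIN_ONLY is None or now - newest[1] < RETAIN_ONLY:
        kept.append(newest)
    return kept


def expire_faulty(versions, now):
    """Judge every grouped inactive version by RETAIN EXTRA alone."""
    return [v for v in versions if not past_retain_extra(v, now)]


run_time = datetime(1999, 7, 25, 14, 0)
survivors_intended = expire_intended(versions, run_time)
survivors_faulty = expire_faulty(versions, run_time)
print(len(survivors_intended), len(survivors_faulty))
```

In this model the intended logic leaves the 13:00:00 version on the server, while the faulty logic leaves nothing.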

**********************************************************

Other factors that contribute to the processing error in this example
are:

1) Expiration was run only periodically. In the example
above, it was only run on 7/25/99.  It was not run
between 7/20/99 and 7/25/99.
2) All the versions of the file were eligible for deletion
under the "RETAIN EXTRA" parameter.

**********************************************************
The following example illustrates a situation where
the server would handle the file versions correctly:

source directory:  "/data"
files in directory: "file1"
Policy attributes for test:
 --> Versions exist:  4 versions
 --> Versions deleted:  2 versions
 --> Retain Extra: 5 days
 --> Retain Only: No limit
 Expiration is run each day at 20:00:00 and it runs to completion.

The following sequence of events occurs:

1) On 07/20/99, the client performs an incremental backup
of the directory (command: incremental /data/)

This results in two entries stored on the server,
one for the directory and one for the file.
Please note that the entry for "file1" is
the active version of the file at this point.
The timestamp for "file1" is 07/20/99 12:00:00.

2) The client updates "file1"

3) The next day, the client again performs an incremental backup
of the directory (command: incremental /data/)

This results in one entry being sent to the
server.  In this case, "file1" that was already
on the server is inactivated and becomes the "oldest"
inactive copy.  The latest version of "file1" is
now the active copy of the file with a
timestamp of 07/21/99 13:00:00.

4) The client deletes "file1" from the directory

5) On 7/22/99, the client performs an incremental backup
of the directory (command: incremental /data/)

This results in the active version of "file1" being
deactivated, as there is no longer a version of the
file on the client filesystem.  There are now no active
copies of the data on the server.  This happens on 07/22/99
sometime after 13:00:00.

6) Expiration runs normally on 07/25/99 at 20:00:00.  In this case,
the oldest inactive file (the file with timestamp 07/20/99 12:00:00)
is deleted.  It is eligible for deletion because the "RETAIN EXTRA"
value of 5 days indicates that this file should not be kept after
5 days.

Please note that at this point, the copy of "file1" with the
timestamp 07/21/99 13:00:00 is not eligible for expiration.

7) Expiration runs normally on 07/26/99 at 20:00:00. In this
case, it determines that the copy of "file1" with timestamp
07/21/99 13:00:00 is the ONLY version of the file on the server.
It now correctly manages the file using the "RETAIN ONLY"
parameter and will keep the file indefinitely.

Reasons why expiration worked properly in this case:

1) Not all versions of the file were eligible for expiration
at the time that expiration was run.
2) Expiration was run on a regular interval.
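
Why regular runs avoid the error can be sketched run by run. This is a toy model under stated assumptions, not the server's code: the function name is hypothetical, a lone remaining version is taken to be handled correctly under RETAIN ONLY (as in step 7), and which versions have exceeded RETAIN EXTRA at each run is supplied by hand from the narrative. The final call is a hypothetical contrast in which expiration is skipped until both versions have exceeded RETAIN EXTRA.

```python
def faulty_expire(inactive, past_retain_extra):
    """Return the inactive versions left after one expiration run.

    Toy model of the faulty levels when no active version exists:
    a lone version is kept (RETAIN ONLY = No limit), but when several
    inactive versions are grouped, each is judged by RETAIN EXTRA alone.
    """
    if len(inactive) <= 1:
        return list(inactive)
    return [v for v in inactive if v not in past_retain_extra]


v1, v2 = "07/20/99 12:00:00", "07/21/99 13:00:00"

# Daily runs, as in the example: only v1 is eligible on 07/25/99.
after_0725 = faulty_expire([v1, v2], past_retain_extra={v1})
after_0726 = faulty_expire(after_0725, past_retain_extra={v1, v2})

# Hypothetical contrast: a single late run after both versions are eligible.
after_single_late_run = faulty_expire([v1, v2], past_retain_extra={v1, v2})

print(after_0725, after_0726, after_single_late_run)
```

With daily runs the newer version is already the only one left by the time it exceeds RETAIN EXTRA, so it is never grouped with another eligible version.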

*********************************************************


Colin Dawson
ADSM Server Development
redhead AT us.ibm DOT com

Tom Brooks
ADSM Server Development
tbrooks AT us.ibm DOT com