ADSM-L

Reclamation preempting reclamation?

2003-09-12 11:00:40
Subject: Reclamation preempting reclamation?
From: Chris Murphy <cmurphy AT IDL.STATE.ID DOT US>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Fri, 12 Sep 2003 09:01:06 -0600
Hello all,

Enviro: TSM Server 5.1.7.0 on Win2k SP4 (both servers)
SERVER1 source server, SERVER2 is the target server for the virtual volumes

Yesterday morning, during expiration, several reclamation processes were
kicked off for some onsite FILE volumes and several offsite virtual volumes
as well as a tape volume.  It appears from the log, that during the
reclamation of one of the offsite virtual volumes on the target server, a
process preempted the reclamation.  However, no processes were running, and
no restores were running on the target machine.  For some, reason, it
appears the target server cancels the session for the first reclamation with
the entry:

"ANR0494I Volume E:\TSMDATA\SERVER1\VV_BOISE_BKP1.DBB in use. Session 3610
for node SERVER1 (Windows) being preempted by higher priority operation."

Now, this volume was still mounted by process 292, as the mount retention
had not expired but the process had just ended.  As can be seen in the
ACTLOGS below, the second reclamation, process 294, attempts to reclaim its
volume, which happens to be located within the same FILE volume on the
target server.  It appears this process "preempts" the target server as the
volume was still mounted by the previous process, 292, which had just
completed.  So, my issues are this:

1.) Why would process 294 "preempt" process 292?  You can use the session
numbers (3610 for 292 and 3611 for 294) to see that it is really process 292
that is canceled, but it was *already* finished according to the logs on
the source server.

2.) As can then be seen further down, due to the cancellation, the source
server, SERVER1, sees this as a write error to the volume on the target.
Shouldn't this be handled more gracefully?  For instance, when reclamation
of a tape volume is preempted and canceled, the destination tape does not
experience a "write error" and get placed in READ ONLY mode.  So it appears
that when using virtual volumes, the process is not as graceful, and the
cancellation of the process by the target is not appropriately conveyed to
the source so that it may handle the issue gracefully.

SERVER 1 ACTLOG:

09/11/2003 08:10:08  ANR0984I Process 292 for SPACE RECLAMATION started in
the BACKGROUND at 08:10:08.
09/11/2003 08:10:08  ANR1040I Space reclamation started for volume
SERVER2.BFS.050166820, storage pool DIRECTORY_OFFSITE1 (process number 292).
09/11/2003 08:10:08  ANR8340I FILE volume H:\TSMDATA\SERVER1\00001018.BFS
mounted.
09/11/2003 08:10:08  ANR8340I FILE volume H:\TSMDATA\SERVER1\00001000.BFS
mounted.
09/11/2003 08:10:11  ANR1044I Removable volume
H:\TSMDATA\SERVER1\00001000.BFS is required for space reclamation.
09/11/2003 08:10:13  ANR8340I SERVER volume SERVER2.BFS.063289412 mounted.
09/11/2003 08:10:13  ANR1340I Scratch volume SERVER2.BFS.063289412 is now
defined in storage pool DIRECTORY_OFFSITE1.
09/11/2003 08:10:17  ANR8340I FILE volume H:\TSMDATA\SERVER1\00001030.BFS
mounted.
09/11/2003 08:11:41  ANR8341I End-of-volume reached for FILE volume
H:\TSMDATA\SERVER1\00001018.BFS.
09/11/2003 08:11:41  ANR8340I FILE volume H:\TSMDATA\SERVER1\00001033.BFS
mounted.
09/11/2003 08:12:32  ANR8340I FILE volume H:\TSMDATA\SERVER1\00001030.BFS
mounted.
09/11/2003 08:13:10  ANR1041I Space reclamation ended for volume
H:\TSMDATA\SERVER1\00001000.BFS.
09/11/2003 08:13:10  ANR8340I FILE volume H:\TSMDATA\SERVER1\00001000.BFS
mounted.
09/11/2003 08:13:10  ANR1044I Removable volume
H:\TSMDATA\SERVER1\00000FFF.BFS is required for space reclamation.
09/11/2003 08:13:11  ANR8340I FILE volume H:\TSMDATA\SERVER1\00000FFF.BFS
mounted.
09/11/2003 08:13:11  ANR0986I Process 292 for SPACE RECLAMATION running in
the BACKGROUND processed 4 items for a total of 4,164 bytes with a
completion state of SUCCESS at 08:13:11.
09/11/2003 08:13:11  ANR1041I Space reclamation ended for volume
SERVER2.BFS.050166820.
09/11/2003 08:13:11  ANR0984I Process 294 for SPACE RECLAMATION started in
the BACKGROUND at 08:13:11.
09/11/2003 08:13:11  ANR1040I Space reclamation started for volume
SERVER2.BFS.050166820, storage pool DIRECTORY_OFFSITE1 (process number 294).
09/11/2003 08:13:11  ANR1044I Removable volume SERVER2.BFS.050166820 is
required for space reclamation.
09/11/2003 08:13:11  ANR8340I SERVER volume SERVER2.BFS.050166820 mounted.
09/11/2003 08:13:12  ANR4383E Session failure, target server SERVER has
aborted current transaction; reason: Transaction failed.
09/11/2003 08:13:12  ANR1411W Access mode for volume SERVER2.BFS.063289412
now set to "read-only" due to write error.
09/11/2003 08:13:12  ANR1181E astxn.c514: Data storage transaction
0:485863386 was aborted.
09/11/2003 08:13:13  ANR2183W afmove.c4235: Transaction 0:485863386 was
aborted.
09/11/2003 08:13:13  ANR0985I Process 294 for SPACE RECLAMATION running in
the BACKGROUND completed with completion state FAILURE at 08:13:13.
09/11/2003 08:13:13  ANR1093W Space reclamation terminated for volume
SERVER2.BFS.050166820 - transaction aborted.
09/11/2003 08:13:13  ANR1042I Space reclamation for storage pool
DIRECTORY_OFFSITE1 will be retried in 60 seconds.


SERVER2 ACTLOG:

09/11/2003 07:10:17   ANR0406I Session 3610 started for node
SERVER1(Windows) (Tcp/Ip 10.222.2.8(4672)).
09/11/2003 07:13:13   ANR8340I FILE volume
E:\TSMDATA\SERVER1\VV_BOISE_BKP1.DBB mounted.
09/11/2003 07:13:15   ANR0406I Session 3611 started for node
SERVER1(Windows) (Tcp/Ip 10.222.2.8(4673)).
09/11/2003 07:13:15   ANR0494I Volume E:\TSMDATA\SERVER1\VV_BOISE_BKP1.DBB
in use. Session 3610 for node SERVER1(Windows) being preempted by higher
priority operation.
09/11/2003 07:13:15   ANR0490I Canceling session 3610 for node
SERVER1(Windows).
09/11/2003 07:13:15   ANR0487W Session 3610 for node SERVER1(Windows)
terminated - preempted by another operation.
09/11/2003 07:13:15   ANR8340I FILE volume
E:\TSMDATA\SERVER1\VV_BOISE_BKP1.DBB mounted.
09/11/2003 07:14:16   ANR0406I Session 3612 started for node
SERVER1(Windows) (Tcp/Ip 10.222.2.8(4674)).
09/11/2003 07:14:17   ANR0406I Session 3613 started for node
SERVER1(Windows) (Tcp/Ip 10.222.2.8(4675)).

The servers are in different time-zones which is the reason of the 1 hour
difference in the time index.  ACTLOGs modified only to remove extraneous
entries.

Chris Murphy
IT Network Analyst
Idaho Dept. of Lands
Office: (208) 334-0293
Mobile: (208) 863-1275
cmurphy AT idl.state.id DOT us
>From 101 Reasons you cannot find your SysAdmin:
#80 - The admin is emptying the bit buckets

<Prev in Thread] Current Thread [Next in Thread>
  • Reclamation preempting reclamation?, Chris Murphy <=