ADSM-L

CLEANUP EXPTABLE / SHOW VERIFYEXPTABLE

2006-04-06 16:08:20
Subject: CLEANUP EXPTABLE / SHOW VERIFYEXPTABLE
From: Josh-Daniel Davis <xaminmo AT OMNITECH DOT NET>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Thu, 6 Apr 2006 15:03:01 -0500
Does anyone know how to tell how big the expiration table is?

The reason is that I ran CLEANUP EXPTABLE on Monday.
On one of my servers, it finished up almost immediately.
On the other, it's been running for almost 3 days.


Because of this, when I try to run EXPIRE INV, I get:

tsm: SERVER>expire inv
ANS8001I Return code 4.


tsm: SERVER>q act begint=-00:01
04/06/06 14:43:34 ANR2017I Administrator OPERATOR issued command: EXPIRE
   INVENTORY (SESSION: 239372)
04/06/06 14:43:34 ANR4298I Expiration thread already processing - unable
   to begin another expiration process. (SESSION: 239372)
04/06/06 14:43:34 ANR2017I Administrator OPERATOR issued command: ROLLBACK
   (SESSION: 239372)


It doesn't show up in Q PROC, and tracing IM* and more only shows failure to obtain the lock.


I know it's running because of SHOW THREAD and Q ACT.


SHOW THREAD shows this:

Thread 129: ImVerifyExpTabThread
 tid=33076, ktid=2588793, ptid=0, det=1, zomb=0, join=0, result=0, sess=0
Awaiting cond waitP->waiting (0x18d5ffe20), using mutex TMV->mutex (0x111b091f8), in tmLock (0x100041a08)
  Stack trace:
    0x0900000000382554 _cond_wait_global
    0x0900000000382f64 _cond_wait
    0x0900000000383a2c pthread_cond_wait
    0x000000010000d91c pkWaitCondition
    0x0000000100041a10 tmLock
    0x000000010016cc5c ImLockFsId
    0x000000010016cafc ImLockFileSpace
    0x000000010067f168 LockFilespace
    0x0000000100681710 ImVerifyExpTabThread
    0x000000010000e9dc StartThread
    0x090000000036c50c _pthread_body



Q ACT has been showing a great many of these message pairs:

04/06/06 13:19:04 CLEANUP EXPTABLE: **** resetting 'hasactive', objId=0:574058714 **** (SESSION: 134418)

04/06/06 13:19:04 CLEANUP EXPTABLE: !!!! 'HasActive' flag set incorrectly for objId=0:574877220 (\ADSM.SYS\ÿþÿÿÿþCLUSTERDBÿþÿÿÿþ), nodeName=NODENAME, fsName=\\nodename\c$ÿþÿÿÿþ !!!! (SESSION: 134418)

There are 125000 lines of this in the actlog in the last 67 hours.
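One way to get more out of those lines is to tally them from a saved copy of the activity log. The sketch below is only an illustration: the regular expressions are modelled on the two message shapes quoted above, the function name tally is made up for this example, and it assumes each message sits on a single line, which a real Q ACT dump may not guarantee.

```python
import re
from collections import Counter

# Patterns modelled on the two message shapes quoted in this post.
# Assumption: each message is on one line in the saved actlog output.
RESET_RE = re.compile(
    r"CLEANUP EXPTABLE: \*{4} resetting 'hasactive', objId=(\d+):(\d+)"
)
FLAG_RE = re.compile(
    r"CLEANUP EXPTABLE: !{4} 'HasActive' flag set incorrectly for "
    r"objId=(\d+):(\d+) .*nodeName=([^,]+), fsName=(.*?) !{4}"
)

def tally(lines):
    """Count 'resetting' messages and group 'flag set incorrectly'
    messages by (nodeName, fsName)."""
    resets = 0
    per_filespace = Counter()
    for line in lines:
        if RESET_RE.search(line):
            resets += 1
            continue
        match = FLAG_RE.search(line)
        if match:
            per_filespace[(match.group(3), match.group(4))] += 1
    return resets, per_filespace
```

Grouping by node and filespace at least shows which filespaces the cleanup has touched so far, even though it says nothing about how many remain.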

The object IDs are not in order, so I have no idea how much longer it's going to run.
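The figures above do give a throughput, even if they can't give a remaining time. A rough sketch of the arithmetic, assuming each corrected object produces exactly one pair of actlog lines (the constant LINES_PER_OBJECT and the function name are my own labels, not anything TSM reports):

```python
ACTLOG_LINES = 125_000   # CLEANUP EXPTABLE lines seen in the actlog
ELAPSED_HOURS = 67       # hours covered by those lines
LINES_PER_OBJECT = 2     # assumption: one message pair per corrected object

def correction_rate(lines, hours, lines_per_object=LINES_PER_OBJECT):
    """Objects corrected per hour, given the actlog line count."""
    return lines / lines_per_object / hours

rate = correction_rate(ACTLOG_LINES, ELAPSED_HOURS)
print(f"{rate:.0f} objects corrected per hour")  # prints "933 objects corrected per hour"
```

This only measures how fast corrections are being logged. Objects that are checked and found to be fine leave no trace in the actlog, so the real scan rate is presumably much higher.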

I'm assuming it will parse the entire Expiring.Objects table.


SHOW OBJDIR
... Expiring.Objects(78)


SHOW NODE 78
It's a b-tree root node with 99 subnodes.
It's 4 levels deep, and each node has a different number of children.
MaxCapacity is 1004, so potentially 1004^4 entries (roughly 10^12).
Manually traversing the tree isn't feasible.
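To put numbers on that upper bound, here is a small sketch. It assumes the root counts as one of the four levels and that every node below it could be filled to MaxCapacity; neither assumption is confirmed by the SHOW NODE output, and the function names are invented for this example.

```python
MAX_CAPACITY = 1004   # MaxCapacity reported by SHOW NODE 78
LEVELS = 4            # tree depth reported by SHOW NODE 78
ROOT_SUBNODES = 99    # subnodes under the root node

def full_tree_bound(capacity, levels):
    """Upper bound if every node at every level were full."""
    return capacity ** levels

def root_aware_bound(root_children, capacity, levels):
    """Tighter bound using the actual root fan-out.
    Assumption: the root is one of the `levels` levels."""
    return root_children * capacity ** (levels - 1)

print(f"{full_tree_bound(MAX_CAPACITY, LEVELS):,}")                   # 1,016,096,256,256
print(f"{root_aware_bound(ROOT_SUBNODES, MAX_CAPACITY, LEVELS):,}")   # 100,192,758,336
```

Both figures are ceilings only. Since each node has a different number of children, the real entry count is likely far lower, but there is no way to tell how much lower without walking the tree.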


SHOW TREE Expiring.Objects
This just hangs for a long time.


I'm leaving it running, with output redirected to a file, but it's been 10 minutes so far on both the server that completed the CLEANUP quickly and the one that didn't.

When expiration was OK, it would take 6-10 hours with SKIPD=YES and up to a day and a half with SKIPD=NO, vs 4 hours on the "good" server.

There's no CANCEL CLEANUP or similar.

I'm hesitant to kill off the server and restart it simply because of the number of objects it's correcting.

So, should I just wait for the SHOW TREE to complete, or is there some other, faster and simpler way to see how big the table is?


Thanks for any assistance.

-Josh-Daniel Davis