[ADSM-L] ANR0361E Database initialization failed

2010-09-07 19:55:13
Subject: [ADSM-L] ANR0361E Database initialization failed
From: Lance Nakata <LNakata AT SLAC.STANFORD DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Tue, 7 Sep 2010 16:53:34 -0700
I'm running Tivoli Storage Manager for Solaris x86_64 Version 5,
Release 5, Level 4.2 on a Sun X4540 "Thor".  I was at 79.8% DB
utilization:

tsm: TSM1>q db

Available   Assigned     Maximum     Maximum      Page        Total         Used     Pct    Max.
    Space   Capacity   Extension   Reduction      Size       Usable        Pages    Util     Pct
     (MB)       (MB)        (MB)        (MB)   (bytes)        Pages                         Util
---------   --------   ---------   ---------   -------   ----------   ----------   -----   -----
  512,008    204,800     307,208      41,312     4,096   52,428,800   41,862,737    79.8    79.8
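As a sanity check on the figures in that output, here is a minimal Python sketch that recomputes the total usable pages and the utilization percentage. It assumes that "MB" means 2**20 bytes and that utilization is simply used pages over total usable pages; the function names are illustrative and are not part of TSM.

```python
# Values copied from the QUERY DB output in this post.
ASSIGNED_CAPACITY_MB = 204_800
PAGE_SIZE_BYTES = 4_096
USED_PAGES = 41_862_737

BYTES_PER_MB = 1024 * 1024  # assumption: TSM "MB" means 2**20 bytes


def total_usable_pages(assigned_mb, page_size=PAGE_SIZE_BYTES):
    """Number of database pages that fit in the assigned capacity."""
    return assigned_mb * BYTES_PER_MB // page_size


def pct_util(used_pages, total_pages):
    """Utilization as a percentage, rounded to one decimal place."""
    return round(100.0 * used_pages / total_pages, 1)


total = total_usable_pages(ASSIGNED_CAPACITY_MB)
print(total)                        # 52428800
print(pct_util(USED_PAGES, total))  # 79.8
```

Both results agree with the Total Usable Pages and Pct Util columns reported by the server.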

So I decided to extend it by another 100,000 MB (about 100 GB), for a total of roughly 300 GB:

tsm: TSM1>extend db 100000
ANR2248I Database assigned capacity has been extended.

The log indicated success but immediately started complaining:

ANR2248I Database assigned capacity has been extended.
ANR0207E Page address mismatch detected on database volume /dev/rdsk/c0t7d0s4, logical page 0 (physical page 256); actual: 67108864.
ANR9999D_2633473939 LvmNormalRead(lvmread.c:925) Thread<1404690>: Contents of page buffer:

ANR0207E Page address mismatch detected on database volume /dev/rdsk/c1t7d0s4, logical page 0 (physical page 256); actual: 67108864.
ANR9999D_2633473939 LvmNormalRead(lvmread.c:925) Thread<1404690>: Contents of page buffer:
ANR0248E Unable to read database page 0 from any alternate copy.
ANR9999D_1794209702 icStartBackup(ic.c:319) Thread<1404690>: Error reading space map page (smpNum=0, smpAddr=0) from disk.
ANR4581W Database backup/restore terminated - internal server error detected.
ANR4560I Triggered database backup will be retried in 60 seconds.

This error loops every 60 seconds.  I then found
http://www-01.ibm.com/support/docview.wss?uid=swg1IC61252 which
indicates this error (DB size over 262,140 MB) was fixed in 5.5.4 (I'm
running 5.5.4.2).
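To make the arithmetic behind that limit explicit, here is a small Python sketch using only figures quoted in this post. It assumes 1 MB = 2**20 bytes, and the helper names are hypothetical; it does not talk to a TSM server.

```python
# Figures quoted in this post; nothing here queries a real server.
LIMIT_MB = 262_140               # DB size limit named in the APAR, as quoted above
ASSIGNED_BEFORE_MB = 204_800     # assigned capacity before EXTEND DB
EXTENSION_MB = 100_000           # argument given to EXTEND DB
PAGE_SIZE_BYTES = 4_096          # page size from QUERY DB
REPORTED_PAGE_ADDR = 67_108_864  # "actual:" value in the ANR0207E messages

PAGES_PER_MB = (1024 * 1024) // PAGE_SIZE_BYTES  # assumes 1 MB = 2**20 bytes


def capacity_after_extend(before_mb, extension_mb):
    """Assigned capacity once the extension has been applied."""
    return before_mb + extension_mb


def overshoot_mb(capacity_mb, limit_mb=LIMIT_MB):
    """MB above the limit; a negative value means headroom remains."""
    return capacity_mb - limit_mb


def pages_to_mb(pages):
    """Convert a page count (or page address) to MB."""
    return pages / PAGES_PER_MB


after_mb = capacity_after_extend(ASSIGNED_BEFORE_MB, EXTENSION_MB)
print(after_mb)                          # 304800
print(overshoot_mb(ASSIGNED_BEFORE_MB))  # -57340
print(overshoot_mb(after_mb))            # 42660
print(REPORTED_PAGE_ADDR == 2 ** 26)     # True
print(pages_to_mb(REPORTED_PAGE_ADDR))   # 262144.0
```

So the database had 57,340 MB of headroom before the extension and sits 42,660 MB over the quoted limit afterwards. The page address in the ANR0207E messages is exactly 2**26, which at 4,096-byte pages corresponds to 262,144 MB, just above that limit; whether the two are actually related is only a guess.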

I also found IBM note
http://www-01.ibm.com/support/docview.wss?uid=swg21386330 which said
to set MIRRORREAD DB VERIFY in dsmserv.opt and restart TSM.
However, that does not work; the server still fails to start:

Tivoli Storage Manager for Solaris x86_64
Version 5, Release 5, Level 4.2

Licensed Materials - Property of IBM

(C) Copyright IBM Corporation 1990, 2009.
All rights reserved.
U.S. Government Users Restricted Rights - Use, duplication or disclosure
restricted by GSA ADP Schedule Contract with IBM Corporation.

ANR7800I DSMSERV generated at 12:43:45 on Mar 15 2010.
ANR7801I Subsystem process ID is 6464.
ANR4726I The ICC support module has been loaded.
ANR0990I Server restart-recovery in progress.
ANR0200I Recovery log assigned capacity is 12296 megabytes.
ANR0201I Database assigned capacity is 304800 megabytes.
ANR0306I Recovery log volume mount in progress.
ANR0207E Page address mismatch detected on database volume /dev/rdsk/c0t7d0s4, logical page 0 (physical page 256); actual: 67108864.
ANR0207E Page address mismatch detected on database volume /dev/rdsk/c1t7d0s4, logical page 0 (physical page 256); actual: 67108864.
ANR0248E Unable to read database page 0 from any alternate copy.
ANR9999D_3229212849 DbAllocInit(dballoc.c:1353) Thread<1>: Error reading space map page from disk.
ANR0361E Database initialization failed: error initializing database page allocator.

Any ideas on how to recover from this non-destructively?  If I could
somehow shrink the DB back to 250GB, under the 262,140 MB limit, would
it be happy again?  All help appreciated.  Thank you.

Lance Nakata
SLAC National Accelerator Laboratory