ADSM-L

[ADSM-L] Re: ANR0361E Database initialization failed

2010-09-08 02:24:18
Subject: [ADSM-L] Re: ANR0361E Database initialization failed
From: Daniel Sparrman <daniel.sparrman AT EXIST DOT SE>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Wed, 8 Sep 2010 08:26:16 +0200
Hi Lance

I'm not sure that the APAR you linked is actually connected to your
problem. Your issue is that after the extend, a DB backup is triggered which
fails to read the database, while the failure described in the APAR occurs
immediately upon format. The first APAR you linked is also about a volume size
over 250GB, not the size of the entire database. I'm guessing your database
volumes are below 250GB.

Do you know if the volume /dev/rdsk/c0t7d0s4 was allocated before the extend,
or if TSM started using it after you extended the database?

Are you able to start TSM (while still getting the errors) or does the server 
crash upon starting it?

If you're unable to start the server, I'd say your options are either a
database restore or, if you want to try to salvage the database, a database
audit.

Best Regards

Daniel Sparrman

-----"ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU> wrote: -----


To: ADSM-L AT VM.MARIST DOT EDU
From: Lance Nakata <LNakata AT SLAC.STANFORD DOT EDU>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
Date: 09/08/2010 01:53
Subject: ANR0361E Database initialization failed

I'm running Tivoli Storage Manager for Solaris x86_64 Version 5,
Release 5, Level 4.2 on a Sun X4540 "Thor".  I was at 79.8% DB
utilization:

tsm: TSM1>q db

Available  Assigned    Maximum    Maximum     Page       Total        Used    Pct   Max.
    Space  Capacity  Extension  Reduction     Size      Usable       Pages   Util    Pct
     (MB)      (MB)       (MB)       (MB)  (bytes)       Pages                      Util
---------  --------  ---------  ---------  -------  ----------  ----------  -----  -----
  512,008   204,800    307,208     41,312    4,096  52,428,800  41,862,737   79.8   79.8
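
As a sanity check on the figures in the q db output, the page count and utilization can be recomputed from the assigned capacity and the page size. This is only a sketch: it assumes the capacity is reported in units of 1,048,576 bytes and that Pct Util is simply used pages over total usable pages, rounded to one decimal. The function names are illustrative and not part of TSM.

```python
# Values copied from the q db output in the post.
PAGE_SIZE_BYTES = 4096
assigned_capacity_mb = 204_800
used_pages = 41_862_737


def usable_pages(capacity_mb, page_size=PAGE_SIZE_BYTES):
    """Number of pages that fit in capacity_mb (assumes 1 MB = 1,048,576 bytes)."""
    return capacity_mb * 1024 * 1024 // page_size


def pct_util(used, total):
    """Utilization as a percentage, rounded to one decimal place."""
    return round(100.0 * used / total, 1)


total_pages = usable_pages(assigned_capacity_mb)
print(total_pages)                        # 52428800
print(pct_util(used_pages, total_pages))  # 79.8
```

Both results agree with the Total Usable Pages and Pct Util columns, which suggests the wrapped digits in the original output belong to 52,428,800 and 41,862,737.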

So I decided to extend it by another 100GB for a total of 300GB:

tsm: TSM1>extend db 100000
ANR2248I Database assigned capacity has been extended.

The log indicated success but immediately started complaining:

ANR2248I Database assigned capacity has been extended.
ANR0207E Page address mismatch detected on database volume /dev/rdsk/c0t7d0s4, logical page 0 (physical page 256); actual: 67108864.
ANR9999D_2633473939 LvmNormalRead(lvmread.c:925) Thread<1404690>: Contents of page buffer:

ANR0207E Page address mismatch detected on database volume /dev/rdsk/c1t7d0s4, logical page 0 (physical page 256); actual: 67108864.
ANR9999D_2633473939 LvmNormalRead(lvmread.c:925) Thread<1404690>: Contents of page buffer:
ANR0248E Unable to read database page 0 from any alternate copy.
ANR9999D_1794209702 icStartBackup(ic.c:319) Thread<1404690>: Error reading space map page (smpNum=0, smpAddr=0) from disk.
ANR4581W Database backup/restore terminated - internal server error detected.
ANR4560I Triggered database backup will be retried in 60 seconds.

This error loops every 60 seconds.  I then found
http://www-01.ibm.com/support/docview.wss?uid=swg1IC61252 which
indicates this error (DB size over 262,140 MB) was fixed in 5.5.4 (I'm
running 5.5.4.2).
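
The arithmetic behind that suspicion can be laid out as below. This is a sketch only: it takes the 262,140 MB figure at face value as a threshold on total assigned capacity, which may not be what the APAR actually describes, and the variable names are illustrative.

```python
# Figures taken from the post; LIMIT_MB is the size quoted for APAR IC61252.
LIMIT_MB = 262_140
assigned_before_mb = 204_800   # Assigned Capacity from q db
extension_mb = 100_000         # argument given to "extend db"

assigned_after_mb = assigned_before_mb + extension_mb

# True only if the extension moved the database from below the quoted
# figure to above it.
crossed_limit = assigned_before_mb <= LIMIT_MB < assigned_after_mb

# Largest extension that would have stayed at or under the quoted figure.
largest_extension_under_limit_mb = LIMIT_MB - assigned_before_mb

print(assigned_after_mb)                 # 304800
print(crossed_limit)                     # True
print(largest_extension_under_limit_mb)  # 57340
```

The 304,800 MB total matches the assigned capacity the server reports in its ANR0201I message at restart.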

I found IBM note
http://www-01.ibm.com/support/docview.wss?uid=swg21386330 which said
to do a SET MIRRORREAD VERIFY in dsmserv.opt and restart TSM.
However, that does not work:

Tivoli Storage Manager for Solaris x86_64
Version 5, Release 5, Level 4.2

Licensed Materials - Property of IBM

(C) Copyright IBM Corporation 1990, 2009.
All rights reserved.
U.S. Government Users Restricted Rights - Use, duplication or disclosure
restricted by GSA ADP Schedule Contract with IBM Corporation.

ANR7800I DSMSERV generated at 12:43:45 on Mar 15 2010.
ANR7801I Subsystem process ID is 6464.
ANR4726I The ICC support module has been loaded.
ANR0990I Server restart-recovery in progress.
ANR0200I Recovery log assigned capacity is 12296 megabytes.
ANR0201I Database assigned capacity is 304800 megabytes.
ANR0306I Recovery log volume mount in progress.
ANR0207E Page address mismatch detected on database volume /dev/rdsk/c0t7d0s4, logical page 0 (physical page 256); actual: 67108864.
ANR0207E Page address mismatch detected on database volume /dev/rdsk/c1t7d0s4, logical page 0 (physical page 256); actual: 67108864.
ANR0248E Unable to read database page 0 from any alternate copy.
ANR9999D_3229212849 DbAllocInit(dballoc.c:1353) Thread<1>: Error reading space map page from disk.
ANR0361E Database initialization failed: error initializing database page allocator.
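
For comparing the two ANR0207E messages side by side, a small parser like the one below may help. It is a sketch, not an IBM tool: the regular expression is written against the message text quoted in this post only, the 4,096-byte page size is taken from the q db output, and treating the physical page number as a plain page index from the start of the volume is an assumption.

```python
import re

PAGE_SIZE_BYTES = 4096  # page size reported by q db in the post

# Startup messages as they appear in the post, including the mail line wraps.
SAMPLE_LOG = """\
ANR0207E Page address mismatch detected on database volume /dev/rdsk/c0t7d0s4,
logical page 0
(physical page 256); actual: 67108864.
ANR0207E Page address mismatch detected on database volume /dev/rdsk/c1t7d0s4,
logical page 0
(physical page 256); actual: 67108864.
ANR0248E Unable to read database page 0 from any alternate copy.
"""

MISMATCH = re.compile(
    r"ANR0207E Page address mismatch detected on database volume (\S+), "
    r"logical page (\d+) \(physical page (\d+)\); actual: (\d+)\."
)


def parse_mismatches(log_text):
    """Return one dict per ANR0207E message found in log_text."""
    # Collapse line wraps so each message can be matched as a single string.
    flat = " ".join(log_text.split())
    return [
        {
            "volume": m.group(1),
            "logical_page": int(m.group(2)),
            "physical_page": int(m.group(3)),
            "actual": int(m.group(4)),
        }
        for m in MISMATCH.finditer(flat)
    ]


mismatches = parse_mismatches(SAMPLE_LOG)
for entry in mismatches:
    # Assumed byte offset of the page, counting pages from the volume start.
    offset = entry["physical_page"] * PAGE_SIZE_BYTES
    print(entry["volume"], offset, hex(entry["actual"]))
```

On the sample above this reports the same physical page and the same unexpected value (67108864, or 0x4000000) for both volumes, so whatever is stored at that location appears identical on each copy.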

Any ideas on how to non-destructively recover from this?  If I could
somehow shrink the DB size to 250GB, it would apparently be happy
again?  All help appreciated.  Thank you.

Lance Nakata
SLAC National Accelerator Laboratory