ANR7807W; ANR0259E - TSM server does not start, but dbcopies are available

cabec

Hello @all.



I'm a bit frustrated, because I don't understand the following behaviour and Google didn't turn up a solution (the threads I found were not helpful).



After a disk crash, TSM doesn't come up any more. It complains:



ANR0900I Processing options file dsmserv.opt.
ANR7811I Direct I/O will be used for all eligible disk files.
ANR0990I Server restart-recovery in progress.
ANR7807W Unable to get information for file /home/tsm/db_rootvg/db_rootvg.06.
A file or directory in the path name does not exist.
ANR7807W Unable to get information for file /home/tsm/db_rootvg/db_rootvg.12.
A file or directory in the path name does not exist.
ANR7807W Unable to get information for file /home/tsm/db_rootvg/db_rootvg.07.
A file or directory in the path name does not exist.
ANR7807W Unable to get information for file /home/tsm/db_rootvg/db_rootvg.08.
A file or directory in the path name does not exist.
ANR7807W Unable to get information for file /home/tsm/db_rootvg/db_rootvg.09.
A file or directory in the path name does not exist.
ANR7807W Unable to get information for file /home/tsm/db_rootvg/db_rootvg.10.
A file or directory in the path name does not exist.
ANR7807W Unable to get information for file /home/tsm/db_rootvg/db_rootvg.11.
A file or directory in the path name does not exist.
ANR7807W Unable to get information for file /home/tsm/db_rootvg/db_rootvg.01.
A file or directory in the path name does not exist.
ANR7807W Unable to get information for file /home/tsm/db_rootvg/db_rootvg.02.
A file or directory in the path name does not exist.
ANR7807W Unable to get information for file /home/tsm/db_rootvg/db_rootvg.03.
A file or directory in the path name does not exist.
ANR7807W Unable to get information for file /home/tsm/db_rootvg/db_rootvg.04.
A file or directory in the path name does not exist.
ANR7807W Unable to get information for file /home/tsm/db_rootvg/db_rootvg.05.
A file or directory in the path name does not exist.
ANR7807W Unable to get information for file /home/tsm/db_rootvg/log_rootvg.01.
A file or directory in the path name does not exist.
ANR0259E Unable to read complete restart/checkpoint information from any database or recovery log volume.
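
With a startup log this long, a small script can pull out just the volume paths that the server could not find. The following is a minimal Python sketch, assuming the console output has been captured as text; the function name `missing_volumes` and the sample excerpt are made up for illustration and are not part of TSM.

```python
import re

# Matches the ANR7807W line as it appears in the startup output above.
# The trailing period that ends the message is not part of the path.
MISSING = re.compile(r"^ANR7807W Unable to get information for file (\S+?)\.?$")

def missing_volumes(log_text):
    """Return the sorted, de-duplicated paths named in ANR7807W messages."""
    paths = set()
    for line in log_text.splitlines():
        match = MISSING.match(line.strip())
        if match:
            paths.add(match.group(1))
    return sorted(paths)

# Short excerpt of the log above, used only as demo input.
sample = """\
ANR0990I Server restart-recovery in progress.
ANR7807W Unable to get information for file /home/tsm/db_rootvg/db_rootvg.06.
A file or directory in the path name does not exist.
ANR7807W Unable to get information for file /home/tsm/db_rootvg/log_rootvg.01.
A file or directory in the path name does not exist.
ANR0259E Unable to read complete restart/checkpoint information from any database or recovery log volume.
"""

for path in missing_volumes(sample):
    print(path)
```

The result is a plain list of paths, which makes it easy to compare against the volumes that are expected to exist on each disk or mount.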




The files mentioned are our primary database volumes. But we have 2 additional dbcopy volumes mounted via NFS. These are mounted r/w and recognized by TSM, because when these directories are not mounted, TSM complains about them as well.



I edited dsmserv.dsk and removed all /home/tsm/db_rootvg/db_rootvg* entries, but since TSM keeps track of the database volumes internally, these files are still marked as required once the existing database volumes are found during startup.
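
A read-only check of the volume list against the filesystem shows which entries are actually reachable, without touching the file itself. The sketch below is Python and rests on an assumption the thread does not confirm: that a dsmserv.dsk-style file holds one volume path per line, with blank lines and lines starting with `#` ignored. The helper names and the demo file names are hypothetical.

```python
import os
import tempfile

def read_volume_list(dsk_path):
    """Read volume paths from a dsmserv.dsk-style file.

    Assumed format (not confirmed by the thread): one path per line,
    blank lines and '#' comment lines are skipped.
    """
    with open(dsk_path) as fh:
        lines = [line.strip() for line in fh]
    return [line for line in lines if line and not line.startswith("#")]

def split_by_presence(paths):
    """Partition paths into (present, missing) with a plain existence check."""
    present = [p for p in paths if os.path.exists(p)]
    missing = [p for p in paths if not os.path.exists(p)]
    return present, missing

# Demo with throwaway files so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "dbcopy_nfs.01")   # exists
    open(good, "w").close()
    gone = os.path.join(tmp, "db_rootvg.01")    # never created
    dsk = os.path.join(tmp, "dsmserv.dsk")
    with open(dsk, "w") as fh:
        fh.write("# example volume list\n%s\n%s\n" % (good, gone))

    present, missing = split_by_presence(read_volume_list(dsk))
    print("present:", [os.path.basename(p) for p in present])
    print("missing:", [os.path.basename(p) for p in missing])
```

This only reports what the operating system can see. It says nothing about which volumes the server itself still expects, since that information is kept inside the database volumes.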



What I don't understand is: why do I create these dbcopies if TSM won't start as soon as one copy is gone? Then I don't need them at all. Where is the flaw in my thinking?



In any case, everything else is available: volume history, device configuration, and incremental and full database backups. Is the only solution to restore the database after creating new, empty database volumes via dsmfmt?



What will happen if I recreate the missing files with the same size and name via dsmfmt and then just try to restart the TSM server? Will the new ones be marked as stale or offline? I don't want to experiment too much before gathering all the information I can get.



Thx for any help,

Carsten



PS: Please don't mind my mistakes/bad English. Hopefully things came across clearly enough to be understood.
 
Try the "MIRRORREAD LOG VERIFY" option.



If that doesn't work, check whether you still get the same error message!



And one more thing: never edit the dsmserv.dsk file.



And if you run dsmfmt you will lose all the data, including your mirror copies, and you will have to do a restore db.
 
Hello @all.



1st: TSM is back up and running. :grin:



But: the MIRRORREAD options didn't help. It was not possible to bring TSM back up with only the second and third database volume copies. :sad: TSM insists on the primary ones.



So what did I do? The original problem was that an hdisk in the rootvg (AIX 5.1) failed. (FYI: we had 2 hard disks as system disks in the rootvg.) But since I thought TSM could do the mirroring instead of AIX, I had created the LV for the primary database volumes on hdisk0 only, without mirroring.



While breaking up the mirror of the rootvg we recognized this issue and copied the logical volume into a new one via smitty. But this new volume was not accessible because the filesystem type was not set correctly. There was a mismatch between the ODM and the running system, and the filesystem was always displayed as "???" (yes, 3 question marks). That was the situation when I tried to fix this and started this thread.



Since we also couldn't get rid of this broken LV, we contacted IBM and opened a call. They told us to run "synclvodm -v rootvg" to fix the difference between the ODM and the system. After that, the filesystem flag was set correctly and I had access to the primary database volumes again. Yes, I know, I'm a lucky guy. :grin:



These files were copied via find/cpio to a new LV with (!) mirroring, and TSM restarted successfully. So what I learned is that the dbcopies may be useful while TSM is running, but they do not help TSM to start when the primary database volumes are not available.



I'm now changing our system a little. The primary database volumes are stored on the mirrored LV (mirroring is done by AIX). The secondary database volumes come in via NFS and will be used for safety at runtime. But the third dbcopies will be thrown away. That space, and some more, will be used for either full DB backups or DB snapshots. I'm not sure yet which I will use.



Thx again for all help.
 
Yep, lucky guy.



I think it failed when resynchronizing from your copy because your primary DB volume was not accessible, as described in your post.



Here we just tried the "MIRRORREAD DB VERIFY" option to resynchronize a corrupt primary DB, and it worked great.
 
hi all!



I have another story about TSM database :)

Three days ago one of our TSM servers went down. It starts, but then hits an exception, does something like a 'core dump' :) and exits. OS: Windows 2003 SP1, TSM: 5.3. The exception occurred when TSM tried to read the recovery log.

In the Windows event log I found many interesting records. Here is what I found:



At about 2 AM some ugly software (I think it was MS Windows ;) ) ate all the nonpaged kernel memory (error 2019 from srv). After that, TSM and the OS lost connectivity to the outside world, BUT TSM WAS ALIVE!! Clients from outside got TCP/IP connection failures, the OS lost the MS domain, the MOM agent lost the server, and so on...



At 7:00 AM TSM successfully made a FULL database backup (with NO connectivity and many errors in the system and application logs). About 7 hours later TSM and the OS LOST the disks from the SAN array that held the TSM database\log and the copies of the DB and log. After that TSM lost the DB\log and went down: it couldn't write to the database or the copies, so it shut down...



At about 9 PM the SAN disks and the network came back to the OS. TSM was down, and there were many, many errors in the Windows event log about memory etc.



As I said above, TSM goes down at startup, although all database volumes and recovery log volumes are available (as of the day after the crash). No errors on the disks (chkdsk).





TSM can't unload the DB (dsmserv.exe unloaddb bla-bla).



Recovery procedures:



dsmserv.exe dumpdb (completed without errors, which means the DB is OK), then dsmserv.exe loadformat, dsmserv.exe loaddb, dsmserv.exe auditdb



After that TSM started up successfully and got back to work :)



Questions:



1) WHY CAN'T TSM start up with a valid database (the RL was in normal mode) and DB\log copies? dumpdb showed no errors, so the DB was OK. Am I right?



2) Are there alternative steps for my situation? Was there maybe another way to restore TSM?



3) What would my steps be to recover the backed-up data if there had been successful backup sessions before the disks and LAN were lost, but no DB backup?
 
Hi again.

@may: I left the "mirrorread * verify" options in dsmserv.opt in order to increase data protection. Maybe it will help sometime in the future.

@blackminder: hmmm... dsmserv dumpdb is used when "a catastrophic error occurs (recovery log corruption, for example)" (from the admin guide). So maybe the DB is OK, but the recovery log wasn't. Did you try a "normal" restore db command?
 
The DB was OK, because DUMPDB returned NO errors. The server goes to "core" at startup.
I think I would get the same result with restore db.
But I don't understand why TSM fails. Some of the log and DB volumes were mirrored. At startup TSM displays messages about one(!) offline log volume and one(!) DB volume. Those volumes were mirrored.

This is from the startup screen. I have a screenshot and can post it here if you need it.

[Some standard TSM messages here.]

ANR0353I Recovery log analysis pass in progress.
Entering exception handler

That's all. I thought the recovery log was corrupted and did a dumpdb. The dump and load were OK; the audit found some errors, and I ran auditdb. After the audit everything was OK:

ANR4140I AUDITDB: Database audit process started.
ANR4075I AUDITDB: Auditing policy definitions.
ANR4040I AUDITDB: Auditing client node and administrator definitions.
ANR4071I AUDITDB: Invalid sign-on attempts is not valid for ADMIN_CENTER.
ANR4135I AUDITDB: Auditing central scheduler definitions.
ANR3470I AUDITDB: Auditing enterprise configuration definitions.
ANR2833I AUDITDB: Auditing license definitions.
ANR4136I AUDITDB: Auditing server inventory.
ANR4138I AUDITDB: Auditing inventory backup objects.
ANR4137I AUDITDB: Auditing inventory file spaces.
ANR4307I AUDITDB: Auditing inventory external space-managed objects.
ANR4139I AUDITDB: Auditing inventory archive objects.
ANR2761I AUDITDB: auditing inventory virtual file space mappings.
ANR4310I AUDITDB: Auditing inventory space-managed objects.
ANR4134I AUDITDB: Processed 402412 entries in database tables and 0 blocks in
bit vectors. Elapsed time is 0:05:00.
ANR4134I AUDITDB: Processed 793549 entries in database tables and 0 blocks in
bit vectors. Elapsed time is 0:10:00.
ANR4134I AUDITDB: Processed 1568724 entries in database tables and 0 blocks in
bit vectors. Elapsed time is 0:15:00.
ANR4134I AUDITDB: Processed 2492710 entries in database tables and 0 blocks in
bit vectors. Elapsed time is 0:20:00.
ANR4207I AUDITDB: Object entry for expiring object 0.1729972 not found.
ANR4207I AUDITDB: Object entry for expiring object 0.1885543 not found.
ANR4207I AUDITDB: Object entry for expiring object 0.1913260 not found.
ANR4207I AUDITDB: Object entry for expiring object 0.2066050 not found.
ANR4207I AUDITDB: Object entry for expiring object 0.2066291 not found.
ANR4230I AUDITDB: Auditing data storage definitions.
ANR4264I AUDITDB: Auditing file information.
ANR4228E AUDITDB: Missing or incorrect information detected for file aggregate
(0.1980976).
ANR4134I AUDITDB: Processed 4164889 entries in database tables and 0 blocks in
bit vectors. Elapsed time is 0:25:00.
ANR4228E AUDITDB: Missing or incorrect information detected for file aggregate
(0.1885523).
ANR4269E AUDITDB: Extraneous file data reference found.
ANR4228E AUDITDB: Missing or incorrect information detected for file aggregate
(0.1913025).
ANR4228E AUDITDB: Missing or incorrect information detected for file aggregate
(0.2070675).
ANR4228E AUDITDB: Missing or incorrect information detected for file aggregate
(0.2119588).
ANR4265I AUDITDB: Auditing disk file information.
ANR4266I AUDITDB: Auditing sequential file information.
ANR4134I AUDITDB: Processed 4842042 entries in database tables and 0 blocks in
bit vectors. Elapsed time is 0:30:00.
ANR4271E AUDITDB: Missing or incorrect occupancy information detected.
ANR4256I AUDITDB: Auditing data storage definitions for disk volumes.
ANR4134I AUDITDB: Processed 4961170 entries in database tables and 7447328
blocks in bit vectors. Elapsed time is 0:35:00.
ANR4263I AUDITDB: Auditing data storage definitions for sequential volumes.
ANR6646I AUDITDB: Auditing disaster recovery manager definitions.
ANR4210I AUDITDB: Auditing physical volume repository definitions.
ANR4446I AUDITDB: Auditing address definitions.
ANR4141I AUDITDB: Database audit process completed.
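
An audit log of this size is easier to judge once the messages are tallied by severity. Here is a minimal Python sketch, assuming the usual ANR message layout in which the letter after the four-digit number gives the severity (I for information, W for warning, E for error). The function name `summarize` and the shortened sample are illustrative only.

```python
import re
from collections import Counter

# An ANR message ID: 'ANR', four digits, then one severity letter.
MSG = re.compile(r"^(ANR\d{4})([A-Z])\b")

def summarize(log_text):
    """Count messages per severity letter and per error message ID."""
    by_severity = Counter()
    error_ids = Counter()
    for line in log_text.splitlines():
        match = MSG.match(line)
        if not match:
            continue  # wrapped continuation line such as "(0.1980976)."
        number, letter = match.groups()
        by_severity[letter] += 1
        if letter == "E":
            error_ids[number + letter] += 1
    return by_severity, error_ids

# Shortened excerpt of the audit output above, used only as demo input.
sample = """\
ANR4140I AUDITDB: Database audit process started.
ANR4228E AUDITDB: Missing or incorrect information detected for file aggregate
(0.1980976).
ANR4269E AUDITDB: Extraneous file data reference found.
ANR4228E AUDITDB: Missing or incorrect information detected for file aggregate
(0.1885523).
ANR4271E AUDITDB: Missing or incorrect occupancy information detected.
ANR4141I AUDITDB: Database audit process completed.
"""

by_severity, error_ids = summarize(sample)
print(dict(by_severity))
print(dict(error_ids))
```

Grouping by message ID makes it clear whether the audit hit one kind of inconsistency repeatedly or several different ones.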
 
Just a word to cabec:
the "mirrorread * verify" option is automatically removed at each TSM start...
The right thing to do is to use DB and log mirrors and store them on a different disk than the original.
 