TSM 5.3.3 dsmserv segmentation fault AIX

aspankj

Active Newcomer
Joined
Jan 19, 2015
PREDATAR Control23

Hello experts,
We have had several power losses on the TSM server, and I am not able to run a backup right now.
Starting TSM (dsmserv) works fine, but when I start a backup (dsmc inc) I get a segmentation fault and the backup does not run.

I use TSM 5.3.3 on AIX, and I run it under the root instance.

Log from dsmserv:
Code:
ANR9999D afutil.c(1817): ThreadId<46> Row not found for bitfile aggregate 0.26770690, segnum=0.
ANR9999D ThreadId<46> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x00000001002e61bc AfUpdateLogOcc <-0x000000010025d580
BfUpdateAggrAttributes <-0x00000001003377cc bfDestroy <-0x000000010017cfc4 ImDeleteBitfile <-0x0000000100185b58 imDeleteObject <-0x0000000
100677ca4 DeleteFilesThread <-0x000000010000e9dc StartThread <-0x090000000042b448 _pthread_body
TSM:TSM_SERVER_COV1>
ANR9999D afutil.c(1817): ThreadId<46> Row not found for bitfile aggregate 0.26770690, segnum=0.
ANR9999D ThreadId<46> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x00000001002e61bc AfUpdateLogOcc <-0x000000010025d580
BfUpdateAggrAttributes <-0x00000001003285b4 bfPrepareTxn <-0x0000000100068c54 CollectVotes <-0x00000001000699dc tmEnd <-0x0000000100678458
DeleteFilesThread <-0x000000010000e9dc StartThread <-0x090000000042b448 _pthread_body
ANR0107W imexp.c(6574): Transaction 0:29940 was not committed due to an internal error.
ANR9999D afutil.c(1817): ThreadId<46> Row not found for bitfile aggregate 0.26770856, segnum=0.
ANR9999D ThreadId<46> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x00000001002e61bc AfUpdateLogOcc <-0x000000010025d580
BfUpdateAggrAttributes <-0x00000001003285b4 bfPrepareTxn <-0x0000000100068c54 CollectVotes <-0x00000001000699dc tmEnd <-0x0000000100678458
DeleteFilesThread <-0x000000010000e9dc StartThread <-0x090000000042b448 _pthread_body
ANR0107W imexp.c(6574): Transaction 0:29942 was not committed due to an internal error.
ANR9999D afutil.c(1817): ThreadId<46> Row not found for bitfile aggregate 0.26770856, segnum=0.
ANR9999D ThreadId<46> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x00000001002e61bc AfUpdateLogOcc <-0x000000010025d580
BfUpdateAggrAttributes <-0x00000001003285b4 bfPrepareTxn <-0x0000000100068c54 CollectVotes <-0x00000001000699dc tmEnd <-0x0000000100678458
DeleteFilesThread <-0x000000010000e9dc StartThread <-0x090000000042b448 _pthread_body
ANR0107W imexp.c(6574): Transaction 0:29943 was not committed due to an internal error.
ANR9999D afutil.c(1817): ThreadId<46> Row not found for bitfile aggregate 0.26770856, segnum=0.
ANR9999D ThreadId<46> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x00000001002e61bc AfUpdateLogOcc <-0x000000010025d580
BfUpdateAggrAttributes <-0x00000001003285b4 bfPrepareTxn <-0x0000000100068c54 CollectVotes <-0x00000001000699dc tmEnd <-0x0000000100678458
DeleteFilesThread <-0x000000010000e9dc StartThread <-0x090000000042b448 _pthread_body
ANR0107W imexp.c(6574): Transaction 0:29945 was not committed due to an internal error.
ANR9999D afutil.c(1817): ThreadId<46> Row not found for bitfile aggregate 0.26770856, segnum=0.
ANR9999D ThreadId<46> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x00000001002e61bc AfUpdateLogOcc <-0x000000010025d580
BfUpdateAggrAttributes <-0x00000001003285b4 bfPrepareTxn <-0x0000000100068c54 CollectVotes <-0x00000001000699dc tmEnd <-0x0000000100678458
DeleteFilesThread <-0x000000010000e9dc StartThread <-0x090000000042b448 _pthread_body
ANR0107W imexp.c(6574): Transaction 0:29947 was not committed due to an internal error.
ANR9999D afutil.c(1817): ThreadId<46> Row not found for bitfile aggregate 0.26770856, segnum=0.
ANR9999D ThreadId<46> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x00000001002e61bc AfUpdateLogOcc <-0x000000010025d580
BfUpdateAggrAttributes <-0x00000001003285b4 bfPrepareTxn <-0x0000000100068c54 CollectVotes <-0x00000001000699dc tmEnd <-0x0000000100678458
DeleteFilesThread <-0x000000010000e9dc StartThread <-0x090000000042b448 _pthread_body
ANR0107W imexp.c(6574): Transaction 0:29948 was not committed due to an internal error.
ANR9999D afutil.c(1817): ThreadId<46> Row not found for bitfile aggregate 0.26770856, segnum=0.
ANR9999D ThreadId<46> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x00000001002e61bc AfUpdateLogOcc <-0x000000010025d580
BfUpdateAggrAttributes <-0x00000001003285b4 bfPrepareTxn <-0x0000000100068c54 CollectVotes <-0x00000001000699dc tmEnd <-0x0000000100678458
DeleteFilesThread <-0x000000010000e9dc StartThread <-0x090000000042b448 _pthread_body
ANR2828I Server is licensed to support Tivoli Storage Manager Basic Edition.
ANR0107W imexp.c(6574): Transaction 0:29949 was not committed due to an internal error.
ANR9999D afutil.c(1817): ThreadId<46> Row not found for bitfile aggregate 0.26770856, segnum=0.
ANR9999D ThreadId<46> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x00000001002e61bc AfUpdateLogOcc <-0x000000010025d580
BfUpdateAggrAttributes <-0x00000001003285b4 bfPrepareTxn <-0x0000000100068c54 CollectVotes <-0x00000001000699dc tmEnd <-0x0000000100678458
DeleteFilesThread <-0x000000010000e9dc StartThread <-0x090000000042b448 _pthread_body
ANR0107W imexp.c(6574): Transaction 0:29951 was not committed due to an internal error.
ANR9999D afutil.c(1817): ThreadId<46> Row not found for bitfile aggregate 0.26770856, segnum=0.
ANR9999D ThreadId<46> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x00000001002e61bc AfUpdateLogOcc <-0x000000010025d580
BfUpdateAggrAttributes <-0x00000001003285b4 bfPrepareTxn <-0x0000000100068c54 CollectVotes <-0x00000001000699dc tmEnd <-0x0000000100678458
DeleteFilesThread <-0x000000010000e9dc StartThread <-0x090000000042b448 _pthread_body
ANR0107W imexp.c(6574): Transaction 0:29952 was not committed due to an internal error.
ANR9999D afutil.c(1817): ThreadId<46> Row not found for bitfile aggregate 0.26770856, segnum=0.
ANR9999D ThreadId<46> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x00000001002e61bc AfUpdateLogOcc <-0x000000010025d580
BfUpdateAggrAttributes <-0x00000001003285b4 bfPrepareTxn <-0x0000000100068c54 CollectVotes <-0x00000001000699dc tmEnd <-0x0000000100678458
DeleteFilesThread <-0x000000010000e9dc StartThread <-0x090000000042b448 _pthread_body
ANR0107W imexp.c(6574): Transaction 0:29954 was not committed due to an internal error.
ANR9999D afutil.c(1817): ThreadId<46> Row not found for bitfile aggregate 0.26770856, segnum=0.
ANR9999D ThreadId<46> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x00000001002e61bc AfUpdateLogOcc <-0x000000010025d580
BfUpdateAggrAttributes <-0x00000001003285b4 bfPrepareTxn <-0x0000000100068c54 CollectVotes <-0x00000001000699dc tmEnd <-0x0000000100678458
DeleteFilesThread <-0x000000010000e9dc StartThread <-0x090000000042b448 _pthread_body
ANR0107W imexp.c(6574): Transaction 0:29956 was not committed due to an internal error.
ANR9999D afutil.c(1817): ThreadId<46> Row not found for bitfile aggregate 0.26770856, segnum=0.
ANR9999D ThreadId<46> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x00000001002e61bc AfUpdateLogOcc <-0x000000010025d580
BfUpdateAggrAttributes <-0x00000001003285b4 bfPrepareTxn <-0x0000000100068c54 CollectVotes <-0x00000001000699dc tmEnd <-0x0000000100678458
DeleteFilesThread <-0x000000010000e9dc StartThread <-0x090000000042b448 _pthread_body
ANR0107W imexp.c(6574): Transaction 0:29958 was not committed due to an internal error.
ANR9999D afutil.c(1817): ThreadId<46> Row not found for bitfile aggregate 0.26770856, segnum=0.
ANR9999D ThreadId<46> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x00000001002e61bc AfUpdateLogOcc <-0x000000010025d580
BfUpdateAggrAttributes <-0x00000001003285b4 bfPrepareTxn <-0x0000000100068c54 CollectVotes <-0x00000001000699dc tmEnd <-0x0000000100678458
DeleteFilesThread <-0x000000010000e9dc StartThread <-0x090000000042b448 _pthread_body
ANR0107W imexp.c(6574): Transaction 0:29959 was not committed due to an internal error.
ANR9999D afutil.c(1817): ThreadId<46> Row not found for bitfile aggregate 0.26770856, segnum=0.
ANR9999D ThreadId<46> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x00000001002e61bc AfUpdateLogOcc <-0x000000010025d580
BfUpdateAggrAttributes <-0x00000001003285b4 bfPrepareTxn <-0x0000000100068c54 CollectVotes <-0x00000001000699dc tmEnd <-0x0000000100678458
DeleteFilesThread <-0x000000010000e9dc StartThread <-0x090000000042b448 _pthread_body
ANR0107W imexp.c(6574): Transaction 0:29961 was not committed due to an internal error.
ANR0407I Session 1 started for administrator AADAMUR (WinNT) (Tcp/Ip a725m009.ag.eu.jci.com(53013)).
ANR2017I Administrator AADAMUR issued command: select authentication from status
ANR0405I Session 1 ended for administrator AADAMUR (WinNT).
ANR0407I Session 2 started for administrator AADAMUR (WinNT) (Tcp/Ip a725m009.ag.eu.jci.com(53014)).
ANR2017I Administrator AADAMUR issued command: select start_time from sessions where start_time>(current_timestamp-1 minute)
ANR0405I Session 2 ended for administrator AADAMUR (WinNT).
ANR0407I Session 3 started for administrator AADAMUR (WinNT) (Tcp/Ip a725m009.ag.eu.jci.com(53016)).
ANR2103I Activity log pruning completed: 6454 records removed.
ANR0874E Backup object 0.26073224 not found during inventory processing.
ANR9999D imexp.c(4985): ThreadId<49> DetermineBackupRetention for 0:26073224 failed, rc=8
ANR9999D ThreadId<49> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x0000000100679e34 ExpirationQualifies <-0x000000010067be
20 ExpirationProcess <-0x000000010067f0e4 ImDoExpiration <-0x000000010067f4fc ImExpirationThread <-0x000000010000e9dc StartThread
<-0x090000000042b448 _pthread_body
ANR4391I Expiration processing node A6790X001, filespace /EMC/fs_johnson, fsId 13, domain STANDARD, and management class DEFAULT - for
BACKUP type files.
ANR4391I Expiration processing node A6790X001, filespace /EMC/fs_ldrover, fsId 14, domain STANDARD, and management class DEFAULT - for
BACKUP type files.
ANR0874E Backup object 0.26359542 not found during inventory processing.
ANR9999D imexp.c(4985): ThreadId<49> DetermineBackupRetention for 0:26359542 failed, rc=8
ANR9999D ThreadId<49> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x0000000100679e34 ExpirationQualifies <-0x000000010067be
20 ExpirationProcess <-0x000000010067f0e4 ImDoExpiration <-0x000000010067f4fc ImExpirationThread <-0x000000010000e9dc StartThread
<-0x090000000042b448 _pthread_body
ANR4391I Expiration processing node A6790X001, filespace /EMC/fs_system, fsId 16, domain STANDARD, and management class DEFAULT - for
BACKUP type files.
ANR9999D imexp.c(7655): ThreadId<49> No inactive versions found for 0:26878623
ANR9999D ThreadId<49> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x0000000100676170 DetermineBackupRetention <-0x000000010
0679e0c ExpirationQualifies <-0x000000010067be20 ExpirationProcess <-0x000000010067f0e4 ImDoExpiration <-0x000000010067f4fc ImExpirationTh
read <-0x000000010000e9dc StartThread <-0x090000000042b448 _pthread_body
ANR9999D imexp.c(4985): ThreadId<49> DetermineBackupRetention for 0:26878623 failed, rc=19
ANR9999D ThreadId<49> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x0000000100679e34 ExpirationQualifies <-0x000000010067be
20 ExpirationProcess <-0x000000010067f0e4 ImDoExpiration <-0x000000010067f4fc ImExpirationThread <-0x000000010000e9dc StartThread
<-0x090000000042b448 _pthread_body
ANR9999D imexp.c(7655): ThreadId<49> No inactive versions found for 0:25871047
ANR9999D ThreadId<49> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x0000000100676170 DetermineBackupRetention <-0x000000010
0679e0c ExpirationQualifies <-0x000000010067be20 ExpirationProcess <-0x000000010067f0e4 ImDoExpiration <-0x000000010067f4fc ImExpirationTh
read <-0x000000010000e9dc StartThread <-0x090000000042b448 _pthread_body
ANR9999D imexp.c(4985): ThreadId<49> DetermineBackupRetention for 0:25871047 failed, rc=19
ANR9999D ThreadId<49> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x0000000100679e34 ExpirationQualifies <-0x000000010067be
20 ExpirationProcess <-0x000000010067f0e4 ImDoExpiration <-0x000000010067f4fc ImExpirationThread <-0x000000010000e9dc StartThread
<-0x090000000042b448 _pthread_body
ANR9999D tb.c(3354): ThreadId<49> >>ERROR Database Page Format: Invalid sibling for page 103234, left sibling = 0.
ANR9999D ThreadId<49> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x00000001000b64b8 tbFetchNext <-0x000000010067c290
ExpirationProcess <-0x000000010067f0e4 ImDoExpiration <-0x000000010067f4fc ImExpirationThread <-0x000000010000e9dc StartThread <-0x0900000
00042b448 _pthread_body
ANR9999D imutil.c(6884): ThreadId<46> Bitfile id 0.26304664 not found.
ANR9999D ThreadId<46> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x000000010017d024 ImDeleteBitfile <-0x0000000100185b58
imDeleteObject <-0x0000000100677ca4 DeleteFilesThread <-0x000000010000e9dc StartThread <-0x090000000042b448 _pthread_body
ANR9999D imutil.c(6884): ThreadId<46> Bitfile id 0.26304664 not found.
ANR9999D ThreadId<46> issued message 9999 from:  <-0x000000010001c298 outDiagf <-0x000000010017d024 ImDeleteBitfile <-0x0000000100185b58
imDeleteObject <-0x0000000100678220 DeleteFilesThread <-0x000000010000e9dc StartThread <-0x090000000042b448 _pthread_body

ANR0406I Session 27 started for node A6790X001 (AIX) (Tcp/Ip a6790x001.ag.eu.jci.com(40201)).
Segmentation fault(coredump)
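
With this much repeated output, it can help to count which message IDs occur before reading the stack traces. A minimal Python sketch, assuming the console output has been saved as text; the function name `count_message_ids` is made up for illustration, and the sample is just a few lines copied from the log above:

```python
import re
from collections import Counter

# Message IDs such as ANR9999D or ANR0107W at the start of a line
# (ANR = server, ANS = client, ANE = client events logged on the server).
MSG_ID = re.compile(r"^(AN[RSE]\d{4}[A-Z])\b")

def count_message_ids(log_text: str) -> Counter:
    """Count how often each message ID starts a line in the given log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = MSG_ID.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts

# A few lines copied from the dsmserv output above.
sample = """\
ANR9999D afutil.c(1817): ThreadId<46> Row not found for bitfile aggregate 0.26770856, segnum=0.
ANR0107W imexp.c(6574): Transaction 0:29942 was not committed due to an internal error.
ANR0874E Backup object 0.26073224 not found during inventory processing.
ANR9999D tb.c(3354): ThreadId<49> >>ERROR Database Page Format: Invalid sibling for page 103234, left sibling = 0.
"""

for msg_id, n in count_message_ids(sample).most_common():
    print(msg_id, n)
```

Wrapped continuation lines (the stack traces) do not start with a message ID, so they are simply skipped.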


Thanks for your response.
Jaroslav
 

Hello,

This looks like DB corruption. Are you sure all the filesystems holding the TSM DB are OK? It is a bit odd that the TSM server started at all: when the DB is corrupt, it usually refuses to start. But if the server crashes later (core dumps), corruption is very likely.
Do you have a recent TSM DB backup?

Harry
 

Harry, thanks for the quick response.
Yes, I restored the DB (from 5 days ago), but the problem still persists.
The backup starts correctly:
ANS1898I ***** Processed 1,524,000 files *****
After 5 minutes:
Retry # 1 Normal File--> 0 /EMC/fs_system/.etc/rpt_file.sids [Sent]
ANS1809W A session with the TSM server has been disconnected.
Segmentation fault(coredump)

Thanks
 

That's bad. What exactly fails: the TSM server (as the first output suggests), the TSM client (as the second output suggests), or both? Is it possible that the AIX OS itself has a problem?
Is the client on the same machine as the server?
Is there anything in the OS logs? It may be normal, but is the file /EMC/fs_system/.etc/rpt_file.sids really 0 bytes? (Are the filesystems OK?)

Harry
 

Three days ago we had a major power loss, and since then I have not been able to run a backup.
It is the TSM server that fails. I have checked everything on the AIX OS side and it seems to be OK.
The TSM server is directly connected to an IBM tape library.
Current situation:
Current dsm.opt file (I removed domain ALL-LOCAL, and dsmserv is running):
Servername tsm_cov
SUBDir yes
TAPEPROMPT yes

domain /EMC/fs_one
domain /EMC/fs_two
domain /EMC/fs_three
domain /EMC/fs_four
domain /EMC/fs_five
domain /EMC/fs_six
domain /EMC/fs_seven
But now I get ANS1301E Server detected system error, without a segmentation fault.

q act search=err
01/19/2015 17:44:47 ANR0104E imbkins.c(5064): Error 2 deleting row from table "Expiring.Objects".(SESSION: 77)
01/19/2015 17:44:47 ANR0530W Transaction failed for session 77 for node A6790X001 (AIX) - internal server error detected.(SESSION: 77)
01/19/2015 17:44:47 ANE4961I (Session: 76, Node: A6790X001) Total number of bytes transferred: 20.37 MB(SESSION: 76)

dsmc inc log
Incremental backup of volume '/EMC/fs_one'

Incremental backup of volume '/EMC/fs_two'

Incremental backup of volume '/EMC/fs_three'

Incremental backup of volume '/EMC/fs_four'

Incremental backup of volume '/EMC/fs_five'

Incremental backup of volume '/EMC/fs_six'

Incremental backup of volume '/EMC/fs_seven'

ANS1898I ***** Processed 500 files *****
Successful incremental backup of '/EMC/fs_one'

ANS1898I ***** Processed 1,000 files *****
Successful incremental backup of '/EMC/fs_two'

ANS1898I ***** Processed 3,500 files *****
ANS1898I ***** Processed 5,500 files *****
Successful incremental backup of '/EMC/fs_three'

Successful incremental backup of '/EMC/fs_four'

ANS1898I ***** Processed 6,500 files *****
ANS1898I ***** Processed 9,500 files *****
ANS1898I ***** Processed 12,000 files *****
ANS1898I ***** Processed 14,500 files *****
ANS1898I ***** Processed 17,000 files *****

Successful incremental backup of '/EMC/fs_five'

ANS1898I ***** Processed 54,000 files *****
ANS1898I ***** Processed 55,500 files *****
ANS1898I ***** Processed 57,500 files *****

Successful incremental backup of '/EMC/fs_six'

ANS1898I ***** Processed 100,000 files *****
ANS1898I ***** Processed 102,500 files *****
ANS1898I ***** Processed 104,500 files *****
ANS1898I ***** Processed 105,000 files *****
Directory--> 2,048 /EMC/fs_seven/ [Sent]
Normal File--> 648 /EMC/fs_seven/ajonespa (a6790m001.ag.eu.jci.com) (U) - Shortcut.lnk [Sent]
Normal File--> 0 /EMC/fs_seven/.etc/rpt_file [Sent]
Normal File--> 0 /EMC/fs_seven/.etc/rpt_file.InProgress [Sent]
Normal File--> 0 /EMC/fs_seven/.etc/rpt_file.sids [Sent]
Normal File--> 50,688 /EMC/fs_seven/user/abenkama/Thumbs.db [Sent]
Normal File--> 2,253,707 /EMC/fs_seven/user/ahardwd/ddx_output/2581926_RNX1_V1_A1_S1_DP_Sheet_1.pdf [Sent]
Normal File--> 396,960 /EMC/fs_seven/user/ahardwd/ddx_output/2581926_RNX1_V1_A1_S1_DP_Sheet_22_(Detail).pdf [Sent]
Normal File--> 8,766,145 /EMC/fs_seven/user/ajonespa/RIVET_DETAILS.CATDrawing [Sent]
Directory--> 1,024 /EMC/fs_seven/user/ajonespa/ddx_input/Assembly Drawings 16 DEC 2014 [Sent]
Directory--> 1,024 /EMC/fs_seven/user/ajonespa/ddx_input/Foams 31 DEC 2014 [Sent]
Directory--> 2,048 /EMC/fs_seven/user/ajonespa/ddx_input/Part Drawings 06 JAN 2015 [Sent]
Normal File--> 9,865,808 /EMC/fs_seven/user/ajonespa/ddx_input/3133783-_A-JLR1495798-WELD_ASSY_FRONT_CROSSMEMBER___-PJ-16DEC2014.CATDrawing [Sent]
Normal File--> 26,624 /EMC/fs_seven/user/ajonespa/ddx_input/Thumbs.db [Sent]

Total number of objects inspected: 105,488
Total number of objects backed up: 0
Total number of objects updated: 0
Total number of objects rebound: 0
Total number of objects deleted: 0
Total number of objects expired: 0
Total number of objects failed: 0
Total number of bytes transferred: 20.37 MB
Data transfer time: 2.21 sec
Network data transfer rate: 9,398.95 KB/sec
Aggregate data transfer rate: 352.73 KB/sec
Objects compressed by: 0%
Elapsed processing time: 00:00:59
ANS1301E Server detected system error
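
The summary above is a bit contradictory: files are marked [Sent] and 20.37 MB were transferred, yet 0 objects were backed up. A small Python sketch of how such a summary could be checked automatically; `parse_summary` and `sent_but_not_committed` are hypothetical helper names, and reading this pattern as "data sent, transaction not committed" is an assumption based on the ANR0530W message above, not something the client states:

```python
def parse_summary(text: str) -> dict:
    """Collect the 'Total number of ...' lines of a dsmc summary as label -> value strings."""
    stats = {}
    for line in text.splitlines():
        label, sep, value = line.partition(":")
        if sep and label.startswith("Total number of"):
            stats[label.strip()] = value.strip()
    return stats

def sent_but_not_committed(stats: dict) -> bool:
    """True when bytes were transferred although no object was backed up."""
    backed_up = int(stats["Total number of objects backed up"].replace(",", ""))
    # The byte count looks like "20.37 MB"; only the numeric part matters here.
    amount = float(stats["Total number of bytes transferred"].split()[0].replace(",", ""))
    return backed_up == 0 and amount > 0

# Lines copied from the dsmc inc summary above.
sample = """\
Total number of objects inspected: 105,488
Total number of objects backed up: 0
Total number of objects failed: 0
Total number of bytes transferred: 20.37 MB
"""

print(sent_but_not_committed(parse_summary(sample)))
```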

q stg shows DISK1 is still at 0%:
DISK1 DISK 10G 0.0 0.0 70 0 ONLINE1
DISK2 DISK 10G 0.0 0.0 70 0 ONLINE2

Right now dsmserv is still running, but when I try to run dsmc inc I get ANS1301E Server detected system error.

And yes, /EMC/fs_system/.etc/rpt_file.sids is 0 bytes.
Thanks.
 

Hi,

This is how it looks to me: as long as nothing happens (no backup), the server runs. During incremental backups there seems to be no error while files are only "inspected" (we can see 1.5M files being processed). Once a file has to be transferred, or a file to be expired is found, there is an error. Right?
You restored the TSM database from 5 days ago. This leads me to conclude that the database and log files were initialized and written to (so they are not "read only").
You still get DB errors, so either the DB is corrupted and was already corrupted 5 days ago, in which case an audit DB may help (or may not, and IBM will not help you, as TSM 5.3.3 is long out of support), or there is a read/write/filesystem error when accessing the DB.
Can you run a selective backup of ONE file (big enough, a few MB)?
Can you create a file in the folder(s) where the DB and RLOG files reside?
What does the "q db" and "q log" output look like?
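
The "can you create a file" check can be scripted, for example like the Python sketch below. It is only a rough test under stated assumptions: `can_write` is a made-up helper, the directory you pass in is whatever holds your DB and recovery log volumes (the demo uses the system temp directory as a stand-in), and a successful read-back may come from the cache, so it does not prove the disk itself is healthy.

```python
import os
import tempfile

def can_write(directory: str, size_bytes: int = 1024 * 1024) -> bool:
    """Create a scratch file in `directory`, fsync it, read it back, then delete it."""
    payload = os.urandom(size_bytes)
    try:
        fd, path = tempfile.mkstemp(prefix="tsm_write_test_", dir=directory)
    except OSError:
        # Directory missing, read-only filesystem, no permission, no space, ...
        return False
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # push the data to the device, not just the cache
        with open(path, "rb") as handle:
            return handle.read() == payload
    except OSError:
        return False
    finally:
        os.remove(path)

# Stand-in directory; replace with the directories holding the DB and RLOG volumes.
print(can_write(tempfile.gettempdir()))
```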

Harry
 

Hi Harry
I tried to restore the DB, but I got a segmentation fault again.
I created a 1 GB file, and the restore was successful.
It seems to be a problem with the database; I'm a little bit confused.
The q db and q log output is in the attachments.
Thanks
 

Attachments

  • db_log.JPG (32.4 KB)
  • qlog.JPG (32.1 KB)

Sorry, I forgot q db f=d and q log f=d.
Thanks
 

Attachments

  • qlogf=l.JPG (45.4 KB)
  • qdbf=d.JPG (69.1 KB)

Hi, I'm still trying to run auditdb, but I still get a segmentation fault or a core dump.
dsmserv auditdb fix=yes detail=no
ANR7800I DSMSERV generated at 15:02:17 on Jul 19 2006.
Tivoli Storage Manager for AIX-RS/6000
Version 5, Release 3, Level 3.3
Licensed Materials - Property of IBM
(C) Copyright IBM Corporation 1990, 2006.
All rights reserved.
U.S. Government Users Restricted Rights - Use, duplication or disclosure
restricted by GSA ADP Schedule Contract with IBM Corporation.
ANR0900I Processing options file dsmserv.opt.
ANR4726I The ICC support module has been loaded.
ANR0990I Server restart-recovery in progress.
ANR0200I Recovery log assigned capacity is 308 megabytes.
ANR0201I Database assigned capacity is 14144 megabytes.
ANR0306I Recovery log volume mount in progress.
ANR0353I Recovery log analysis pass in progress.
ANR0354I Recovery log redo pass in progress.
ANR0355I Recovery log undo pass in progress.
ANR0352I Transaction recovery complete.
ANR4140I AUDITDB: Database audit process started.
ANR4075I AUDITDB: Auditing policy definitions.
ANR4040I AUDITDB: Auditing client node and administrator definitions.
ANR4135I AUDITDB: Auditing central scheduler definitions.
ANR3470I AUDITDB: Auditing enterprise configuration definitions.
ANR2833I AUDITDB: Auditing license definitions.
ANR4136I AUDITDB: Auditing server inventory.
ANR4138I AUDITDB: Auditing inventory backup objects.
ANR4139I AUDITDB: Auditing inventory archive objects.
ANR4307I AUDITDB: Auditing inventory external space-managed objects.
ANR4137I AUDITDB: Auditing inventory file spaces.
ANR4310I AUDITDB: Auditing inventory space-managed objects.
ANR2761I AUDITDB: auditing inventory virtual file space mappings.
ANR4134I AUDITDB: Processed 292929 entries in database tables and 0 blocks in
bit vectors. Elapsed time is 0:05:00.
ANR4185I AUDITDB: Object entry for backup object 0.26526656 not found - entry
will be created.
ANR4185I AUDITDB: Object entry for backup object 0.26073215 not found - entry
will be created.
ANR9999D pkthread.c(598): ThreadId<35> Run-time assertion failed: "bufP->xLatc
hed", Thread 35, File buf.c, Line 895.
ANR9999D ThreadId<35> issued message 9999 from: <-0x000000010001c298 outDiagf
<-0x0000000100010434 pkLogicAbort <-0x00000001000c0898 bufUnlatch <-0x00000001
006b639c TbDirectInsert <-0x00000001000b5118 TbInsert <-0x00000001000b86d8
tbTableOp <-0x00000001003fc0d4 CheckBackupAttributes <-0x00000001003f6f60
AuditBackups <-0x00000001003ff3c4 ImAuditBackupsThread <-0x000000010000e9dc
StartThread <-0x090000000042b448 _pthread_body
ANR7838S Server operation terminated.
ANR7837S Internal error BUF012 detected.
0x0000000100010510 pkAbort
0x0000000100010440 pkLogicAbort
0x00000001000c0898 bufUnlatch
0x00000001006b639c TbDirectInsert
0x00000001000b5118 TbInsert
0x00000001000b86d8 tbTableOp
0x00000001003fc0d4 CheckBackupAttributes
0x00000001003f6f60 AuditBackups
0x00000001003ff3c4 ImAuditBackupsThread
0x000000010000e9dc StartThread
0x090000000042b448 _pthread_body
ANR7833S Server thread 1 terminated in response to program abort.
ANR7833S Server thread 2 terminated in response to program abort.
ANR7833S Server thread 3 terminated in response to program abort.
ANR7833S Server thread 4 terminated in response to program abort.
ANR7833S Server thread 5 terminated in response to program abort.
ANR7833S Server thread 6 terminated in response to program abort.
ANR7833S Server thread 7 terminated in response to program abort.
ANR7833S Server thread 8 terminated in response to program abort.
ANR7833S Server thread 9 terminated in response to program abort.
ANR7833S Server thread 10 terminated in response to program abort.
ANR7833S Server thread 11 terminated in response to program abort.
ANR7833S Server thread 12 terminated in response to program abort.
ANR7833S Server thread 13 terminated in response to program abort.
ANR7833S Server thread 14 terminated in response to program abort.
ANR7833S Server thread 15 terminated in response to program abort.
ANR7833S Server thread 16 terminated in response to program abort.
ANR7833S Server thread 17 terminated in response to program abort.
ANR7833S Server thread 18 terminated in response to program abort.
ANR7833S Server thread 19 terminated in response to program abort.
ANR7833S Server thread 20 terminated in response to program abort.
ANR7833S Server thread 21 terminated in response to program abort.
ANR7833S Server thread 22 terminated in response to program abort.
ANR7833S Server thread 23 terminated in response to program abort.
ANR7833S Server thread 24 terminated in response to program abort.
ANR7833S Server thread 25 terminated in response to program abort.
ANR7833S Server thread 26 terminated in response to program abort.
ANR7833S Server thread 27 terminated in response to program abort.
ANR7833S Server thread 28 terminated in response to program abort.
ANR7833S Server thread 29 terminated in response to program abort.
ANR7833S Server thread 30 terminated in response to program abort.
ANR7833S Server thread 31 terminated in response to program abort.
ANR7833S Server thread 32 terminated in response to program abort.
ANR7833S Server thread 33 terminated in response to program abort.
ANR7833S Server thread 34 terminated in response to program abort.
ANR7833S Server thread 35 terminated in response to program abort.
ANR7833S Server thread 36 terminated in response to program abort.
ANR7833S Server thread 37 terminated in response to program abort.
ANR7833S Server thread 38 terminated in response to program abort.
IOT/Abort trap(coredump)
I have only one database version. Is there a procedure to repair this database?
Thanks
 

Hi,

First, this is bad: the DB seems to be corrupted, and you are using a TSM version that has been unsupported for a long time, so there is no support from IBM.
I take it you are willing to try anything, as you have nothing to lose.
Still, create (and keep) your DB backup. If anyone has a better idea, you will have a point to revert to.

So, what are the options?
a) Build another server (maybe just another instance on the same hardware) and try to export as much data as possible from the original server to the new one. This can be a long job, and you may need to configure library sharing, additional storage, etc. But you are on the safest side: new backups can go to the new instance, and whatever is exported successfully is saved (and nothing is deleted on the original one).
b) Use the DSMSERV DUMPDB / DSMSERV LOADFORMAT / DSMSERV LOADDB and DSMSERV AUDITDB procedure, described in the Admin Reference: ftp://ftp.uni-potsdam.de/pub/misc/ibm/tsm/doc/Admin-Reference-TSM-53.pdf (this is the Windows version, but the process is generally the same; look for the AIX version).

Harry
 

Against me :(
I have one last question (without support, I'm trying to fix it by myself).
Is there any way to edit the database? I'm trying to find some software or distribution that can open and edit the database.
My theory (is it wrong?) is that if I can edit the database manually and delete the bad object, the database may work again.
The reason: when I ran auditdb with fix=no detail=yes, the command worked, but when I ran auditdb with fix=yes detail=yes, it tried to fix things: it fixed the first object, 0.26073215, and then hit a segmentation fault.
The log from the database audit with the 'bad' object is attached.
Thank you a lot
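
One way to compare a fix=no audit run with the server log is to pull out the object IDs that are reported as not found. A rough Python sketch, assuming the output is saved as plain text; `missing_backup_objects` is a made-up helper name, and the sample lines are copied from the dsmserv and auditdb output earlier in this thread (not from the attachment):

```python
import re

# Matches "Backup object 0.26073224 not found" (ANR0874E) and
# "backup object 0.26526656 not found" (ANR4185I).
OBJECT_NOT_FOUND = re.compile(r"backup object (\d+[.:]\d+) not found", re.IGNORECASE)

def missing_backup_objects(text: str) -> list:
    """Return distinct object IDs reported as not found, in order of first appearance."""
    seen = []
    for obj_id in OBJECT_NOT_FOUND.findall(text):
        if obj_id not in seen:
            seen.append(obj_id)
    return seen

sample = """\
ANR0874E Backup object 0.26073224 not found during inventory processing.
ANR0874E Backup object 0.26359542 not found during inventory processing.
ANR4185I AUDITDB: Object entry for backup object 0.26526656 not found - entry
will be created.
ANR4185I AUDITDB: Object entry for backup object 0.26073215 not found - entry
will be created.
"""

print(missing_backup_objects(sample))
```

Note that the IDs are easy to confuse by eye: 0.26073224 (from the expiration error) and 0.26073215 (from the audit) are different objects.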
 

Attachments

  • audit_result.txt (77.7 KB)

Hi,
The TSM DB used in pre-6.x versions is a proprietary TSM format and cannot be modified with any tool I know of.
Have you tried the DUMPDB ... procedure?

Now the ... dirty stuff ... you have been warned, do not use it, etc.
What you can try is to manually delete the object(s) from the DB, which is very risky and will probably break your DB even more, so try it first on a DB restored to a test environment.
Check this http://www.tsmadmin.com/2007/06/oracle-rman-catalogue-cleanup.html
(important part is "delete object ..." command)

Harry
 

Well, that was my last idea :(
I've tried the dumpdb process and it looked good: DUMPDB, LOADFORMAT and LOADDB were OK, but auditdb was not :(
DSMSERV DUMPDB
dsmserv dumpdb devclass=dc_lto1 volumenames=000054
ANR7800I DSMSERV generated at 15:02:17 on Jul 19 2006.

Tivoli Storage Manager for AIX-RS/6000
Version 5, Release 3, Level 3.3

Licensed Materials - Property of IBM

(C) Copyright IBM Corporation 1990, 2006.
All rights reserved.
U.S. Government Users Restricted Rights - Use, duplication or disclosure
restricted by GSA ADP Schedule Contract with IBM Corporation.

ANR0900I Processing options file dsmserv.opt.
ANR4726I The ICC support module has been loaded.
ANR0990I Server restart-recovery in progress.
ANR0200I Recovery log assigned capacity is 308 megabytes.
ANR0201I Database assigned capacity is 14144 megabytes.
ANR4000I DUMPDB: Database dump process started.
ANR8200I TCP/IP driver ready for connection with clients on port 1500.
ANR4013I DUMPDB: Dumped 0 database entries (cumulative).
ANR8337I LTO volume 000054 mounted in drive LTO_DRV1 (/dev/rmt0).
ANR1360I Output volume 000054 opened (sequence number 1).
ANR4013I DUMPDB: Dumped 4634268 database entries (cumulative).
ANR4013I DUMPDB: Dumped 9264838 database entries (cumulative).
ANR4013I DUMPDB: Dumped 14030007 database entries (cumulative).
ANR4013I DUMPDB: Dumped 20116494 database entries (cumulative).
ANR4013I DUMPDB: Dumped 24952855 database entries (cumulative).
ANR1361I Output volume 000054 closed.
ANR4013I DUMPDB: Dumped 27111934 database entries (cumulative).
ANR8336I Verifying label of LTO volume 000054 in drive LTO_DRV1 (/dev/rmt0).
ANR8468I LTO volume 000054 dismounted from drive LTO_DRV1 (/dev/rmt0) in
library LTO_LIB1.
ANR4031I DUMPDB: Copied 1141386 database pages.
ANR4033I DUMPDB: Copied 160 bit vectors.
ANR4034I DUMPDB: Encountered 0 bad database pages.
ANR4036I DUMPDB: Copied 27111934 database entries.
ANR4037I DUMPDB: 2829 Megabytes copied.
ANR4001I DUMPDB: Database dump process completed.
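
For anyone repeating this, the two lines worth checking in the dump output are the bad-page count (ANR4034I) and the completion message (ANR4001I). A minimal Python sketch, assuming the DUMPDB console output is captured as text; `dumpdb_summary` is a hypothetical helper, and the sample is copied from the output above. As the auditdb run that follows shows, a dump reporting 0 bad pages can still load into a database that fails the audit, so this is only a first check.

```python
import re

def dumpdb_summary(output: str) -> dict:
    """Pick the bad-page count and the completion flag out of DSMSERV DUMPDB output."""
    bad = re.search(r"ANR4034I DUMPDB: Encountered (\d+) bad database pages", output)
    return {
        # None means the ANR4034I line was not present at all.
        "bad_pages": int(bad.group(1)) if bad else None,
        "completed": "ANR4001I" in output,
    }

sample = """\
ANR4031I DUMPDB: Copied 1141386 database pages.
ANR4033I DUMPDB: Copied 160 bit vectors.
ANR4034I DUMPDB: Encountered 0 bad database pages.
ANR4036I DUMPDB: Copied 27111934 database entries.
ANR4037I DUMPDB: 2829 Megabytes copied.
ANR4001I DUMPDB: Database dump process completed.
"""

print(dumpdb_summary(sample))
```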

But auditdb:
/usr/tivoli/tsm/server/bin/dsmserv auditdb fix=yes
ANR7800I DSMSERV generated at 15:02:17 on Jul 19 2006.

Tivoli Storage Manager for AIX-RS/6000
Version 5, Release 3, Level 3.3

Licensed Materials - Property of IBM
(C) Copyright IBM Corporation 1990, 2006.
All rights reserved.
U.S. Government Users Restricted Rights - Use, duplication or disclosure
restricted by GSA ADP Schedule Contract with IBM Corporation.

ANR0900I Processing options file dsmserv.opt.
ANR4726I The ICC support module has been loaded.
ANR0990I Server restart-recovery in progress.
ANR0200I Recovery log assigned capacity is 200 megabytes.
ANR0201I Database assigned capacity is 14144 megabytes.
ANR0306I Recovery log volume mount in progress.
ANR0353I Recovery log analysis pass in progress.
ANR0354I Recovery log redo pass in progress.
ANR0355I Recovery log undo pass in progress.
ANR0352I Transaction recovery complete.
ANR4140I AUDITDB: Database audit process started.
ANR4075I AUDITDB: Auditing policy definitions.
ANR4127I AUDITDB: Policy global attributes cannot be found - attributes will
recreated.
ANR9999D pmaudit.c(680): ThreadId<0> AUDITDB: Policy domain not found.
ANR9999D ThreadId<0> issued message 9999 from: <-0x000000010001c298 outDiagf
<-0x00000001006e3b18 pmAuditDomain <-0x00000001006e3e18 pmAudit <-0x0000000100
215804 admAuditServer <-0x0000000100003ffc AuditServer <-0x0000000100002e44
main
ANR9999D pmaudit.c(289): ThreadId<0> pmAuditDomain failed, rc=1
ANR9999D ThreadId<0> issued message 9999 from: <-0x000000010001c298 outDiagf
<-0x00000001006e3e3c pmAudit <-0x0000000100215804 admAuditServer <-0x000000010
0003ffc AuditServer <-0x0000000100002e44 main
ANR9999D admstart.c(3483): ThreadId<0> pmAudit failed, rc=1
ANR9999D ThreadId<0> issued message 9999 from: <-0x000000010001c298 outDiagf
<-0x0000000100215828 admAuditServer <-0x0000000100003ffc AuditServer
<-0x0000000100002e44 main
ANR4142I AUDITDB: Database audit process terminated in error.
ANR4134I AUDITDB: Processed 1 entries in database tables and 0 blocks in bit
vectors. Elapsed time is 0:00:10.

I was not able to start dsmserv after auditdb.

I think it's over.
 