I had all of my backups fail the other night

theconqueror

90% were missed, a handful failed, and 3 completed. I'm new to TSM, so I don't really know where to begin, but I was checking through the activity log and noticed a lot of these:
ANR0538I A resource waiter has been aborted.
I went through and checked to see what happened before and after. Here's a big snippet of the log file, just because I'm not totally sure what I'm looking at. There is a lot more in the log, but the file is too big to post here.


08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <258> Unnexpected error
obtaining AUX bitfile information. (SESSION: 36737)
08/03/2008 16:01:07 ANR9999D ThreadId <258> issued message 9999 from:
(SESSION: 36737)
08/03/2008 16:01:07 ANR0530W Transaction failed for session 36737 for node
S-HDQEPOL02_SQL (TDP MSSQL Win32) - internal server error
detected. (SESSION: 36737)
08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <245> Unnexpected error
obtaining AUX bitfile information. (SESSION: 36722)
08/03/2008 16:01:07 ANR9999D ThreadId <245> issued message 9999 from:
(SESSION: 36722)
08/03/2008 16:01:07 ANR0530W Transaction failed for session 36722 for node
S-HDQSQL04_SQL (TDP MSSQL) - internal server error
detected. (SESSION: 36722)
08/03/2008 16:01:07 ANR9999D smnode.c(15931): ThreadId <94> Unexpected rc=19
from imGetNextBackup (SESSION: 37277)
08/03/2008 16:01:07 ANR9999D ThreadId <94> issued message 9999 from: (SESSION:
37277)
08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <194> Unnexpected error
obtaining AUX bitfile information. (SESSION: 36731)
08/03/2008 16:01:07 ANR9999D ThreadId <194> issued message 9999 from:
(SESSION: 36731)
08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <247> Unnexpected error
obtaining AUX bitfile information. (SESSION: 36724)
08/03/2008 16:01:07 ANR9999D ThreadId <247> issued message 9999 from:
(SESSION: 36724)
08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <129> Unnexpected error
obtaining AUX bitfile information. (SESSION: 36707)
08/03/2008 16:01:07 ANR9999D ThreadId <129> issued message 9999 from:
(SESSION: 36707)
08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <184> Unnexpected error
obtaining AUX bitfile information. (SESSION: 36735)
08/03/2008 16:01:07 ANR9999D ThreadId <184> issued message 9999 from:
(SESSION: 36735)
08/03/2008 16:01:07 ANR0530W Transaction failed for session 36735 for node
S-HDQMAGIC01_SQL (TDP MSSQL Win32) - internal server
error detected. (SESSION: 36735)
08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <219> Unnexpected error
obtaining AUX bitfile information. (SESSION: 36691)
08/03/2008 16:01:07 ANR9999D ThreadId <219> issued message 9999 from:
(SESSION: 36691)
08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <243> Unnexpected error
obtaining AUX bitfile information. (SESSION: 36719)
08/03/2008 16:01:07 ANR9999D ThreadId <243> issued message 9999 from:
(SESSION: 36719)
08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <254> Unnexpected error
obtaining AUX bitfile information. (SESSION: 36733)
08/03/2008 16:01:07 ANR9999D ThreadId <254> issued message 9999 from:
(SESSION: 36733)
08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <217> Unnexpected error
obtaining AUX bitfile information. (SESSION: 36690)
08/03/2008 16:01:07 ANR9999D ThreadId <217> issued message 9999 from:
(SESSION: 36690)
08/03/2008 16:01:07 ANR0530W Transaction failed for session 36707 for node
S-HDQSQL01_SQL (TDP MSSQL Win32) - internal server error
detected. (SESSION: 36707)
08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <249> Unnexpected error
obtaining AUX bitfile information. (SESSION: 36726)
08/03/2008 16:01:07 ANR9999D ThreadId <249> issued message 9999 from:
(SESSION: 36726)
08/03/2008 16:01:07 ANR0530W Transaction failed for session 36691 for node
S-HDQSQL02_SQL (TDP MSSQL Win32) - internal server error
detected. (SESSION: 36691)
08/03/2008 16:01:07 ANR0530W Transaction failed for session 36719 for node
S-HDQCOGDB02_SQL (TDP MSSQL Win32) - internal server
error detected. (SESSION: 36719)
08/03/2008 16:01:07 ANR0530W Transaction failed for session 36731 for node
S-HDQEMATDB01_SQL (TDP MSSQL Win32) - internal server
error detected. (SESSION: 36731)
08/03/2008 16:01:07 ANR0530W Transaction failed for session 36733 for node
S-HDQFOQA01_SQL (TDP MSSQL Win32) - internal server error
detected. (SESSION: 36733)
08/03/2008 16:01:07 ANR0530W Transaction failed for session 36724 for node
S-HDQCGOSPT06_SQL (TDP MSSQL Win32) - internal server
error detected. (SESSION: 36724)
08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <216> Unnexpected error
obtaining AUX bitfile information. (SESSION: 36689)
08/03/2008 16:01:07 ANR9999D ThreadId <216> issued message 9999 from:
(SESSION: 36689)
08/03/2008 16:01:07 ANR0530W Transaction failed for session 36689 for node
S-HDQMX01_SQL (TDP MSSQLV2 NT) - internal server error
detected. (SESSION: 36689)
08/03/2008 16:01:07 ANR0530W Transaction failed for session 36726 for node
S-HDQEDW01_SQL (TDP MSSQL Win32) - internal server error
detected. (SESSION: 36726)
08/03/2008 16:01:07 ANR0530W Transaction failed for session 36690 for node
S-HDQCOGDB01_SQL (TDP MSSQL Win32) - internal server
error detected. (SESSION: 36690)
08/03/2008 16:01:07 ANE4993E (Session: 36731, Node: S-HDQEMATDB01_SQL) TDP
MSSQL Win32 ACO3002 Data Protection for SQL: full backup
of database IDMR_eMAT_Prod_Atlas from server
S-HDQEMATDB01 failed, rc = 418. (SESSION: 36731)
08/03/2008 16:01:07 ANE4991I (Session: 36731, Node: S-HDQEMATDB01_SQL) TDP
MSSQL Win32 ACO3000 Data Protection for SQL: Starting
full backup of database IDMR_eMAT_Prod_Polar from server
S-HDQEMATDB01. (SESSION: 36731)
08/03/2008 16:01:07 ANE4993E (Session: 36737, Node: S-HDQEPOL02_SQL) TDP
MSSQL Win32 ACO3002 Data Protection for SQL: full backup
of database msdb from server S-HDQEPOL02 failed, rc =
418. (SESSION: 36737)
08/03/2008 16:01:07 ANE4991I (Session: 36737, Node: S-HDQEPOL02_SQL) TDP
MSSQL Win32 ACO3000 Data Protection for SQL: Starting
full backup of database Northwind from server
S-HDQEPOL02. (SESSION: 36737)
08/03/2008 16:01:07 ANE4993E (Session: 36690, Node: S-HDQCOGDB01_SQL) TDP
MSSQL Win32 ACO3002 Data Protection for SQL: full backup
of database mary from server S-HDQCOGDB01 failed, rc =
418. (SESSION: 36690)
08/03/2008 16:01:07 ANE4991I (Session: 36690, Node: S-HDQCOGDB01_SQL) TDP
MSSQL Win32 ACO3000 Data Protection for SQL: Starting
full backup of database master from server S-HDQCOGDB01.
(SESSION: 36690)
08/03/2008 16:01:07 ANE4993E (Session: 36735, Node: S-HDQMAGIC01_SQL) TDP
MSSQL Win32 ACO3002 Data Protection for SQL: full backup
of database msdb from server S-HDQMAGIC01 failed, rc =
418. (SESSION: 36735)
08/03/2008 16:01:07 ANE4991I (Session: 36735, Node: S-HDQMAGIC01_SQL) TDP
MSSQL Win32 ACO3000 Data Protection for SQL: Starting
full backup of database Northwind from server
S-HDQMAGIC01. (SESSION: 36735)
08/03/2008 16:01:08 ANE4993E (Session: 36722, Node: S-HDQSQL04_SQL) TDP MSSQL
ACO3002 Data Protection for SQL: full backup of database
AtlasCSRpt from server S-HDQSQL04 failed, rc = 418.
(SESSION: 36722)
08/03/2008 16:01:08 ANE4993E (Session: 36707, Node: S-HDQSQL01_SQL) TDP MSSQL
Win32 ACO3002 Data Protection for SQL: full backup of
database CASS_II from server S-HDQSQL01 failed, rc = 418.
(SESSION: 36707)
08/03/2008 16:01:08 ANE4991I (Session: 36707, Node: S-HDQSQL01_SQL) TDP MSSQL
Win32 ACO3000 Data Protection for SQL: Starting full
backup of database DocServices from server S-HDQSQL01.
(SESSION: 36707)
08/03/2008 16:01:08 ANE4993E (Session: 36724, Node: S-HDQCGOSPT06_SQL) TDP
MSSQL Win32 ACO3002 Data Protection for SQL: full backup
of database master from server S-HDQCGOSPT06 failed, rc =
418. (SESSION: 36724)
08/03/2008 16:01:08 ANE4991I (Session: 36724, Node: S-HDQCGOSPT06_SQL) TDP
MSSQL Win32 ACO3000 Data Protection for SQL: Starting
full backup of database Metrics from server
S-HDQCGOSPT06. (SESSION: 36724)
08/03/2008 16:01:08 ANE4993E (Session: 36719, Node: S-HDQCOGDB02_SQL) TDP
MSSQL Win32 ACO3002 Data Protection for SQL: full backup
of database contributor from server S-HDQCOGDB02 failed,
rc = 418. (SESSION: 36719)
08/03/2008 16:01:08 ANE4991I (Session: 36719, Node: S-HDQCOGDB02_SQL) TDP
MSSQL Win32 ACO3000 Data Protection for SQL: Starting
full backup of database master from server S-HDQCOGDB02.
(SESSION: 36719)
08/03/2008 16:01:08 ANE4993E (Session: 36733, Node: S-HDQFOQA01_SQL) TDP
MSSQL Win32 ACO3002 Data Protection for SQL: full backup
of database master from server S-HDQFOQA01 failed, rc =
418. (SESSION: 36733)
08/03/2008 16:01:08 ANE4991I (Session: 36733, Node: S-HDQFOQA01_SQL) TDP
MSSQL Win32 ACO3000 Data Protection for SQL: Starting
full backup of database model from server S-HDQFOQA01.
(SESSION: 36733)
08/03/2008 16:01:08 ANE4993E (Session: 36726, Node: S-HDQEDW01_SQL) TDP MSSQL
Win32 ACO3002 Data Protection for SQL: full backup of
database EDWDimensions from server S-HDQEDW01 failed, rc
= 418. (SESSION: 36726)
08/03/2008 16:01:08 ANE4991I (Session: 36726, Node: S-HDQEDW01_SQL) TDP MSSQL
Win32 ACO3000 Data Protection for SQL: Starting full
backup of database ERP from server S-HDQEDW01. (SESSION:
36726)
08/03/2008 16:01:08 ANE4991I (Session: 36722, Node: S-HDQSQL04_SQL) TDP MSSQL
ACO3000 Data Protection for SQL: Starting full backup of
database cgoaud from server S-HDQSQL04. (SESSION: 36722)
08/03/2008 16:02:07 ANR0538I A resource waiter has been aborted.
08/03/2008 16:02:07 ANR1029W Migration process 487 terminated for storage pool
EXCHDISK - lock conflict. (SESSION: 37286, PROCESS: 487)
08/03/2008 16:02:07 ANR0985I Process 487 for MIGRATION running in the
FOREGROUND completed with completion state FAILURE at
16:02:07. (SESSION: 37286, PROCESS: 487)
08/03/2008 16:02:07 ANR2753I (MIGRATE_EXCHDISK):ANR1029W Migration process 487
(SESSION: 37286)
08/03/2008 16:02:07 ANR2753I (MIGRATE_EXCHDISK):terminated for storage pool
EXCHDISK - (SESSION: 37286)
08/03/2008 16:02:07 ANR2753I (MIGRATE_EXCHDISK):lock conflict. (SESSION:
37286)
08/03/2008 16:02:07 ANR2753I (MIGRATE_EXCHDISK):ANR0985I Process 487 for
MIGRATION (SESSION: 37286)
08/03/2008 16:02:07 ANR2753I (MIGRATE_EXCHDISK):running in the FOREGROUND
completed (SESSION: 37286)
08/03/2008 16:02:07 ANR2753I (MIGRATE_EXCHDISK):with completion state FAILURE
at (SESSION: 37286)
08/03/2008 16:02:07 ANR2753I (MIGRATE_EXCHDISK):16:02:07. (SESSION: 37286)
08/03/2008 16:02:07 ANR1002I Migration for storage pool EXCHDISK will be
retried in 60 seconds. (SESSION: 37286)
08/03/2008 16:02:07 ANR2753I (MIGRATE_EXCHDISK):ANR1002I Migration for storage
pool (SESSION: 37286)
08/03/2008 16:02:07 ANR2753I (MIGRATE_EXCHDISK):EXCHDISK will be retried in 60
seconds. (SESSION: 37286)
08/03/2008 16:03:07 ANR1003I Migration retry delay ended; checking migration
status for storage pool EXCHDISK. (SESSION: 37286)
08/03/2008 16:03:07 ANR0984I Process 488 for MIGRATION started in the
FOREGROUND at 16:03:07. (SESSION: 37286, PROCESS: 488)
08/03/2008 16:03:07 ANR2110I MIGRATE STGPOOL started as process 488. (SESSION:
37286, PROCESS: 488)
08/03/2008 16:03:07 ANR1000I Migration process 488 started for storage pool
EXCHDISK manually, highMig=70, lowMig=0, duration=300.
(SESSION: 37286, PROCESS: 488)
08/03/2008 16:03:07 ANR2753I (MIGRATE_EXCHDISK):ANR1003I Migration retry delay
ended; (SESSION: 37286)
08/03/2008 16:03:07 ANR2753I (MIGRATE_EXCHDISK):checking migration status for
storage (SESSION: 37286)
08/03/2008 16:03:07 ANR2753I (MIGRATE_EXCHDISK):pool EXCHDISK. (SESSION:
37286)
08/03/2008 16:03:07 ANR2753I (MIGRATE_EXCHDISK):ANR0984I Process 488 for
MIGRATION (SESSION: 37286)
08/03/2008 16:03:07 ANR2753I (MIGRATE_EXCHDISK):started in the FOREGROUND at
16:03:07. (SESSION: 37286)
08/03/2008 16:03:07 ANR2753I (MIGRATE_EXCHDISK):ANR2110I MIGRATE STGPOOL
started as (SESSION: 37286)
08/03/2008 16:03:07 ANR2753I (MIGRATE_EXCHDISK):process 488. (SESSION: 37286)
08/03/2008 16:03:07 ANR2753I (MIGRATE_EXCHDISK):ANR1000I Migration process 488
started (SESSION: 37286)
08/03/2008 16:03:07 ANR2753I (MIGRATE_EXCHDISK):for storage pool EXCHDISK
manually, (SESSION: 37286)
08/03/2008 16:03:07 ANR2753I (MIGRATE_EXCHDISK):highMig=70, lowMig=0,
duration=300. (SESSION: 37286)
08/03/2008 16:04:07 ANR0538I A resource waiter has been aborted.
08/03/2008 16:04:07 ANR0538I A resource waiter has been aborted.
08/03/2008 16:04:07 ANR0444W Protocol error on session 37276 for node
S-HDQEXGTY01 (WinNT) - out-of-sequence verb (type Data)
received. (SESSION: 37276)
08/03/2008 16:04:07 ANR0484W Session 37276 for node S-HDQEXGTY01 (WinNT)
terminated - protocol violation detected. (SESSION:
37276)
08/03/2008 16:04:22 ANR0406I Session 37413 started for node S-HDQEXGTY01
(WinNT) (Tcp/Ip s-hdqexgty01.aawh.atlasair.com(14867)).
(SESSION: 37413)
08/03/2008 16:05:07 ANR0538I A resource waiter has been aborted.
08/03/2008 16:05:07 ANE4021E (Session: 36550, Node: S-HDQCGOSPT2) Error
processing '\\s-hdqcgospt2\c$\adsm.sys\W2KReg\*': file
system not ready (SESSION: 36550)
08/03/2008 16:05:56 ANR0482W Session 36616 for node S-HDQAIMSWEB06 (WinNT)
terminated - idle for more than 300 minutes. (SESSION:
36616)
08/03/2008 16:07:07 ANR0538I A resource waiter has been aborted.
08/03/2008 16:07:07 ANE4021E (Session: 36612, Node: SHDQCOG01) Error
processing '\\s-hdqcog01\c$\ADSM.SYS\WMI\WMIDBFILE': file
system not ready (SESSION: 36612)
08/03/2008 16:07:07 ANR2121W ATTENTION: More than 1088 MB of the database has
changed and the last database backup was more than 24
hours ago. Use the BACKUP DB command to provide for
database recovery.
08/03/2008 16:13:07 ANR0538I A resource waiter has been aborted.
08/03/2008 16:13:53 ANR0406I Session 37414 started for node S-HDQBAKUP01
(WinNT) (Tcp/Ip S-HDQBAKUP01(2370)). (SESSION: 37414)
08/03/2008 16:13:53 ANR0403I Session 37414 ended for node S-HDQBAKUP01
(WinNT). (SESSION: 37414)
08/03/2008 16:39:31 ANR0406I Session 37415 started for node S-HDQREPL01
(WinNT) (Tcp/Ip s-hdqrepl01.aawh.atlasair.com(4798)).
(SESSION: 37415)
08/03/2008 17:00:01 ANR2561I Schedule prompter contacting S-HDQPS01 (session
 
It hasn't happened before, to my knowledge. I rebooted the server this morning after I saw what happened. I'll see in the morning whether that fixed everything.
 
It looks like you lost contact with your storage. A SAN problem? The array? An HBA?
What OS?
Any errors in the system logs?
Now that you've rebooted, have you done any testing, or are you just going to wait and see if it fails tonight? It'd better be the former.
So, do your DISK-class volumes vary on? Can you audit your FILE-class volumes, one on each filesystem? Did your disks even become available on reboot? If so, did the filesystems on them mount? I see that the server tried to migrate, so you probably need a migration; if the storage seems to be available, can you run migrations? You DON'T want to have to face your boss tomorrow with "It failed the night before last, and I rebooted, hoping it'd work last night, but that failed too. duuuuuuhhhh.....".
 
It's working just fine now. I fired off a couple of manual backups the other day to test it out, both before and after the reboot, and they worked. I have no idea what happened; I went through a bunch of logs and none of them seemed to show anything. I saved the activity log so I can continue to go through it, but it seems like it was just a fluke.
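For going back through a saved activity log, a rough sketch like the one below might help get an overview before reading it line by line. It assumes the log was saved as plain text in the same layout as the snippet above (a date, a time, a message ID, then text that can wrap onto following lines). The names `parse_records`, `summarize` and `SAMPLE` are made up for this example, and the sample lines are copied from the posted log.

```python
import re
from collections import Counter

# A new record starts with "MM/DD/YYYY HH:MM:SS <message id>"; any other
# non-empty line is treated as a wrapped continuation of the previous record.
RECORD_START = re.compile(
    r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) ([A-Z]{3}\d{4}[A-Z]) (.*)$"
)
# Pulls the session number and node name out of an ANR0530W message.
NODE_FAILURE = re.compile(r"Transaction failed for session (\d+) for node (\S+)")


def parse_records(lines):
    """Join wrapped lines into one record per activity log message."""
    records = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = RECORD_START.match(line)
        if match:
            records.append(
                {"time": match.group(1), "id": match.group(2), "text": match.group(3)}
            )
        elif records:
            records[-1]["text"] += " " + line
    return records


def summarize(records):
    """Count messages per ID and list nodes that logged ANR0530W."""
    counts = Counter(r["id"] for r in records)
    failed_nodes = set()
    for r in records:
        if r["id"] == "ANR0530W":
            match = NODE_FAILURE.search(r["text"])
            if match:
                failed_nodes.add(match.group(2))
    return counts, sorted(failed_nodes)


SAMPLE = """\
08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <258> Unnexpected error
obtaining AUX bitfile information. (SESSION: 36737)
08/03/2008 16:01:07 ANR0530W Transaction failed for session 36737 for node
S-HDQEPOL02_SQL (TDP MSSQL Win32) - internal server error
detected. (SESSION: 36737)
08/03/2008 16:02:07 ANR1029W Migration process 487 terminated for storage pool
EXCHDISK - lock conflict. (SESSION: 37286, PROCESS: 487)
"""

if __name__ == "__main__":
    counts, failed = summarize(parse_records(SAMPLE.splitlines()))
    for msg_id, n in counts.most_common():
        print(msg_id, n)
    print("nodes with failed transactions:", ", ".join(failed))
```

To run it against the real file, replace `SAMPLE.splitlines()` with the lines read from wherever the log was saved. The counts only show which messages dominate and which nodes were hit; they don't say anything about the cause.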
 