
I had all of my backups fail the other night

Discussion in 'Backup / Archive Discussion' started by theconqueror, Aug 4, 2008.

  1. theconqueror

    theconqueror New Member

    Joined:
    Jul 10, 2008
    Messages:
    36
    Likes Received:
    0
    90% were missed, a handful failed and 3 completed. I'm new to TSM so I don't really know where to begin but I was checking through the activity log and noticed a lot of
    ANR0538I A resource waiter has been aborted.
    I went through and checked to see what happened before and after. Here's a big snippet of the log file, just because I'm not totally sure what I'm looking at. There are a lot more of these in the log file, but it is too big to post here.


    08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
    08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
    08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
    08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
    08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
    08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
    08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
    08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
    08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
    08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
    08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
    08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
    08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
    08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <258> Unnexpected error
    obtaining AUX bitfile information. (SESSION: 36737)
    08/03/2008 16:01:07 ANR9999D ThreadId <258> issued message 9999 from:
    (SESSION: 36737)
    08/03/2008 16:01:07 ANR0530W Transaction failed for session 36737 for node
    S-HDQEPOL02_SQL (TDP MSSQL Win32) - internal server error
    detected. (SESSION: 36737)
    08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <245> Unnexpected error
    obtaining AUX bitfile information. (SESSION: 36722)
    08/03/2008 16:01:07 ANR9999D ThreadId <245> issued message 9999 from:
    (SESSION: 36722)
    08/03/2008 16:01:07 ANR0530W Transaction failed for session 36722 for node
    S-HDQSQL04_SQL (TDP MSSQL) - internal server error
    detected. (SESSION: 36722)
    08/03/2008 16:01:07 ANR9999D smnode.c(15931): ThreadId <94> Unexpected rc=19
    from imGetNextBackup (SESSION: 37277)
    08/03/2008 16:01:07 ANR9999D ThreadId <94> issued message 9999 from: (SESSION:
    37277)
    08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <194> Unnexpected error
    obtaining AUX bitfile information. (SESSION: 36731)
    08/03/2008 16:01:07 ANR9999D ThreadId <194> issued message 9999 from:
    (SESSION: 36731)
    08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <247> Unnexpected error
    obtaining AUX bitfile information. (SESSION: 36724)
    08/03/2008 16:01:07 ANR9999D ThreadId <247> issued message 9999 from:
    (SESSION: 36724)
    08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <129> Unnexpected error
    obtaining AUX bitfile information. (SESSION: 36707)
    08/03/2008 16:01:07 ANR9999D ThreadId <129> issued message 9999 from:
    (SESSION: 36707)
    08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <184> Unnexpected error
    obtaining AUX bitfile information. (SESSION: 36735)
    08/03/2008 16:01:07 ANR9999D ThreadId <184> issued message 9999 from:
    (SESSION: 36735)
    08/03/2008 16:01:07 ANR0530W Transaction failed for session 36735 for node
    S-HDQMAGIC01_SQL (TDP MSSQL Win32) - internal server
    error detected. (SESSION: 36735)
    08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <219> Unnexpected error
    obtaining AUX bitfile information. (SESSION: 36691)
    08/03/2008 16:01:07 ANR9999D ThreadId <219> issued message 9999 from:
    (SESSION: 36691)
    08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <243> Unnexpected error
    obtaining AUX bitfile information. (SESSION: 36719)
    08/03/2008 16:01:07 ANR9999D ThreadId <243> issued message 9999 from:
    (SESSION: 36719)
    08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <254> Unnexpected error
    obtaining AUX bitfile information. (SESSION: 36733)
    08/03/2008 16:01:07 ANR9999D ThreadId <254> issued message 9999 from:
    (SESSION: 36733)
    08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <217> Unnexpected error
    obtaining AUX bitfile information. (SESSION: 36690)
    08/03/2008 16:01:07 ANR9999D ThreadId <217> issued message 9999 from:
    (SESSION: 36690)
    08/03/2008 16:01:07 ANR0530W Transaction failed for session 36707 for node
    S-HDQSQL01_SQL (TDP MSSQL Win32) - internal server error
    detected. (SESSION: 36707)
    08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <249> Unnexpected error
    obtaining AUX bitfile information. (SESSION: 36726)
    08/03/2008 16:01:07 ANR9999D ThreadId <249> issued message 9999 from:
    (SESSION: 36726)
    08/03/2008 16:01:07 ANR0530W Transaction failed for session 36691 for node
    S-HDQSQL02_SQL (TDP MSSQL Win32) - internal server error
    detected. (SESSION: 36691)
    08/03/2008 16:01:07 ANR0530W Transaction failed for session 36719 for node
    S-HDQCOGDB02_SQL (TDP MSSQL Win32) - internal server
    error detected. (SESSION: 36719)
    08/03/2008 16:01:07 ANR0530W Transaction failed for session 36731 for node
    S-HDQEMATDB01_SQL (TDP MSSQL Win32) - internal server
    error detected. (SESSION: 36731)
    08/03/2008 16:01:07 ANR0530W Transaction failed for session 36733 for node
    S-HDQFOQA01_SQL (TDP MSSQL Win32) - internal server error
    detected. (SESSION: 36733)
    08/03/2008 16:01:07 ANR0530W Transaction failed for session 36724 for node
    S-HDQCGOSPT06_SQL (TDP MSSQL Win32) - internal server
    error detected. (SESSION: 36724)
    08/03/2008 16:01:07 ANR9999D bfutil.c(3442): ThreadId <216> Unnexpected error
    obtaining AUX bitfile information. (SESSION: 36689)
    08/03/2008 16:01:07 ANR9999D ThreadId <216> issued message 9999 from:
    (SESSION: 36689)
    08/03/2008 16:01:07 ANR0530W Transaction failed for session 36689 for node
    S-HDQMX01_SQL (TDP MSSQLV2 NT) - internal server error
    detected. (SESSION: 36689)
    08/03/2008 16:01:07 ANR0530W Transaction failed for session 36726 for node
    S-HDQEDW01_SQL (TDP MSSQL Win32) - internal server error
    detected. (SESSION: 36726)
    08/03/2008 16:01:07 ANR0530W Transaction failed for session 36690 for node
    S-HDQCOGDB01_SQL (TDP MSSQL Win32) - internal server
    error detected. (SESSION: 36690)
    08/03/2008 16:01:07 ANE4993E (Session: 36731, Node: S-HDQEMATDB01_SQL) TDP
    MSSQL Win32 ACO3002 Data Protection for SQL: full backup
    of database IDMR_eMAT_Prod_Atlas from server
    S-HDQEMATDB01 failed, rc = 418. (SESSION: 36731)
    08/03/2008 16:01:07 ANE4991I (Session: 36731, Node: S-HDQEMATDB01_SQL) TDP
    MSSQL Win32 ACO3000 Data Protection for SQL: Starting
    full backup of database IDMR_eMAT_Prod_Polar from server
    S-HDQEMATDB01. (SESSION: 36731)
    08/03/2008 16:01:07 ANE4993E (Session: 36737, Node: S-HDQEPOL02_SQL) TDP
    MSSQL Win32 ACO3002 Data Protection for SQL: full backup
    of database msdb from server S-HDQEPOL02 failed, rc =
    418. (SESSION: 36737)
    08/03/2008 16:01:07 ANE4991I (Session: 36737, Node: S-HDQEPOL02_SQL) TDP
    MSSQL Win32 ACO3000 Data Protection for SQL: Starting
    full backup of database Northwind from server
    S-HDQEPOL02. (SESSION: 36737)
    08/03/2008 16:01:07 ANE4993E (Session: 36690, Node: S-HDQCOGDB01_SQL) TDP
    MSSQL Win32 ACO3002 Data Protection for SQL: full backup
    of database mary from server S-HDQCOGDB01 failed, rc =
    418. (SESSION: 36690)
    08/03/2008 16:01:07 ANE4991I (Session: 36690, Node: S-HDQCOGDB01_SQL) TDP
    MSSQL Win32 ACO3000 Data Protection for SQL: Starting
    full backup of database master from server S-HDQCOGDB01.
    (SESSION: 36690)
    08/03/2008 16:01:07 ANE4993E (Session: 36735, Node: S-HDQMAGIC01_SQL) TDP
    MSSQL Win32 ACO3002 Data Protection for SQL: full backup
    of database msdb from server S-HDQMAGIC01 failed, rc =
    418. (SESSION: 36735)
    08/03/2008 16:01:07 ANE4991I (Session: 36735, Node: S-HDQMAGIC01_SQL) TDP
    MSSQL Win32 ACO3000 Data Protection for SQL: Starting
    full backup of database Northwind from server
    S-HDQMAGIC01. (SESSION: 36735)
     
    Last edited: Aug 4, 2008
  3. theconqueror

    theconqueror New Member

    Joined:
    Jul 10, 2008
    Messages:
    36
    Likes Received:
    0
    08/03/2008 16:01:08 ANE4993E (Session: 36722, Node: S-HDQSQL04_SQL) TDP MSSQL
    ACO3002 Data Protection for SQL: full backup of database
    AtlasCSRpt from server S-HDQSQL04 failed, rc = 418.
    (SESSION: 36722)
    08/03/2008 16:01:08 ANE4993E (Session: 36707, Node: S-HDQSQL01_SQL) TDP MSSQL
    Win32 ACO3002 Data Protection for SQL: full backup of
    database CASS_II from server S-HDQSQL01 failed, rc = 418.
    (SESSION: 36707)
    08/03/2008 16:01:08 ANE4991I (Session: 36707, Node: S-HDQSQL01_SQL) TDP MSSQL
    Win32 ACO3000 Data Protection for SQL: Starting full
    backup of database DocServices from server S-HDQSQL01.
    (SESSION: 36707)
    08/03/2008 16:01:08 ANE4993E (Session: 36724, Node: S-HDQCGOSPT06_SQL) TDP
    MSSQL Win32 ACO3002 Data Protection for SQL: full backup
    of database master from server S-HDQCGOSPT06 failed, rc =
    418. (SESSION: 36724)
    08/03/2008 16:01:08 ANE4991I (Session: 36724, Node: S-HDQCGOSPT06_SQL) TDP
    MSSQL Win32 ACO3000 Data Protection for SQL: Starting
    full backup of database Metrics from server
    S-HDQCGOSPT06. (SESSION: 36724)
    08/03/2008 16:01:08 ANE4993E (Session: 36719, Node: S-HDQCOGDB02_SQL) TDP
    MSSQL Win32 ACO3002 Data Protection for SQL: full backup
    of database contributor from server S-HDQCOGDB02 failed,
    rc = 418. (SESSION: 36719)
    08/03/2008 16:01:08 ANE4991I (Session: 36719, Node: S-HDQCOGDB02_SQL) TDP
    MSSQL Win32 ACO3000 Data Protection for SQL: Starting
    full backup of database master from server S-HDQCOGDB02.
    (SESSION: 36719)
    08/03/2008 16:01:08 ANE4993E (Session: 36733, Node: S-HDQFOQA01_SQL) TDP
    MSSQL Win32 ACO3002 Data Protection for SQL: full backup
    of database master from server S-HDQFOQA01 failed, rc =
    418. (SESSION: 36733)
    08/03/2008 16:01:08 ANE4991I (Session: 36733, Node: S-HDQFOQA01_SQL) TDP
    MSSQL Win32 ACO3000 Data Protection for SQL: Starting
    full backup of database model from server S-HDQFOQA01.
    (SESSION: 36733)
    08/03/2008 16:01:08 ANE4993E (Session: 36726, Node: S-HDQEDW01_SQL) TDP MSSQL
    Win32 ACO3002 Data Protection for SQL: full backup of
    database EDWDimensions from server S-HDQEDW01 failed, rc
    = 418. (SESSION: 36726)
    08/03/2008 16:01:08 ANE4991I (Session: 36726, Node: S-HDQEDW01_SQL) TDP MSSQL
    Win32 ACO3000 Data Protection for SQL: Starting full
    backup of database ERP from server S-HDQEDW01. (SESSION:
    36726)
    08/03/2008 16:01:08 ANE4991I (Session: 36722, Node: S-HDQSQL04_SQL) TDP MSSQL
    ACO3000 Data Protection for SQL: Starting full backup of
    database cgoaud from server S-HDQSQL04. (SESSION: 36722)
    08/03/2008 16:02:07 ANR0538I A resource waiter has been aborted.
    08/03/2008 16:02:07 ANR1029W Migration process 487 terminated for storage pool
    EXCHDISK - lock conflict. (SESSION: 37286, PROCESS: 487)
    08/03/2008 16:02:07 ANR0985I Process 487 for MIGRATION running in the
    FOREGROUND completed with completion state FAILURE at
    16:02:07. (SESSION: 37286, PROCESS: 487)
    08/03/2008 16:02:07 ANR2753I (MIGRATE_EXCHDISK):ANR1029W Migration process 487
    (SESSION: 37286)
    08/03/2008 16:02:07 ANR2753I (MIGRATE_EXCHDISK):terminated for storage pool
    EXCHDISK - (SESSION: 37286)
    08/03/2008 16:02:07 ANR2753I (MIGRATE_EXCHDISK):lock conflict. (SESSION:
    37286)
    08/03/2008 16:02:07 ANR2753I (MIGRATE_EXCHDISK):ANR0985I Process 487 for
    MIGRATION (SESSION: 37286)
    08/03/2008 16:02:07 ANR2753I (MIGRATE_EXCHDISK):running in the FOREGROUND
    completed (SESSION: 37286)
    08/03/2008 16:02:07 ANR2753I (MIGRATE_EXCHDISK):with completion state FAILURE
    at (SESSION: 37286)
    08/03/2008 16:02:07 ANR2753I (MIGRATE_EXCHDISK):16:02:07. (SESSION: 37286)
    08/03/2008 16:02:07 ANR1002I Migration for storage pool EXCHDISK will be
    retried in 60 seconds. (SESSION: 37286)
    08/03/2008 16:02:07 ANR2753I (MIGRATE_EXCHDISK):ANR1002I Migration for storage
    pool (SESSION: 37286)
    08/03/2008 16:02:07 ANR2753I (MIGRATE_EXCHDISK):EXCHDISK will be retried in 60
    seconds. (SESSION: 37286)
    08/03/2008 16:03:07 ANR1003I Migration retry delay ended; checking migration
    status for storage pool EXCHDISK. (SESSION: 37286)
    08/03/2008 16:03:07 ANR0984I Process 488 for MIGRATION started in the
    FOREGROUND at 16:03:07. (SESSION: 37286, PROCESS: 488)
    08/03/2008 16:03:07 ANR2110I MIGRATE STGPOOL started as process 488. (SESSION:
    37286, PROCESS: 488)
    08/03/2008 16:03:07 ANR1000I Migration process 488 started for storage pool
    EXCHDISK manually, highMig=70, lowMig=0, duration=300.
    (SESSION: 37286, PROCESS: 488)
    08/03/2008 16:03:07 ANR2753I (MIGRATE_EXCHDISK):ANR1003I Migration retry delay
    ended; (SESSION: 37286)
    08/03/2008 16:03:07 ANR2753I (MIGRATE_EXCHDISK):checking migration status for
    storage (SESSION: 37286)
    08/03/2008 16:03:07 ANR2753I (MIGRATE_EXCHDISK):pool EXCHDISK. (SESSION:
    37286)
    08/03/2008 16:03:07 ANR2753I (MIGRATE_EXCHDISK):ANR0984I Process 488 for
    MIGRATION (SESSION: 37286)
    08/03/2008 16:03:07 ANR2753I (MIGRATE_EXCHDISK):started in the FOREGROUND at
    16:03:07. (SESSION: 37286)
    08/03/2008 16:03:07 ANR2753I (MIGRATE_EXCHDISK):ANR2110I MIGRATE STGPOOL
    started as (SESSION: 37286)
    08/03/2008 16:03:07 ANR2753I (MIGRATE_EXCHDISK):process 488. (SESSION: 37286)
    08/03/2008 16:03:07 ANR2753I (MIGRATE_EXCHDISK):ANR1000I Migration process 488
    started (SESSION: 37286)
    08/03/2008 16:03:07 ANR2753I (MIGRATE_EXCHDISK):for storage pool EXCHDISK
    manually, (SESSION: 37286)
    08/03/2008 16:03:07 ANR2753I (MIGRATE_EXCHDISK):highMig=70, lowMig=0,
    duration=300. (SESSION: 37286)
    08/03/2008 16:04:07 ANR0538I A resource waiter has been aborted.
    08/03/2008 16:04:07 ANR0538I A resource waiter has been aborted.
    08/03/2008 16:04:07 ANR0444W Protocol error on session 37276 for node
    S-HDQEXGTY01 (WinNT) - out-of-sequence verb (type Data)
    received. (SESSION: 37276)
    08/03/2008 16:04:07 ANR0484W Session 37276 for node S-HDQEXGTY01 (WinNT)
    terminated - protocol violation detected. (SESSION:
    37276)
    08/03/2008 16:04:22 ANR0406I Session 37413 started for node S-HDQEXGTY01
    (WinNT) (Tcp/Ip s-hdqexgty01.aawh.atlasair.com(14867)).
    (SESSION: 37413)
    08/03/2008 16:05:07 ANR0538I A resource waiter has been aborted.
    08/03/2008 16:05:07 ANE4021E (Session: 36550, Node: S-HDQCGOSPT2) Error
    processing '\\s-hdqcgospt2\c$\adsm.sys\W2KReg\*': file
    system not ready (SESSION: 36550)
    08/03/2008 16:05:56 ANR0482W Session 36616 for node S-HDQAIMSWEB06 (WinNT)
    terminated - idle for more than 300 minutes. (SESSION:
    36616)
    08/03/2008 16:07:07 ANR0538I A resource waiter has been aborted.
    08/03/2008 16:07:07 ANE4021E (Session: 36612, Node: SHDQCOG01) Error
    processing '\\s-hdqcog01\c$\ADSM.SYS\WMI\WMIDBFILE': file
    system not ready (SESSION: 36612)
    08/03/2008 16:07:07 ANR2121W ATTENTION: More than 1088 MB of the database has
    changed and the last database backup was more than 24
    hours ago. Use the BACKUP DB command to provide for
    database recovery.
    08/03/2008 16:13:07 ANR0538I A resource waiter has been aborted.
    08/03/2008 16:13:53 ANR0406I Session 37414 started for node S-HDQBAKUP01
    (WinNT) (Tcp/Ip S-HDQBAKUP01(2370)). (SESSION: 37414)
    08/03/2008 16:13:53 ANR0403I Session 37414 ended for node S-HDQBAKUP01
    (WinNT). (SESSION: 37414)
    08/03/2008 16:39:31 ANR0406I Session 37415 started for node S-HDQREPL01
    (WinNT) (Tcp/Ip s-hdqrepl01.aawh.atlasair.com(4798)).
    (SESSION: 37415)
    08/03/2008 17:00:01 ANR2561I Schedule prompter contacting S-HDQPS01 (session
     
  4. moon-buddy

    moon-buddy Moderator

    Joined:
    Aug 24, 2005
    Messages:
    6,181
    Likes Received:
    277
    Occupation:
    Electronics Engineer, Security Professional
    Location:
    Somewhere in the US
    Has this happened before? The easy fix: reboot your TSM Server.
     
  5. theconqueror

    theconqueror New Member

    Joined:
    Jul 10, 2008
    Messages:
    36
    Likes Received:
    0
    It hasn't happened before, to my knowledge. I rebooted the server this morning after I saw what happened. I'll see in the morning whether it fixed everything.
     
  6. n9hmg

    n9hmg Senior Member

    Joined:
    Dec 18, 2006
    Messages:
    600
    Likes Received:
    13
    Occupation:
    unix admin
    Location:
    northern front-range Colorado, USA
    It looks like you lost contact with your storage. A SAN problem? The array? An HBA?
    What OS?
    Any errors in the system logs?
    Now that you've rebooted, have you done any testing, or are you just going to wait and see if it fails tonight? It'd better be the former.
    So, do your DISK-class volumes vary on? Can you audit your FILE-class volumes, one on each filesystem? Did your disks even become available on reboot? If so, did the filesystems on them mount? I see that the server tried to migrate, so you probably need a migration; if the storage seems to be available, can you do migrations? You DON'T want to have to face your boss tomorrow with "It failed the night before last, and I rebooted, hoping it'd work last night, but that failed too. duuuuuuhhhh.....".
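The "did the filesystems mount?" question can be answered quickly at the OS level before trying anything inside TSM. The following is only a rough Python sketch: the function name and any paths passed to it are hypothetical, and it checks nothing more than that each directory exists and accepts a scratch file. It does not replace varying volumes on, auditing them, or running a test migration.

```python
import os
import tempfile


def check_storage_paths(paths):
    """Return {path: status}, where status is 'ok', 'missing' or 'not writable'."""
    report = {}
    for path in paths:
        # A filesystem that failed to mount usually shows up as a missing directory.
        if not os.path.isdir(path):
            report[path] = "missing"
            continue
        try:
            # Create and automatically remove a scratch file to prove writes work.
            with tempfile.NamedTemporaryFile(dir=path):
                pass
            report[path] = "ok"
        except OSError:
            report[path] = "not writable"
    return report


if __name__ == "__main__":
    # Hypothetical locations; substitute the directories that hold your volumes.
    demo_paths = [tempfile.gettempdir(), os.path.join(tempfile.gettempdir(), "no_such_dir_12345")]
    for path, status in check_storage_paths(demo_paths).items():
        print(f"{status:13} {path}")
```

Anything reported as missing or not writable points at the storage layer (SAN, array, HBA, mounts) rather than at TSM itself.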
     
  7. theconqueror

    theconqueror New Member

    Joined:
    Jul 10, 2008
    Messages:
    36
    Likes Received:
    0
    It's working just fine now. I fired off a couple of manual backups the other day to test it out, both before and after the reboot, and they worked. I have no idea what happened; I went through a bunch of logs and none of them seemed to show anything. I saved the activity log so I can continue to go through it, but it seems like it was just a fluke.
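For going through a saved activity log like the one posted in this thread, a small script can do the first pass. This is only a minimal Python sketch, not a TSM tool. It assumes the log was saved as plain text in the layout shown above: each record starts with a date, a time and a message ID, and wrapped lines belong to the previous record. The function names and the embedded sample are illustrative.

```python
import re
from collections import Counter

# A record starts with "MM/DD/YYYY HH:MM:SS <message id>", e.g. ANR0538I.
RECORD_START = re.compile(
    r"^(\d{2}/\d{2}/\d{4}) +(\d{2}:\d{2}:\d{2}) +([A-Z]{3}\d{4}[A-Z])\b *(.*)$"
)
# ANR0530W names the node whose transaction failed.
FAILED_NODE = re.compile(r"Transaction failed for session \d+ for node (\S+)")

# Three records copied from the log in this thread.
SAMPLE = """
    08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
    08/03/2008 16:01:07 ANR0538I A resource waiter has been aborted.
    08/03/2008 16:01:07 ANR0530W Transaction failed for session 36737 for node
    S-HDQEPOL02_SQL (TDP MSSQL Win32) - internal server error
    detected. (SESSION: 36737)
"""


def parse_records(text):
    """Join wrapped lines into (date, time, msg_id, message) tuples."""
    records = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = RECORD_START.match(line)
        if match:
            records.append(list(match.groups()))
        elif records:
            # Continuation line: append it to the message of the previous record.
            records[-1][3] += " " + line
    return [tuple(record) for record in records]


def count_ids(records):
    """Count how often each message ID occurs."""
    return Counter(msg_id for _date, _time, msg_id, _message in records)


def failed_nodes(records):
    """Return the sorted node names reported in ANR0530W records."""
    nodes = set()
    for _date, _time, msg_id, message in records:
        if msg_id == "ANR0530W":
            match = FAILED_NODE.search(message)
            if match:
                nodes.add(match.group(1))
    return sorted(nodes)


if __name__ == "__main__":
    parsed = parse_records(SAMPLE)
    for msg_id, count in count_ids(parsed).most_common():
        print(msg_id, count)
    print("nodes with failed transactions:", ", ".join(failed_nodes(parsed)))
```

To run it against the full log, read the saved file into a string and pass that to `parse_records` in place of `SAMPLE`. Seeing which message IDs dominate, and which nodes failed in the same second, makes it easier to tell a single server-side event from many unrelated client problems.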
     
