
ANR9999D running RESTORE STORAGE POOL

Discussion in 'Restore / Recovery Discussion' started by Fruitysoup, Oct 24, 2012.

  1. Fruitysoup

    Fruitysoup New Member

    Joined:
    Sep 15, 2009
    Messages:
    5
    Likes Received:
    0
    Hi All

    I've got an unsettling feeling about this. When I ran a RESTORE STGPOOL preview against our main backup pool, I got the following (full error in the attached file FullError.txt):

    10/24/2012 11:05:35 ANR0984I Process 297 for RESTORE STORAGE POOL (PREVIEW)
    started in the FOREGROUND at 11:05:35 AM. (SESSION: 5835,
    PROCESS: 297)
    10/24/2012 11:05:35 ANR1231I Restore preview of primary storage pool FILE_POOL
    started as process 297. (SESSION: 5835, PROCESS: 297)
    10/24/2012 11:05:35 ANR9999D_3853744942 HandlePrimaryFile(afrestor.c:2045)
    Thread<54656>: Error 145 getting volume name for volId
    57681. (SESSION: 5835)

    Some background : About a month ago I ran out of server space because of a Domino node that wasn't expiring transaction logs correctly. I deleted a couple of filespaces from clients that weren't being backed up any more in order to free some space in the FILE_POOL, that was fine, and the server started backing up again. I eventually resolved the transaction log expiry issue and gained around 700GB of storage space, so the system seemed quite happy.
    As a matter of course, I regularly back up the primary pool to tape using BACKUP STGPOOL. After running out of storage space I noticed the backup was failing, so I ran an AUDIT VOL FIX=yes against the FILE_POOL, hoping to repair the erroneous volumes that were causing the backup to fail. However, when I then tried a RESTORE STGPOOL preview, it 'Succeeded' without listing the tapes required:


    tsm: SERVER1>restore stgpool file_pool prev=yes wait=yes
    ANR0984I Process 297 for RESTORE STORAGE POOL (PREVIEW) started in the FOREGROUND at 11:05:35 AM.
    ANR1231I Restore preview of primary storage pool FILE_POOL started as process 297.
    ANR1234I Restore process 297 ended for storage pool FILE_POOL.
    ANR0985I Process 297 for RESTORE STORAGE POOL (PREVIEW) running in the FOREGROUND completed with completion state SUCCESS at 11:05:49 AM.
    ANR1239I Restore preview of primary storage pool FILE_POOL has ended. Files Restored: 0, Bytes Restored: 0.
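A preview that ends with SUCCESS while reporting zero files is easy to misread. As a minimal sketch, assuming the console output above has been saved as plain text, the following Python pulls the completion state and the counts out of the ANR0985I and ANR1239I messages. The function name `summarize_preview` is made up for illustration and is not part of TSM.

```python
import re

def summarize_preview(output: str) -> dict:
    """Pull completion state and restore counts out of RESTORE STGPOOL console output."""
    # Collapse all whitespace so a message wrapped across lines still matches.
    text = " ".join(output.split())
    state = re.search(r"ANR0985I .*?completion state (\w+)", text)
    counts = re.search(
        r"ANR1239I .*?Files Restored: (\d[\d,]*), Bytes Restored: (\d[\d,]*)", text
    )

    def to_int(value: str) -> int:
        # Assumption: large counts may be printed with thousands separators.
        return int(value.replace(",", ""))

    return {
        "state": state.group(1) if state else None,
        "files": to_int(counts.group(1)) if counts else None,
        "bytes": to_int(counts.group(2)) if counts else None,
    }
```

A result of `state == "SUCCESS"` together with `files == 0` is the combination seen here: the process completed, but the preview found nothing it could restore.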

    We run dedup on this pool, and calling

    db2 "select volname from tsmdb1.ss_volume_ids where volid in ( select volid from tsmdb1.af_segments where srvid=0 and poolid=6 and bfid in ( select superbfid from tsmdb1.bf_aggregated_bitfiles where srvid=0 and offset=0 and length=0 ) )" |sort |cut -b1-30 | uniq

    gave a list of 166 volumes with damage.
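The `sort | cut -b1-30 | uniq` tail of that pipeline can be sketched in Python if the db2 output is captured to a file first. This is only an illustration: `unique_volumes` is a hypothetical helper, and it assumes the volume names are absolute paths (as FILE volumes usually are), so that the column header, the dashed rule and the "record(s) selected" footer can be skipped.

```python
def unique_volumes(db2_output: str, width: int = 30) -> list[str]:
    """Mimic `sort | cut -b1-30 | uniq` on db2 CLI output, skipping non-data lines."""
    names = set()
    for line in db2_output.splitlines():
        # Same truncation as `cut -b1-30` (for ASCII names), minus padding.
        name = line[:width].strip()
        # Assumption: real volume names start with "/"; anything else is
        # header, separator or footer text printed by the db2 CLI.
        if name.startswith("/"):
            names.add(name)
    return sorted(names)
```

As with the original `cut -b1-30`, volume names longer than 30 characters are truncated and may collapse into one entry, so `width` should be raised if the paths are long.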

    Now, my question is: is the pool totally knackered? Do I delete all volumes in FILE_POOL and start again, or is there something I can do to perhaps repair it? Any suggestions appreciated.



    FYI :
    Session established with server SERVER1: Linux/x86_64
    Server Version 6, Release 2, Level 2.0

    View attachment file_pool.txt
     
  3. chad_small

    chad_small Moderator

    Joined:
    Dec 17, 2002
    Messages:
    2,197
    Likes Received:
    43
    Occupation:
    AIX/SAN/TSM
    Location:
    Gilbert, AZ
    So what was the result of the AUDIT VOL FIX=YES? Did it say it fixed the volumes? Is the file pool a disk storage pool? If so, clean the volumes out, delete them, and then recreate them; that should fix it. If they belong to a FILE devclass, use MOVE DATA and then delete the FILE volumes. I've had some similar issues in days past, and deleting the disk pool volumes and recreating them fixed my problem.
     
  4. Fruitysoup

    Fruitysoup New Member

    Joined:
    Sep 15, 2009
    Messages:
    5
    Likes Received:
    0
    Hi Chad

    Unfortunately it didn't do a lot of good. I ran a series of 'move data' and 'del vol' commands to get the good files out of the file pool and deleted the resulting bad volumes. To get around the "Error 145", I did some scary database modifications that allowed me to get at the remaining volumes and delete them. The process involved running:

    restore stgpool file_pool prev=yes wait=yes

    and getting the volid from the error, then in a db2 session:


    insert into SS_VOLUME_IDS ( volid,volname, poolid, strategy, update_date ) values( XXXXX, '/tmp/voltest' , 6, 30, SYSDATE )

    where XXXXX is the volume id it was complaining about, and '/tmp/voltest' didn't have to exist. I was then able to run

    del vol /tmp/voltest discard=yes

    which cleared out the database of all the stale records pertaining to the damaged vol, although it didn't remove the volume itself and so I had to do that manually in a db2 session with :

    delete from SS_VOLUME_IDS where poolid=6 and volname= '/tmp/voltest' and volid=XXXXX
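The steps above can be sketched as a small Python helper that reads the volid out of the ANR9999D message and prints the three statements for review. It only builds strings and runs nothing. The function names are invented for illustration, and the pool id 6, strategy 30 and '/tmp/voltest' are simply the values quoted in this post, so they would need checking on any other server.

```python
import re

def stale_volids(actlog: str) -> list[int]:
    """Return the volume ids named in 'Error 145 getting volume name' messages."""
    # The activity log wraps long messages, so collapse whitespace first.
    text = " ".join(actlog.split())
    ids = re.findall(r"Error 145 getting volume name for volId (\d+)", text)
    return sorted({int(i) for i in ids})

def render_steps(volid: int, poolid: int = 6,
                 placeholder: str = "/tmp/voltest") -> list[str]:
    """Build the insert / del vol / delete sequence described in the post."""
    return [
        # db2 session: register a placeholder volume under the stale id.
        "insert into SS_VOLUME_IDS ( volid, volname, poolid, strategy, update_date ) "
        f"values ( {volid}, '{placeholder}', {poolid}, 30, SYSDATE )",
        # TSM admin console: discard whatever still references that volume.
        f"del vol {placeholder} discard=yes",
        # db2 session: remove the placeholder row again.
        f"delete from SS_VOLUME_IDS where poolid={poolid} "
        f"and volname='{placeholder}' and volid={volid}",
    ]
```

For example, `for v in stale_volids(log_text): print("\n".join(render_steps(v)))` lists the statements per stale volume so they can be read through before anything is pasted into a db2 or admin session.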

    It was all a bit hairy, but I managed to get to a position where the restore stgpool would not crash out. This morning I tried to do a backup stgpool after all the changes and I'm still getting volumes with damaged dedup bitfiles and ANR4895E. I'll try a couple more things, but if it's still not good I'll do as you said and delete all the data in the pool and start from scratch.

    thanks for the help
     
