test restore from copy pool fails... but why?

foobar2devnull

Hi all,

I tried to do a test restore that simulated the failure of the primary pool and hoped it would switch to a copy pool.

The following is my current setup.

Code:
nodes --> PrimaryDisk OS (PD01) +-> Primary File (PF01)
                                |
                                +-> Copy Tape (CT01)
                                |
                                +-> Copy Tape (CT02) off-site

I did the following steps:
* back up the node
* back up PD01 to CT02 and CT01
* migrate PD01 to PF01

My test:
* Restoring a file through PF01 was successful:
Code:
Restoring       2,155,191 /home/jdoe/sample.tar.gz --> /tmp/sample.tar.gz [Done]

I then deleted all PF01 volumes and ran the restore again.

I got the following:
Code:
ANS1302E No objects on server match query

I thought it would automatically search for the file on a copy pool when not found in the primary pool. I suspect I did something wrong but what?

Can you advise?

Thanks!
 
Sorry for the late reply, we're in different time zones ;)

moon-buddy, the answer to your question is "deleted". I ran a "delete vol" on all the volumes in PF01.

chad_small, both copy pools are onsite. I have not set up DRM yet.

>select STGPOOL_NAME,POOLTYPE,ACCESS from stgpools where STGPOOL_NAME like 'CT0%'

STGPOOL_NAME     POOLTYPE     ACCESS
------------     --------     ---------
CT01             COPY         READWRITE
CT02             COPY         READWRITE


Thank you both for your help. I'll keep looking.
 
Yep, as Jeroen said: deleting the volumes with 'discarddata=yes' deletes every file stored on them from the server database, including the copies held in the copy pools. The proper way to simulate a disaster is to mark the volumes of the onsite primary pool as destroyed rather than delete them.

I hope you took a DB backup before doing all of this. You will need to restore that DB backup to get all of your data back.
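
As a rough sketch of the non-destructive test, assuming the primary pool is PF01 as in the first post: the small Python helper below only builds the admin command strings, it does not run anything. The helper name is hypothetical. `UPDATE VOLUME` with `ACCESS=DESTROYED` applies to primary storage pool volumes, and `WHERESTGPOOL` / `WHEREACCESS` are its standard filters, but check the syntax against your own server level before pasting.

```python
def simulate_primary_loss(primary_pool: str) -> dict[str, list[str]]:
    """Build TSM admin commands for a non-destructive 'lost primary pool' test.

    Hypothetical helper: it only builds command strings, it does not run them.
    """
    before = [
        # Flag every volume in the primary pool as destroyed. No data is
        # deleted; the server just stops reading from these volumes.
        f"update volume * access=destroyed wherestgpool={primary_pool}",
    ]
    after = [
        # Undo the simulation once the client restore has been checked.
        # Assumption: the volumes were READWRITE before the test.
        f"update volume * access=readwrite wherestgpool={primary_pool} "
        "whereaccess=destroyed",
    ]
    return {"before_restore": before, "after_restore": after}


if __name__ == "__main__":
    plan = simulate_primary_loss("PF01")
    for phase, commands in plan.items():
        print(f"# {phase}")
        for command in commands:
            print(command)
```

Run the client restore between the two phases. With the primary volumes flagged as destroyed and the copy pool volumes onsite and accessible, the server should serve the file from CT01 or CT02.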
 
Thank you all for your help. The test was made on a new server with my data alone so no loss there but rather a good lesson learned! :)
 
I have a similar issue, except it's a corrupted volume that I am trying to REMOVE from the stgpool it's in. I don't want to delete the data registered on the volume anywhere else, so how do I just get rid of the volume from the storage pool? I have already removed the volume from the library, and yet it remains listed in the stgpool.
Ideas?
 
I marked it as destroyed, recovered the data from the recovery volumes, checked out the volume from the library and ejected the tape. But now it is still showing in the stgpool. What now?
 
> I marked it as destroyed, recovered the data from the recovery volumes, checked out the volume from the library and ejected the tape. But now it is still showing in the stgpool. What now?

I'm no veteran and I'm not in front of my console, but doesn't the CHECKOUT command put the tape into the I/O station and remove the entry from the library list? What command do you run to see that it is still listed?

Code:
Q LIBVOL <tape>

Maybe an audit of the library followed by one of TSM might be in order. Anyone else have advice on this?
 
I want to remove the entry from the STGPOOL. It's already removed from the Library:
tsm: WP1INST1>q libvol ibm3494-02 A00870 f=d
ANR2034E QUERY LIBVOLUME: No match found using this criteria.
ANS8001I Return code 11.
But it still shows the volume as being IN the primary storage pool I am trying to clear:
tsm: WP1INST1>q vol stgpool=ibmpool
Volume Name               Storage      Device      Estimated    Pct   Volume
                          Pool Name    Class Name   Capacity   Util   Status
------------------------  -----------  ----------  ---------  -----  --------
A00870                    IBMPOOL      IBM3590        95.2 G   64.3   Full
But the volume shows Destroyed:

tsm: WP1INST1>q vol A00870 f=d
Volume Name: A00870
Storage Pool Name: IBMPOOL
Device Class Name: IBM3590
Estimated Capacity: 95.2 G
Scaled Capacity Applied:
Pct Util: 64.3
Volume Status: Full
Access: Destroyed
Pct. Reclaimable Space: 35.7
Scratch Volume?: Yes
In Error State?: No
Number of Writable Sides: 1
Number of Times Mounted: 203
Write Pass Number: 1
Approx. Date Last Written: 09/22/06 22:12:29
Approx. Date Last Read: 03/11/13 17:55:24
Date Became Pending:
Number of Write Errors: 0
Number of Read Errors: 1
Volume Location:
Volume is MVS Lanfree Capable : No
Last Update by (administrator): ADMIN
Last Update Date/Time: 03/18/13 09:46:03
Begin Reclaim Period:
End Reclaim Period:
Drive Encryption Key Manager:
Logical Block Protected: No

I cannot seem to get the entry for the tape out, even though it is physically removed from the 3494, because:

tsm: WP1INST1>DEL VOL A00870
ANR2220W This command will delete volume A00870 from its storage pool after verifying that the volume contains no data.
Do you wish to proceed? (Yes (Y)/No (N)) y
ANR2406E DELETE VOLUME: Volume A00870 still contains data.
ANS8001I Return code 13.
Seems to be a loxadrome!
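
The two outputs above say the same thing from different angles: the volume is flagged Destroyed but still reports 64.3% utilisation, and DELETE VOLUME refuses (ANR2406E) for as long as the server believes data is on it. A minimal Python sketch for spotting that state in a saved `q vol ... f=d` output follows. The function names and the trimmed sample are illustrative, not part of any TSM tooling, and the parser assumes the flattened `Field: value` layout shown in this thread.

```python
def parse_volume_detail(output: str) -> dict[str, str]:
    """Turn 'Field: value' lines from 'q vol <name> f=d' into a dict."""
    fields = {}
    for line in output.splitlines():
        # Split on the first colon only, so timestamps in the value survive.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def restore_incomplete(fields: dict[str, str]) -> bool:
    """True when the volume is marked destroyed but still holds data."""
    return (
        fields.get("Access", "").lower() == "destroyed"
        and float(fields.get("Pct Util", "0") or 0) > 0
    )


# Trimmed copy of the detail output posted above.
SAMPLE = """\
Volume Name: A00870
Storage Pool Name: IBMPOOL
Pct Util: 64.3
Volume Status: Full
Access: Destroyed
Approx. Date Last Read: 03/11/13 17:55:24
"""

if __name__ == "__main__":
    info = parse_volume_detail(SAMPLE)
    print(info["Access"], info["Pct Util"], restore_incomplete(info))
```

If this check comes back true after a RESTORE VOLUME run, the restore most likely did not move everything off the volume, which would explain why the delete is refused.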
 
I believe that the order in which you do your steps is important.

Did you mark the tape as destroyed before or after checking it out?

Did your 'restore volume' command work? It complains there is still data on the tape so I wonder...

Maybe you should check it back in, see if a 'q libvol' works, and then check it out again before attempting another 'del vol'.
 
Ok, so my first question would be, do you have a copy pool?

1) If not, Step '4' onwards are of no use to you and you should probably put the tape back into the library, mark it 'readonly' and try and move the data to another tape (Steps 1-3).

2) If you do have a copy pool, it looks like the 'restore volume' command might have failed. If unable to rebuild the volume, you won't be able to delete it as it still holds valid data. Try steps 4 onwards or follow the IBM document I added to my initial post.

Best of luck.
 
> Ok, so my first question would be, do you have a copy pool?

Yes.

> 1) If not, Step '4' onwards are of no use to you and you should probably put the tape back into the library, mark it 'readonly' and try and move the data to another tape (Steps 1-3).

The tape is corrupt anyway; I cannot move data from it.

> 2) If you do have a copy pool, it looks like the 'restore volume' command might have failed. If unable to rebuild the volume, you won't be able to delete it as it still holds valid data. Try steps 4 onwards or follow the IBM document I added to my initial post.

This is apparently what TSM software support is trying to establish. I am re-running the "Restore Vol" tasks and will report the results. Initially it appears TSM is not tracking data location correctly. Hopefully this is NOT the case.
I have moved this data successfully twice now, and yet Query Content still "shows" data for this destroyed 3590 volume.

> Best of luck.

Thanks. :)

-Patrick
 
For what it's worth:

I had to recover from a defective tape just yesterday, so I thought the output of a successful volume restore might help you identify where it went wrong for you.

Code:
04/08/2013 11:51:01      ANR2017I Administrator JDOE issued command: RESTORE                          
                          VOLUME AA0049L5 MAXPROCESS=2  (SESSION: 145076)
04/08/2013 11:51:12      ANR2114I RESTORE VOLUME: Access mode for volume AA0049L5
                          updated to "destroyed". (SESSION: 145076)
04/08/2013 11:51:12      ANR0984I Process 324 for RESTORE VOLUME started in the
                          BACKGROUND at 11:51:12 AM. (SESSION: 145076, PROCESS:
                          324)
04/08/2013 11:51:12      ANR2110I RESTORE VOLUME started as process 324. (SESSION:
                          145076, PROCESS: 324)
04/08/2013 11:51:12      ANR0984I Process 325 for RESTORE VOLUME started in the
                          BACKGROUND at 11:51:12 AM. (SESSION: 145076, PROCESS:
                          325)
04/08/2013 11:51:12      ANR2110I RESTORE VOLUME started as process 325. (SESSION:
                          145076, PROCESS: 325)


[...]


04/08/2013 16:59:04      ANR0986I Process 325 for RESTORE VOLUME running in the
                          BACKGROUND processed 1,126,587 items for a total of
                          1,609,635,922,941 bytes with a completion state of
                          SUCCESS at 04:59:04 PM. (SESSION: 145076, PROCESS: 325)
04/08/2013 17:46:13      ANR0986I Process 324 for RESTORE VOLUME running in the
                          BACKGROUND processed 3,806,153 items for a total of
                          1,285,128,362,038 bytes with a completion state of
                          SUCCESS at 05:46:13 PM. (SESSION: 145076, PROCESS: 324)
04/08/2013 17:46:13      ANR1240I Restore of volumes in primary storage pool PT01
                          has ended.  Files Restored: 4932740, Bytes Restored:
                          2894764284979, Unreadable Files: 0, Unreadable Bytes: 0.
                          (SESSION: 145076)
04/08/2013 17:46:13      ANR2208I Volume AA0049L5 deleted from storage pool PT01.
                          (SESSION: 145076)
04/08/2013 17:46:13      ANR1341I Scratch volume AA0049L5 has been deleted from
                          storage pool PT01. (SESSION: 145076)

Best of luck
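
The line worth checking in a log like the one above is ANR1240I: it carries the restored and unreadable counts, and in this run it is followed by ANR2208I, where the server deletes the volume on its own. A small Python sketch for pulling those counts out of a wrapped activity log is below. The function name and dictionary keys are made up for illustration, and the regular expression assumes the English message wording shown in this thread.

```python
import re
from typing import Optional

# Matches the ANR1240I summary once the wrapped log lines are joined.
SUMMARY = re.compile(
    r"ANR1240I .*?Files Restored: (\d+), Bytes Restored: (\d+), "
    r"Unreadable Files: (\d+), Unreadable Bytes: (\d+)"
)


def restore_summary(log_text: str) -> Optional[dict[str, int]]:
    """Extract the ANR1240I counters, or None if the message is absent."""
    # Activity log output wraps long messages; collapse all whitespace first.
    flat = " ".join(log_text.split())
    match = SUMMARY.search(flat)
    if match is None:
        return None
    files, nbytes, bad_files, bad_bytes = map(int, match.groups())
    return {
        "files_restored": files,
        "bytes_restored": nbytes,
        "unreadable_files": bad_files,
        "unreadable_bytes": bad_bytes,
    }


# The ANR1240I message copied from the log above, wrapping included.
SAMPLE_LOG = """\
04/08/2013 17:46:13      ANR1240I Restore of volumes in primary storage pool PT01
                          has ended.  Files Restored: 4932740, Bytes Restored:
                          2894764284979, Unreadable Files: 0, Unreadable Bytes: 0.
                          (SESSION: 145076)
"""

if __name__ == "__main__":
    print(restore_summary(SAMPLE_LOG))
```

A non-zero unreadable count, or no ANR1240I at all, would be the first thing to compare against a run that leaves the volume behind.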
 
It's ANOTHER IT Miracle!

Well now! That worked right this time! :)
04/09/13 19:36:44 ANR0515I Process 705 closed volume A00514. (SESSION: 7474,
PROCESS: 705)
04/09/13 19:36:44 ANR0515I Process 705 closed volume A11395. (SESSION: 7474,
PROCESS: 705)
04/09/13 19:36:44 ANR1235I Restore process 705 ended for volumes in storage
pool IBMPOOL. (SESSION: 7474, PROCESS: 705)
04/09/13 19:36:44 ANR0986I Process 705 for RESTORE VOLUME running in the
BACKGROUND processed 9,529 items for a total of
114,458,786,761 bytes with a completion state of SUCCESS
at 19:36:44. (SESSION: 7474, PROCESS: 705)
04/09/13 19:36:44 ANR1240I Restore of volumes in primary storage pool
IBMPOOL has ended. Files Restored: 9529, Bytes Restored:
114458786761, Unreadable Files: 0, Unreadable Bytes: 0.
(SESSION: 7474)
04/09/13 19:36:44 ANR2208I Volume A00870 deleted from storage pool IBMPOOL.
(SESSION: 7474)
04/09/13 19:36:44 ANR1341I Scratch volume A00870 has been deleted from
storage pool IBMPOOL. (SESSION: 7474)
04/09/13 19:37:44 ANR8325I Dismounting volume A11395 - 1 minute mount
retention expired.
04/09/13 19:37:44 ANR8325I Dismounting volume A00514 - 1 minute mount
retention expired.
04/09/13 19:37:46 ANR8336I Verifying label of 3590 volume A11395 in drive
IBM3590H04 (/dev/rmt7). (SESSION: 7474, PROCESS: 705)

I had complained to a senior TSM engineer and suddenly it worked this time (while the last two attempts failed).
It's an IT MIRACLE! (Yeah, right.)
Well, I guess we are done. Thanks!
 