Questions on move data versus restore volume?

ldmwndletsm

ADSM.ORG Senior Member
Joined
Oct 30, 2019
I'm very confused here about which command is used when and/or if there are occasions where it's recommended to run both.

[ Question 1 of 4 ]
If you have a primary pool tape that you're a little nervous about (maybe you've seen some errors), and "query content copied=no" reports nothing, and you have a copy pool, then what is the preferred approach?

If you run audit volume fix=no, and no discrepancies are reported, then should you simply:

1. Run a "move data" to copy the data to another tape, and keep running it until "query content" on the original tape reports nothing?

Note: we collocate primary pool tapes by groups of nodes. We do not collocate copy pool tapes. I thought I'd read that when moving data where collocation is used, it only moves data for that node (or group?), so the move data might require multiple runs?

OR

2. Run a "restore volume" with preview=yes? Then load all the off-site copy pool tapes required, set read-only, and then rerun "restore volume" without preview option?

[ Question 2 of 4 ]
Are there cases wherein you should do both a "move data" and a "restore volume"?

Looks like someone did that a while back on a tape wherein they did this:

update volume ABC123L6 access=readonly
move data ABC123L6
update volume ABC123L6 access=destroyed
restore volume ABC123L6

But I don't know what, if any, errors were reported on the original tape or if an audit volume was run. Regardless, if there's a copy in a copy pool then this seems like it would potentially result in 3 copies of the data: 1. the original data in the copy pool, 2. the "new" tape that the data gets copied to with the "move data" command, and 3. the "new" tape that gets created from the "restore volume".

[ Question 3 of 4 ]
Do I have that right about that resulting in 3 copies?

But then again, maybe running a "move data" is a precaution so in a worst case scenario, you now at least have a "new" copy with whatever you can salvage from the original "suspect" tape before you move on to the "restore volume", and that way if that goes awry, at least you still have something? I'm thinking, though, that you probably would not use the "move data" unless the audit reported inconsistencies in which case you'd run it again with "fix=yes" and then move the data? Completely confused or probably wrong here.

[ Question 4 of 4 ]
Does the "restore volume" use the same tape that the "move data" used, or does each of those use a new "scratch" tape?
 
I could be wrong, but I think you are confusing the purpose of the 2 commands. I can't think of a scenario where you would pick one over the other. So your questions are kind of confusing the various use cases for these commands. So I'm going to explain some stuff (apologies if you already know this), and maybe you can ask different questions after if you still have some.

MOVE DATA is used to move data to different volumes. It can only move good files, not damaged files. If all files are moved successfully, the original volume returns to scratch; if not, it remains there with just the damaged files left on it.
RESTORE VOLUME is used to recover damaged data on a volume by restoring it from copy pool volumes. The original volume returns to scratch after the restore, unless it was physically lost.

Which volume the data is moved or restored to depends on collocation or not:
Selecting volumes with collocation enabled
Selecting volumes with collocation disabled


Now, in the case of damaged data on a volume, you would do 3 things in the following order:
  1. audit the volume in question
  2. do a move data to move all the good data on that volume to other volumes in the same pool. That way, when you do the restore, there will only be the damaged data to restore, so you'll need to bring back fewer copy pool tapes and the restore will be quicker
  3. restore the volume with damaged data: https://www.ibm.com/docs/en/tsmhfw/...m.ibm.itsm.srv.doc/t_recover_stgpool_vol.html
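
The three steps above can be sketched as a small Python helper. It only builds the administrative command strings in order and does not talk to a server. The function name is made up for illustration, the volume name is the example one from earlier in the thread, and running the restore with preview=yes first comes from the original question rather than being a required step.

```python
def recovery_commands(volume, fix="no", preview_first=True):
    """Return the admin commands for the audit / move / restore sequence, in order.

    Illustrative only: this builds strings and runs nothing.
    """
    if fix not in ("no", "yes"):
        raise ValueError("fix must be 'no' or 'yes'")
    commands = [
        # Step 1: find out which files on the volume are damaged.
        f"audit volume {volume} fix={fix}",
        # Step 2: move the readable files to other volumes in the same pool.
        f"move data {volume}",
    ]
    if preview_first:
        # Optional: list the copy pool volumes needed before recalling them.
        commands.append(f"restore volume {volume} preview=yes")
    # Step 3: restore whatever is still on the volume from the copy pool.
    commands.append(f"restore volume {volume}")
    return commands


if __name__ == "__main__":
    for command in recovery_commands("ABC123L6"):
        print(command)
```

If the move in step 2 empties the volume, it returns to scratch and the restore steps are not needed.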
 
Okay, thank you for that explanation. That does help to iron things out. I hadn't thought about the benefit to step 2 being fewer off-site tapes to return. Makes sense. But "mechanically", if you left out step 2, it could still work, but you'd just have to bring more tapes back, and read through more, and it would take longer. That right?

And I think where I was getting confused, in regards to "3 copies", is that I was thinking that the "move" would move all the data (copy 1), and then the "restore" would bring it all back (copy 2), and then there's the already existing copy on the copy pool (copy 3). BUT it sounds like the "move" is only going to move the non-damaged data, and the restore is only going to restore the damaged data, so that will now be copy 1, and the copy pool is copy 2, so you will still have two copies when all is said and done. Do I have that correct?

o Should the audit be run with "fix=yes" ?

o Does it hurt to run "update volume volume_name access=destroyed" before step 3?

The document mentions that this is done automatically during the restore, but I've seen it mentioned in some documents as an optional step. Just curious if it really matters?
 
But "mechanically", if you left out step 2, it could still work, but you'd just have to bring more tapes back, and read through more, and it would take longer. That right?
Yes, the sole purpose of the MOVE DATA is to reduce the amount of data to restore.

is only going to move the non-damaged data, and the restore is only going to restore the damaged data,
Actually, when you do a restore volume, it restores everything that was on that volume onto other volume(s). So if you did a move data before, all that's left is damaged data.

so that will now be copy 1, and the copy pool is copy 2, so you will still have two copies when all is said and done. Do I have that correct?
If by copy1, you mean the primary copy, that's correct.
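
To make the copy counting concrete, here is a toy Python model. It is not anything TSM runs: files are just names in sets, and the file names are invented. It assumes, as described above, that MOVE DATA takes only the undamaged files and that RESTORE VOLUME then restores whatever is still on the original volume from the copy pool.

```python
def move_then_restore(files_on_volume, damaged, copy_pool):
    """Toy model: return (primary, copy_pool) contents after move + restore."""
    # MOVE DATA: only undamaged files leave the original volume.
    moved = {f for f in files_on_volume if f not in damaged}
    left_behind = set(files_on_volume) - moved
    # RESTORE VOLUME: files still on the original volume are rebuilt
    # from the copy pool, provided a copy exists there.
    restored = {f for f in left_behind if f in copy_pool}
    primary = moved | restored
    return primary, set(copy_pool)


if __name__ == "__main__":
    files = {"a.txt", "b.txt", "c.txt"}
    primary, copies = move_then_restore(files, damaged={"c.txt"}, copy_pool=files)
    for name in sorted(files):
        # Count where each file lives: once in primary, once in the copy pool.
        print(name, int(name in primary) + int(name in copies))
```

Every file ends up with one primary copy and one copy pool copy, so two in total, not three.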

o Should the audit be run with "fix=yes" ?
It depends. FIX=NO marks damaged objects as damaged, but keeps them. FIX=YES marks them damaged and keeps them if there's a copy in the copy pool to restore from, or deletes them if there's no copy. There's no point in keeping a damaged version if there's no copy to restore from. Now, there are cases where the data is not really damaged, but instead it's I/O errors that give the impression it's damaged. In those cases, FIX=NO is preferable. Personally, I'd only use FIX=YES when it's guaranteed that the data is damaged and there's no way to recover it.
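
The rule described above can be written out as a tiny Python function. This only restates the behavior as explained in this post for a damaged file on a primary pool volume. The function name and the return strings are invented for the example, and the server documentation remains the authority on what AUDIT VOLUME actually does.

```python
def audit_result(fix, has_copy_in_copy_pool):
    """What happens to a damaged file, per the description in this thread."""
    fix = fix.lower()
    if fix not in ("yes", "no"):
        raise ValueError("fix must be 'yes' or 'no'")
    if fix == "no":
        # FIX=NO never deletes: the file is flagged and left in place.
        return "marked damaged, kept"
    if has_copy_in_copy_pool:
        # FIX=YES with a copy available: keep it so it can be restored.
        return "marked damaged, kept"
    # FIX=YES with no copy: nothing to restore from, so it is removed.
    return "deleted"
```

The only combination that removes anything is FIX=YES with no copy in the copy pool.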


o Does it hurt to run "update volume volume_name access=destroyed" before step 3?
It doesn't matter. It's a bit redundant since the restore is doing it.
 
Actually, when you do a restore volume, it restores everything that was on that volume onto other volume(s). So if you did a move data before, all that's left is damaged data.
So if there were damaged files, and you didn't run the "move" first to salvage all the non-damaged files, then the restore would bring back everything, including the damaged files, assuming you had a copy? Or it would ONLY bring back damaged files?

If by copy1, you mean the primary copy, that's correct.
Yes, I wasn't so clear there, but that's what I meant.

Personally, I'd only use FIX=YES when it's guaranteed that the data is damaged and there's no way to recover it.
So if you ran an "audit with fix=no" first, and it reports no damaged files then a subsequent move would presumably copy all of it, but a restore would do nothing?

And if the "audit with fix=no" does report damaged files, and there is no copy, then would it be necessary to run it again with "fix=yes" in order to force TSM to re-backup any files that *still* exist on disk? Or would that still happen automatically if there is no copy, since they were marked damaged?

And if the "audit with fix=no" does report damaged files, and there is a copy then should the steps be:

1. Run audit with fix=yes
2. Move
3. Restore
 
So if there were damaged files, and you didn't run the "move" first to salvage all the non-damaged files, then the restore would bring back everything, including the damaged files, assuming you had a copy? Or it would ONLY bring back damaged files?
Everything. A restore volume identifies all the files on the volume to restore, retrieves them from the copy pool, and puts them on other volumes in the same storage pool (private and scratch). After the restore is completed, the original volume is deleted from the inventory.
So if you ran an "audit with fix=no" first, and it reports no damaged files then a subsequent move would presumably copy all of it, but a restore would do nothing?
Well, if you move everything successfully, the original volume returns to scratch. You can't restore a scratch tape, it's empty.
And if the "audit with fix=no" does report damaged files, and there is no copy, then would it be necessary to run it again with "fix=yes" in order to force TSM to re-backup any files that *still* exist on disk?
Yes, you would need to do fix=yes.
And if the "audit with fix=no" does report damaged files, and there is a copy then should the steps be:

1. Run audit with fix=yes
2. Move
3. Restore
Yes
 
Thanks much there, Marclant. That was very helpful and makes things much clearer now. :)

Well, if you move everything successfully, the original volume returns to scratch. You can't restore a scratch tape, it's empty.

I wasn't clear there. Let me rephrase. If you ran an "audit with fix=no" first, and it reports no damaged files, then would either a "move data" or a "restore volume" (not both, just one or the other) accomplish the same thing (never mind the added burden of having to bring tapes back from off site, and the additional tapes, if you chose the latter)? Is there any difference in the end result?

I'm thinking of a case wherein you've seen a lot of errors on a tape, but an audit reports no problems, so you proactively move that data to another volume, the new volume reports all the data (q content and such), the old volume subsequently reports nothing (all data moved) and you then take the old volume out of circulation. I guess a move would be preferred since it eliminates the need to return copy pool tapes, so you would at least attempt the "move" first and only proceed to the "restore" if necessary?
 
I wasn't clear there. Let me rephrase. If you ran an "audit with fix=no" first, and it reports no damaged files, then would either a "move data" or a "restore volume" (not both, just one or the other) accomplish the same thing (never mind the added burden of having to bring tapes back from off site, and the additional tapes, if you chose the latter)? Is there any difference in the end result?
Correct, if there are no damaged files, then a move would move everything, so there's no benefit to doing a restore, which would give the same end result but, like you said, with a lot more work.

I guess a move would be preferred since it eliminates the need to return copy pool tapes, so you would at least attempt the "move" first and only proceed to the "restore" if necessary?
The only use case for a restore is to recover data that is no longer available in the primary pool because it is damaged or the tape is missing.
so you proactively move that data to another volume, the new volume reports all the data
The data you move may or may not be on a single volume, it might be on multiple volumes. Server always starts to write data on a filling volume first, when it's full, continues on a scratch. There are other factors; more info here:
Selecting volumes with collocation enabled
Selecting volumes with collocation disabled
 
The data you move may or may not be on a single volume, it might be on multiple volumes. Server always starts to write data on a filling volume first, when it's full, continues on a scratch.

Sure. But doing a "move data volume_B" should take care of also moving the files spanning in and out of B, right? So in that case, if that requires three tapes (one or more files spanning from volume A, and one or more spanning to volume C), and all three tapes are in the tape library, then the move should take care of all that, and the database will then be updated to no longer be tracking any of that data on A, B, or C, and instead now only on whatever the new volume is that all that was moved to (which itself could also possibly span to a second tape), correct?

And if you've seen a lot of errors on a tape, but an audit reports no problems (fix=no), then it's still recommended to "move" the data? And doing this eliminates the need to return copy pool tapes, but you would at least attempt the "move" first and only proceed to the "restore", if necessary, i.e. damaged files?
 
Sure. But doing a "move data volume_B" should take care of also moving the files spanning in and out of B, right? So in that case, if that requires three tapes (one or more files spanning from volume A, and one or more spanning to volume C), and all three tapes are in the tape library, then the move should take care of all that, and the database will then be updated to no longer be tracking any of that data on A, B, or C, and instead now only on whatever the new volume is that all that was moved to (which itself could also possibly span to a second tape), correct?
Yes it will move everything that is on volume_B to different volume(s) in the same pool, but that's it.

And if you've seen a lot of errors on a tape, but an audit reports no problems (fix=no), then it's still recommended to "move" the data? And doing this eliminates the need to return copy pool tapes, but you would at least attempt the "move" first and only proceed to the "restore", if necessary, i.e. damaged files?
That's a personal call. It also depends on the I/O errors: if they are related to the media, yes, that would be a good idea, but if they're related to hardware, then there's really no point; address the hardware issue instead.
 
Yes it will move everything that is on volume_B to different volume(s) in the same pool, but that's it.

Can you clarify what you mean by "but that's it"? Are you saying that it will not also move the spanning files (from/to other volumes)? If that's the case then would that not create dangling files (pieces/parts) once the "suspect" tape's data (i.e. the data that is completely contained on there) has been moved?

Or are you simply stating that, yes, it will also move the spanning files, but there's no other additional rewards beyond that, so end of story?

I ask this because we did a move on a volume recently (didn't have much data) as a test following a reconfiguration of the tape drive devices, and it first loaded another volume, read for a minute or so and then loaded the tape we were doing the move on and continued for a while and then completed. My assumption was that there were one or more files that spanned from that other tape.
 
Can you clarify what you mean by "but that's it"? Are you saying that it will not also move the spanning files (from/to other volumes)? If that's the case then would that not create dangling files (pieces/parts) once the "suspect" tape's data (i.e. the data that is completely contained on there) has been moved?
If there's a spanning file, it is my understanding (but I could be wrong) that it will move the portion that is on that tape, but the other portion that resides on another tape will remain on that tape. So if a client does a restore of that file, it will need to mount both tapes to restore it after the move, just like it did before the move.

I ask this because we did a move on a volume recently (didn't have much data) as a test following a reconfiguration of the tape drive devices, and it first loaded another volume, read for a minute or so and then loaded the tape we were doing the move on and continued for a while and then completed. My assumption was that there were one or more files that spanned from that other tape.
Maybe it checked to make sure the other portion of the file was still accessible and valid or maybe it moved it. I guess we'd need to compare Q CONTENT of those tapes before and after.
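
One way to do that comparison, sketched in Python under some assumptions: the output of "query content" for each tape has already been saved as text with one file entry per line (for example through dsmadmc with -dataonly=yes), and the exact column layout does not matter because whole lines are compared. The function name and the sample lines are made up.

```python
def content_diff(before_text, after_text):
    """Compare two saved 'query content' listings for one volume.

    Returns (gone, new): entries present only before, and only after.
    """
    def entries(text):
        # One entry per non-blank line; whitespace at the ends is ignored.
        return {line.strip() for line in text.splitlines() if line.strip()}

    before, after = entries(before_text), entries(after_text)
    return sorted(before - after), sorted(after - before)


if __name__ == "__main__":
    # Invented sample listings for the neighbouring tape, not real output.
    before = "NODE1 Bkup /home /user/big.tar\nNODE1 Bkup /home /user/notes.txt\n"
    after = "NODE1 Bkup /home /user/notes.txt\n"
    gone, new = content_diff(before, after)
    print("no longer on volume:", gone)
    print("newly on volume:", new)
```

If the spanning file shows up as gone from the neighbouring tape after the move, the move relocated that piece; if the two listings are identical, the tape was only read.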
 
Maybe it checked to make sure the other portion of the file was still accessible and valid or maybe it moved it.
Ah, yes, checking to see if the preceding pieces of the affected file(s) is still accessible would make perfect sense. And likewise, it would follow that it would want to do the same for any trailing pieces that continue onto (span to) another tape. But as you suggested, we'd need to test this to be sure.

I guess we'd need to compare Q CONTENT of those tapes before and after.
Yes, that's an excellent point and would answer that. Unfortunately, we did not contrive to run a "q content" on those two tapes before doing the move, let alone after, and the volume that was moved was, of course, deleted after the storage pool reusedelay expired, and the tape has since been reused. The other tape had some reclamation done later. It would not be possible to determine anything at this point, even if there was any hope of retroactively deducing anything.

My experience with another backup product was that it ALWAYS worked in whole files, so if, for example, the data on tape A is cloned (copied) to another tape, it MUST read all the bits on all the files on tape A, including any preceding or trailing pieces on other tapes to create the "new" copy, so this could result minimally in several tapes being read. It was NOT possible to only read and copy solely the bits on tape A. That would have been nice as it would have reduced the time. BUT obviously a different product, so we can't expect the same results or features, natch. Moreover, we don't know that TSM is not doing the same thing at least in terms of reading those other pieces, but it remains to be seen whether it's actually moving them or not.

Anyway, since I've always noticed that "audit volume" usually (not always) requires a preceding and/or a trailing tape (2-3 tapes total, including the volume that's being audited), I guess I just inferred that a "move data volume" would function the same, wherein it would move those spanning pieces, just like the audit reads them, so those pieces would no longer be reported if you subsequently ran a "q content" on those other tapes once the move completed. But, of course, that doesn't mean that TSM is not reading those other pieces in their entirety. Yep, we should actually test this at some point to categorically answer this.

BTW: If in fact TSM is actually moving those spanning pieces, and given that the "move data volume" command does not have a preview option, then I guess you just need to check the activity log to see what other tapes are required and then hope that you can complete a "q content" on all of them before it completes the move?

I'm thinking that should be doable as long as the queries are run as soon as possible, particularly if there's not much data on the volume that's being moved.

And does "move data" require that all the required tapes be readwrite?
 