Confusion over moving data ?

ldmwndletsm

PREDATAR Control23

While moving data from one tape to another has been discussed many times, I'm coming up a bit short tracking down this information with regard to a primary pool tape.

[ Question 1 ]
After the move data command has completed for a primary pool tape, I presume the source tape will be returned to scratch, following the reusedelay period for the storage pool. Is that right ?

[ Question 2 ]
If there are bad files on the source tape (perhaps marked from audit volume fix=no), and there's a copy, then I would expect that TSM will only move (logically) the good files. But what happens then to the source tape once the move is done ? Is TSM still tracking those bad files on that source tape ? If so, then how can TSM return that tape to scratch, even after the reusedelay ?

[ Question 3 ]
Or will TSM not return it to scratch, in which case you'd have to run audit with fix=yes before running move data ? What if you didn't run it first ?

I ask this because I thought that if you have both the primary and a copy, and TSM marked bad files on the primary (for example, from an audit with fix=no), then (1) audit with fix=yes won't behave any differently than fix=no, because of the copy. If there were no copy, then (2) fix=yes deletes that information from the database (no point in keeping it if the files are in fact bad), and the affected data would be backed up again on the next backup of the client, assuming it still exists on the client, of course.

I might have misunderstood there, however, on point 1 of 2 (here in question 3).

[ Question 4 ]
Following along from question 2, what if you then run a restore volume to restore those damaged files. Will that work if you did not first run audit with fix=yes ? If so, what then becomes of those files on that source tape ?

I would expect that maybe TSM would simply no longer be tracking them on that original tape (due to the restore) and would now instead be tracking them on whatever primary pool volume the files were recopied to from the copy pool. And then the original tape would still get returned to scratch despite the bad files not having been deleted with fix=yes ?

I guess what I'm really asking here is if there's a difference between:
a. audit volume fix=no
b. move data
c. restore volume

versus:

a. audit volume fix=yes
b. move data
c. restore volume
 
PREDATAR Control23

Hi,

I will try to answer your questions.

Q1:
The tape will be returned to scratch after the reuse delay has passed. All media history (error counts and mount counts) is reset to zero, and the tape will be reused. A tape that has shown errors should instead be set to private so that it is not reused.

Q2:
TSM will only move the files it can read. If there are bad files on the tape, the tape will not be returned to scratch until you run either del vol NAME discard=yes or audit vol fix=yes. After moving all readable files, update the volume to 'destroyed' and then run restore vol or restore stg. All the restored files will be removed from the bad tape. You should see that the q contents output differs before and after the restore (hopefully with zero files left).
I check the q contents output after all good data has been moved out and restored, then evaluate the list. Maybe the data is not that important, maybe it can be ingested again, or maybe it can be deleted.

After del vol/audit fix=yes the tape will go to scratch. It should be removed and discarded.

Audit vol fix=yes should be the very last task, after move data, restore vol/stg, ingesting the data again, and so on.

Q3:
A tape with data will never be returned to scratch. You have to forcefully delete the content.

Q4:
If you run audit fix=yes, I do not think you can recover the data that was marked as bad. Not even from copy pool.

I hope this helps you out.
 
PREDATAR Control23

Okay, thank you very much, Trident. That sheds a lot more light on the issue.

Now, let's say that you do want to return the tape to scratch and reuse maybe at least one more time and then physically discard only if more errors show up again on that same volume in the future. So would you still:

move data
update volume "destroyed"
restore volume
audit vol fix=yes

Or would the steps be different ?

I note that the IBM documentation for restore volume states:

This command changes the access mode of the specified volumes to DESTROYED. When all files on a volume are restored to other locations, the destroyed volume is empty and is deleted from the database.

Hmm ...

Q4:
If you run audit fix=yes, I do not think you can recover the data that was marked as bad. Not even from copy pool.

Well, that's a bit of an unknown to me, because the TSM documentation (https://www.ibm.com/docs/en/spectru...rify-database-information-storage-pool-volume) says this for the primary storage pool:

  • If the physical file is not a cached copy, and the file is also stored in one or more copy storage pools, the error will be reported and the physical file marked as damaged in the database. You can restore the physical file by using the RESTORE VOLUME or RESTORE STGPOOL command.
  • If the physical file is not a cached copy, and the physical file is not stored in a copy storage pool, each logical file for which inconsistencies are detected are deleted from the database.

So it sounds like they're saying that as long as there is a copy then it can be restored.

But then they say this under Copy Storage Pool:

Fix=Yes The server deletes any references to the physical file and any database records that point to a physical file that does not exist.

It's not clear if that would be limited to the copy or would also apply to the primary.
 
PREDATAR Control23

Now, let's say that you do want to return the tape to scratch and reuse maybe at least one more time and then physically discard only if more errors show up again on that same volume in the future. So would you still:

What I normally do to return the tape to a scratch state after verifying all data has been moved is to re-label it. This should reset the tape to zero state.
 
PREDATAR Control23

Hi,

Some reading always helps:


If you run audit volume fix=yes on a volume that is damaged:
- Files that exist only in the primary pool get deleted from the DB and cannot be recovered any more
- Files that also exist in a copy pool get marked as damaged and can be restored.
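
To make the two cases concrete, here is a toy Python model of the behaviour as described in this thread and in the quoted IBM text. It is only a sketch for reasoning about outcomes, not anything the server exposes: the function name audit_outcome and its parameters are made up for illustration, and cached copies are ignored.

```python
def audit_outcome(fix, in_copy_pool):
    """Toy model: fate of a damaged, non-cached file on a primary pool
    volume after AUDIT VOLUME, as described in this thread."""
    if not fix:
        # fix=no: the error is reported and the file is marked damaged
        return "marked damaged"
    if in_copy_pool:
        # fix=yes with a copy: still only marked damaged, so
        # RESTORE VOLUME / RESTORE STGPOOL can bring it back
        return "marked damaged, restorable from copy pool"
    # fix=yes without a copy: the database entry is deleted
    return "deleted from database"


for fix in (False, True):
    for in_copy_pool in (False, True):
        print(fix, in_copy_pool, "->", audit_outcome(fix, in_copy_pool))
```

The only combination that loses data for good is fix=yes with no copy, which is exactly why it matters to know whether a file is in a copy pool before running it.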

The funny part starts when you are not sure if a file is in a copy pool or not....

And, if you have an unstable part in your infrastructure (bad SFPs, bad fibres or a bad drive), a volume may appear bad, but the errors are really somewhere else.

I still say audit vol fix=yes is the last option to use.

So the recovery would still be:

Move all readable data (maybe using different tape drives) multiple times
Optional audit vol fix=no to give it another try
Optional move data again
Upd vol to destroyed
Get a list from show damaged
Get a list from q cont
Restore stg
See what is left/missing
If possible get clients to ingest data again (only applies to active file data)
If possible get DB to make a new full backup
When all is done
audit vol fix=yes or del vol discard=yes
My selection: Ditch that tape and get a new one.
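
The order of those steps can be sketched as a small Python helper that just prints the administrative commands in sequence. Treat it as an assumption-laden sketch: VOL001 and TAPEPOOL are hypothetical names, recovery_plan is a made-up function, and the command spellings follow the abbreviations used in this thread, so check them against the command reference for your server level before running anything.

```python
def recovery_plan(volume, pool):
    """Return the admin commands for the recovery order described above.
    Nothing is executed; the list is only meant to be reviewed."""
    return [
        f"move data {volume}",                      # move everything readable
        f"audit volume {volume} fix=no",            # optional second look
        f"move data {volume}",                      # optional retry
        f"update volume {volume} access=destroyed",
        f"show damaged {pool}",                     # list damaged files
        f"query content {volume}",                  # what is still on the tape
        f"restore stgpool {pool}",                  # bring files back from copy pool
        f"query content {volume}",                  # see what is left/missing
        # Last resort only, after clients/DBs have re-sent what they can.
        # The thread's alternative here is: del vol NAME discard=yes
        f"audit volume {volume} fix=yes",
    ]


for step, command in enumerate(recovery_plan("VOL001", "TAPEPOOL"), start=1):
    print(step, command)
```

Keeping fix=yes as the final entry mirrors the point made above: nothing destructive happens until every non-destructive option has been tried.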
 