ldmwndletsm
Five questions on audit volume with fix=no versus yes.
I don't understand the real difference between fix=no and fix=yes, or why you would use one versus the other. Fix=yes deletes from the database, whereas fix=no marks as damaged. But after I ran an audit (fix=no) that reported damaged files on two copy pool tapes, our admin script (`backup stgpool primary_pool copy_pool`) ran the next morning, and those files were copied to another copy pool tape and are no longer reported on the original two. NOTE: None of the primary pool tapes was reported with any damaged files.
According to the IBM documentation for 'backup stgpool':
"If a file already exists in the copy storage pool, the file is not backed up unless the copy of the file in the copy storage pool is marked as damaged."
So it looks like this is exactly what it did.
Q 1. So it doesn't appear that it was necessary for me to use fix=yes?
Q 2. Is there any reason for me to do this now, after the fact?
It also says this for the copy storage pool:
"Fix=No: The server reports the error and marks the physical file copy as damaged in the database.
Fix=Yes: The server deletes any references to the physical file and any database records that point to a physical file that does not exist."
But here's where I get really confused. The documentation for query content says this:
"Damaged: Yes Specifies that only files that are marked as damaged are displayed. These are files in which the server found errors when a user attempted to restore, retrieve, or recall the file, or when an AUDIT VOLUME command was run. "
But when I run `q content volume damaged=yes` on both of the damaged copy pool tapes, it reports thus:
ANR2034E QUERY CONTENT: No match found using this criteria.
ANS8001I Return code 11.
Q 3. If I didn't use 'fix=yes' for the audit then why would it not report the damaged files for these tapes?
Maybe it would have if I'd done it before the admin script (`backup stgpool primary_pool copy_pool`) ran, but once the damaged files are copied to another copy pool tape then it's too late?
Q 4. When audit with fix=no reports damaged files, and you rerun the audit, and no damaged files are found, then the files that were previously marked damaged in the database are now unmarked?
Specifically, I ran another audit (fix=no) on the two tapes, as discussed below, in a different drive, but this was after the admin script ran, and the files inspected were fewer by the number of damaged files. Would the counts have been the original values (8003349, 25143762) if the admin script had not run yet, assuming the tapes were okay?
Q 5. If you use fix=no for a copy pool volume, and damaged files are reported then what happens when you run `backup stgpool primary_pool copy_pool`? Does this reset the damaged files when it copies those from the primary to another or "new" copy pool tape?
[ background ]
I ran an audit (fix=no) on a bunch of copy pool tape volumes. Two of them reported damaged files: one with 12 (files inspected=8003349) and the other with 13039052 (files inspected=25143762). Both of these occurred on tape drive 3. The activity log reported some other problems with media in drive 3. I was suspicious of the drive, so the next day, I took that drive offline and reran the audit (fix=no) on both of those volumes using a different drive. It reported thus:
Message: ANR4133I Audit volume process ended for volume B00530L6; 8003337 files inspected, 0 damaged files found and marked as damaged, 0 files previously marked as damaged reset to undamaged, 0 objects need updating. (SESSION: 301242, PROCESS: 3306)
Message: ANR4133I Audit volume process ended for volume B00628L6; 12104715 files inspected, 0 damaged files found and marked as damaged, 5 files previously marked as damaged reset to undamaged, 0 objects need updating. (SESSION: 301943, PROCESS: 3307)
These numbers concur with what I would expect when you subtract the damaged files from the initial files inspected. I also ran 'query content volume' and added up the number of files reported, and these also match these new totals.
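As a quick check of that subtraction, here is a minimal Python sketch using only the counts quoted in this post. The pairing of the first-audit counts with the volume names is an assumption inferred from the second audit's totals (the post does not state which tape had which count), and the `reconcile` helper is hypothetical, written just for this illustration.

```python
# Counts from the first audit (fix=no, drive 3).
# ASSUMPTION: volume names are paired with these counts by matching
# them against the second audit's totals; the post does not say so.
first_audit = {
    "B00530L6": {"inspected": 8003349, "damaged": 12},
    "B00628L6": {"inspected": 25143762, "damaged": 13039052},
}

# Counts from the ANR4133I messages of the second audit (different drive).
second_audit = {
    "B00530L6": {"inspected": 8003337, "reset": 0},
    "B00628L6": {"inspected": 12104715, "reset": 5},
}


def reconcile(volume):
    """Return (expected, actual, gap) for one volume.

    expected = first-audit files inspected minus files marked damaged
    actual   = files inspected by the second audit
    gap      = actual - expected
    """
    first = first_audit[volume]
    second = second_audit[volume]
    expected = first["inspected"] - first["damaged"]
    actual = second["inspected"]
    return expected, actual, actual - expected


for volume in first_audit:
    expected, actual, gap = reconcile(volume)
    reset = second_audit[volume]["reset"]
    print(f"{volume}: expected={expected} actual={actual} "
          f"gap={gap} reset_to_undamaged={reset}")
```

For B00530L6 the subtraction matches the second audit exactly (8003337). For B00628L6 the subtraction gives 12104710, which is 5 fewer than the 12104715 reported; that gap equals the "5 files previously marked as damaged reset to undamaged" in the message. This is only an observation from the numbers, not a confirmed explanation of the server's behavior.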
The admin script that does the backup stgpool ran before I ran the second audit, so I guess it must have recopied the files that were marked damaged from the primary pool tapes to the new copy pool tape, because when I ran 'show damaged copypool' it reported nothing. Also, when I ran 'query volume damaged=yes', it reported nothing. As a test, I then picked the first and last damaged files from each of the two tapes, and I hunted through the database (using the object_id, show bfo object_id, bfo super-bitfile method) to determine the primary and copy pool volume where these files reside. Both reported the same "new" copy pool tape B00784L6, not B00530L6 or B00628L6. I then changed the access on the primary pool volumes to unavailable (to force TSM to use the copy pool) and ran a restore of these files. Volume B00784L6 was loaded, and the restored files match what's on disk.
Does this seem correct?