ldmwndletsm
ADSM.ORG Senior Member

[ Question ]
Why is it that I can search the backups table so quickly, e.g.
select OBJECT_ID from backups where NODE_NAME='node1' and FILESPACE_NAME='/filesystem1' and HL_NAME='/dir1/' and STATE='ACTIVE_VERSION'
This returns the desired column value super fast, every time, no doubt about it. BUT then I try this to get the volume(s) thus:
select object_id, volume_name from contents where node_name='node1' and filespace_name='/filesystem1' and object_id='value_returned_from_above'
and it hangs interminably. How on earth is this table indexed? Is it actually crawling through the entire contents table by brute force rather than narrowing things down via an index? What am I missing here? What more do I need to do to expedite the search?
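My working theory, which I have not confirmed, is that the contents table is effectively organized by volume, so a WHERE clause without volume_name forces the server to walk every volume. What I'm inclined to try next is pulling the candidate volumes from volumeusage first and then hitting contents one volume at a time, with 'VOL001' standing in for each volume returned:
select volume_name, stgpool_name, copy_type from volumeusage where node_name='node1' and filespace_name='/filesystem1'
select object_id, volume_name from contents where volume_name='VOL001' and node_name='node1' and filespace_name='/filesystem1' and object_id='value_returned_from_above'
Does adding volume_name to the predicate like that actually let contents use whatever indexing it has?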
[ Question ]
Does the backups table not have as many entries as contents? How do these tables really differ, anyway?
I mean, I can check syscat.columns and all that, but I really don't understand how these tables differ in size. Thus far, every search I do on the backups table is really fast, even a "select *", as long as the "where" clause specifies enough columns, e.g. node_name and filespace_name at a minimum. Even those two columns alone (never mind HL_NAME) seem more than sufficient to pull information out of the backups table with minimal pain (no latency), and cancelling a query takes effect right away as well. Not so with contents.
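For reference, this is the sort of thing I've been poking at to compare them: the column definitions out of the DB2 catalog, plus a scoped row count against backups (I have not dared to run the equivalent count against contents):
select colname, typename, length from syscat.columns where tabname='BACKUPS'
select colname, typename, length from syscat.columns where tabname='CONTENTS'
select count(*) from backups where node_name='node1' and filespace_name='/filesystem1'
Is there an equally cheap way to get a feel for how many rows contents would have to touch for the same node and filespace?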
[ What I'm trying to accomplish ]
I want to know which copy pool volumes are required to restore a directory. This is for periodic data integrity validation checks wherein the data is first restored from the primary pool tape(s) and then compared (including cryptographic checksums) against the same data restored from the copy pool tape(s). The purpose here is NOT to restore some damaged primary pool tape; it is to ensure no bit rot is occurring on the off-site tapes. The activity log is NOT going to tell you the required copy pool volumes UNLESS only those are loaded (they are not) and any loaded primary tapes are set to 'unavailable'; otherwise the restore will simply default to the primary pool.
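For the comparison step itself, I'm picturing something along these lines on the client (the staging paths are just placeholders): restore the directory once from the primary pool, flip the relevant primary volumes to 'unavailable' so the server is forced onto the copy pool, restore it again, then checksum both trees and diff the manifests:
dsmc restore "/filesystem1/dir1/*" /stage/primary/ -subdir=yes
# (update the primary volumes to access=unavailable here so the next restore reads from the copy pool)
dsmc restore "/filesystem1/dir1/*" /stage/copy/ -subdir=yes
( cd /stage/primary && find . -type f -print0 | xargs -0 sha256sum | sort -k 2 ) > /tmp/primary.sha256
( cd /stage/copy && find . -type f -print0 | xargs -0 sha256sum | sort -k 2 ) > /tmp/copy.sha256
diff /tmp/primary.sha256 /tmp/copy.sha256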
Yes, I can play games with `show bfobject object_id`, superbitfile and all that, to track down the primary and copy pool volumes. Or I can run `query backup blah-blah-blah -querydetail` on the client to retrieve the volume ids (hex) reported in the Volumes Summary Statistics, convert them to decimal, and then look those up in the appropriate DB2 table (ss_volume_ids) to decipher the actual volume names. But that's circuitous and only works for the volumes that are available, so if only the primary is available, that's all that 'querydetail' will report. If one or more of the required copy pool volumes were in the library and the counterpart primary volume(s) were set to 'unavailable', then yes, it would report the copy pool volume ids in hex. It adjusts the report of the required volume ids accordingly, based on which volumes are nearline and not 'unavailable'; I've tested that. I could also just use '-detail', grep for the volume id line, and pipe that through a unique sort. That avoids having to convert the values to decimal, but you still have to search the DB2 table to get the volume names.
Otherwise, I could run `restore volume volume_name preview=yes` for each of the required primary volumes to determine which copy pool volumes would be needed (rough sketch below). I don't know how long that would take, but it's not as if this has to be completed in 10 minutes; I'd have at least a day or so, since the exercise is somewhat contrived.
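If I do go the preview route, I'd probably just drive it from a loop along these lines (the admin id/password and the list of primary volumes holding the filespace are placeholders) and then sift the activity log afterwards for the volume lists each preview process reports:
for vol in VOL001 VOL002 VOL003
do
    # preview-only: reports what the restore would need without moving any data
    dsmadmc -id=admin -password=xxxxx -dataonly=yes "restore volume $vol preview=yes"
done
# then pull the preview results for the last hour out of the activity log
dsmadmc -id=admin -password=xxxxx "query actlog begintime=-01:00"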
I want to avoid bringing back more copy pool volumes from off site than necessary in order to minimize risk. Clearly, `query nodedata node_name` is not going to narrow down the list appropriately.
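The closest narrowing I've found short of all of the above is the same volumeusage query as earlier, restricted to the copy storage pool ('COPYPOOL' is a placeholder for the actual pool name):
select volume_name from volumeusage where node_name='node1' and filespace_name='/filesystem1' and copy_type='BACKUP' and stgpool_name='COPYPOOL'
That still over-reports for a single directory, since volumeusage only goes down to the filespace level, but it's a much shorter recall list than `query nodedata` for the whole node. Is there anything that maps an individual directory to its copy pool volumes more directly?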