File size?

ldmwndletsm


How can I determine the file size for an object when running from the server?

Every time I've played with that, either the result never matches the size of the file on disk, or the command just hangs interminably. Maybe this gets into aggregates and such. I don't know, but as an example, when running 'q content volume f=d', the 'Stored Size' never matches, even if 'Aggregated' is No and Segment Number is 1/1. I looked at thobias.org, but it's very confusing, and I'm unclear whether the answer is really in there. I'm not interested in the aggregate sum of all objects in a file space. I need the sizes of the constituent files.

Obviously, I can go to the client, but I might have to do this on multiple machines, so I would prefer to run this from the server. Clearly, the client doesn't store this information anywhere other than the client log, but that's not where it's pulling it from when running 'q backup path -detail'. In fact, I don't think the client even reports the size other than in the log, does it?

Anyway, I tried this from the server to report the file sizes and file names for every object under /filespace for nodename:

select backups.FILESPACE_NAME, HL_NAME, LL_NAME, ACTUAL_SIZE, BACKUP_DATE, contents.FILE_SIZE from backups, contents where backups.NODE_NAME='nodename' and backups.FILESPACE_NAME='/filespace' and backups.object_id=contents.object_id

but it just sits there and never prints anything. If I remove the join, and only print information from one table or the other then it's as fast as a speeding car. But even if the above command ever does decide to finally report something, I'm dubious if the 'ACTUAL_SIZE' and/or FILE_SIZE is what I need here. I tried both, hence the join, but when using just ACTUAL_SIZE and no join, the value is always empty for any LL_NAME.
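Since each table answers quickly on its own, one workaround is to export the two result sets separately and match them up outside the server. A minimal sketch in Python, assuming each query was saved as comma-delimited text with a header row. The function name, column layout, and sample rows are made up for illustration, and it presumes both exports really do carry an OBJECT_ID column, as the join above does:

```python
import csv
import io

def join_on_object_id(backups_csv, contents_csv):
    """Match rows from two separately exported result sets on OBJECT_ID."""
    # Index the contents export by OBJECT_ID so each lookup is a dict hit.
    # If an object appears more than once, the last row wins.
    sizes = {}
    for row in csv.DictReader(io.StringIO(contents_csv)):
        sizes[row["OBJECT_ID"]] = row["FILE_SIZE"]

    joined = []
    for row in csv.DictReader(io.StringIO(backups_csv)):
        # Rows with no match get None, much like a LEFT JOIN would give.
        joined.append({**row, "FILE_SIZE": sizes.get(row["OBJECT_ID"])})
    return joined

# Hypothetical sample data, not real server output.
backups_export = """OBJECT_ID,HL_NAME,LL_NAME
1001,/dirpath/,filename
1002,/dirpath/,other
"""
contents_export = """OBJECT_ID,FILE_SIZE
1001,2048
"""

for row in join_on_object_id(backups_export, contents_export):
    print(row["HL_NAME"] + row["LL_NAME"], row["FILE_SIZE"])
```

This only sidesteps the slow join; it says nothing about whether FILE_SIZE is the number actually wanted.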

On a related note, why is the client able to report its information so quickly? For example, something like: q backup /path -detail? What tables or columns is it referencing on the server?
 

On a related note, why is the client able to report its information so quickly? For example, something like: q backup /path -detail? What tables or columns is it referencing on the server?
The server would definitely use the same data to answer the client. One difference is that the server's own queries would be optimized.
 

Okay, I checked, and yes, the client does provide the actual file size, but how? What column in which table is the server accessing in order to pass that information to the client?

How can I get that information from a select statement?
 

The BACKUPS table has a field called ACTUAL_SIZE.
 

Yes, but this field is ALWAYS empty. I have yet to see any evidence in any of my queries that it's populated. I have checked sundry file systems on a number of clients, and that's always the case.

I found this article from 2018: https://www.ibm.com/support/pages/size-backup-objects-appears-be-incorrect

So I did this:

1. I picked a file system and then determined which volumes contain it as:

select volume_name from volumeusage where filespace_name='filesystem'

2. I then picked one of the volumes and ran this:

query content volume f=d > output_file

3. I then picked one of the listed files in the output and ran this on the affected client to get the size on disk:

/bin/ls -ld /path2file

(63991 bytes)

4. However, this is what the query content output reports for that file, and there's only one entry for it:

Client's Name for File: /path2filename
Hexadecimal Client's Name for File:
Aggregated?: 2/2
Stored Size: 149,598
Segment Number: 1/1

Clearly, Stored Size and the value returned from 'ls -ld' do not match.

5. Next, I ran this:

select actual_size, filespace_name, hl_name, ll_name,object_id, backup_date from backups where node_name='nodename' and filespace_name='filesystem' and hl_name='/dirpath/' and ll_name='filename'

All fields with the exception of the ACTUAL_SIZE have values. I then used the returned OBJECT_ID as follows:

6. show invo OBJECT_ID

It returns a bunch of information, including a Size: 64512, HeaderSize: 442 and a super-bitfile value. But even if you subtract the HeaderSize from the Size, you still get a value that differs from those in steps 3 and 4 above.

7. I then ran show invo on the super-bitfile value, and it reports Bitfile Size: 149598 for both the primary and copy pool volumes, matching the stored size reported in 4 above.
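Putting the numbers from steps 3 through 7 side by side makes the gaps easier to see. This is just arithmetic on the values quoted above. The 512-byte rounding at the end is a guess that happens to fit this one file, not documented server behavior:

```python
ls_size = 63991          # step 3: size on disk from ls -ld
stored_size = 149598     # step 4: Stored Size from query content
invo_size = 64512        # step 6: Size from show invo
header_size = 442        # step 6: HeaderSize from show invo
super_bitfile = 149598   # step 7: Bitfile Size of the super-bitfile

# Subtracting the header still leaves a gap versus the file on disk.
gap = invo_size - header_size - ls_size

# Stored Size equals the super-bitfile size reported in step 7.
aggregate_matches = stored_size == super_bitfile

# Assumption: file plus header, rounded up to the next multiple of 512.
BLOCK = 512
padded = -(-(ls_size + header_size) // BLOCK) * BLOCK

print(gap, aggregate_matches, padded == invo_size)
```

One data point is not enough to confirm the rounding idea; checking a few more files of different sizes the same way would show whether it holds.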

How the heck do you ever get a real size out of all of this nonsense? How on earth does the client dsmc command manage to ferret that out? It's simply befuddling.
 

The stored size is the space occupied on the media. Because of compression and sometimes aggregates, that never matches the actual size of the file. It's not something I've ever paid much attention to; getting the actual size has never been useful to me.
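For comparing Stored Size against on-disk sizes across a whole volume listing, the f=d output can be parsed outside the server. A rough sketch in Python, assuming the redirected output keeps the "Label: value" layout quoted earlier in the thread and that each entry starts with a "Client's Name for File" line (the helper name is made up):

```python
def parse_content_detail(text):
    """Turn 'Label: value' stanzas into a list of dicts, one per file."""
    entries = []
    current = None
    for line in text.splitlines():
        label, sep, value = line.strip().partition(":")
        if not sep:
            continue  # skip blank lines and anything without a label
        label, value = label.strip(), value.strip()
        if label == "Client's Name for File":
            # Assumed to mark the start of a new entry.
            current = {}
            entries.append(current)
        if current is not None:
            current[label] = value
    return entries

# The stanza quoted earlier in the thread.
sample = """\
Client's Name for File: /path2filename
Hexadecimal Client's Name for File:
Aggregated?: 2/2
Stored Size: 149,598
Segment Number: 1/1
"""

for entry in parse_content_detail(sample):
    # Stored Size is printed with thousands separators.
    stored = int(entry["Stored Size"].replace(",", ""))
    print(entry["Client's Name for File"], entry["Aggregated?"], stored)
```

Keeping the Aggregated? value next to the size makes it easy to set aside entries where the stored figure is not expected to line up with a single file.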

Keep in mind that most tables you see in Spectrum Protect aren't actual tables; they are often views onto tables in DB2. So what you and I can query is very different from what the server itself can query.
 