Confusion over compressalways option?

ldmwndletsm


Can someone confirm the IBM documentation?

https://www.ibm.com/support/knowled...tsm.perf.doc/r_opt_client_compressalways.html

The second paragraph where it says: "To reduce the impact of repeated compression attempts if the compressed file is larger than the original, specify compressalways yes. "

Don't they mean 'no'? The following IBM document suggests to me that it should be 'no': https://www.ibm.com/support/knowledgecenter/SSEQVQ_8.1.0/client/r_opt_compressalways.html

Is that a typo?

I thought I'd read somewhere else, too, where the documentation says 'yes' when I would have expected 'no', but I can't find it now. I'll post it if I can locate it. Thanks.
 

Hi,

I do not have an answer based on the docs, but I can write something about it. The file is checked for compressibility. I am not sure how much of the file is read before a conclusion is made; it could be 1 MB, 5% of the file, or something else. IBM should be able to provide a technical answer there.

There are a few different scenarios to evaluate.

  1. File begins with non-compressible data; the rest is highly compressible
  2. File begins with a highly compressible part; the rest is not compressible
  3. File is not compressible
  4. The whole file is compressible
To get the most out of your storage, configure as below (see the option-file sketch after this list).

  1. This could trick SP into skipping compression, since the first part does not compress. This is where compression yes together with compressalways yes should be used, or include.compression.
  2. Exclude compression for that file type or directory with exclude.compression; otherwise you will waste CPU to save nothing.
  3. exclude.compression
  4. include.compression
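As an illustration, a client option file applying that advice could look something like this. The option names (compression, compressalways, include.compression, exclude.compression) are real client options, but the paths and patterns here are made-up examples:

    * Hypothetical dsm.opt excerpt -- paths and patterns are examples only
    COMPRESSION         YES
    COMPRESSALWAYS      YES

    * Scenarios 2 and 3: do not burn CPU on data that will not shrink
    EXCLUDE.COMPRESSION /data/media/.../*.jpg
    EXCLUDE.COMPRESSION /data/archives/.../*.zip

    * Scenarios 1 and 4: force compression for data known to compress well
    INCLUDE.COMPRESSION /var/log/.../*.log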
To get the most out of this, you need to know your clients' data types. That works well for small environments, but with thousands of nodes you could end up micromanaging.

If you have enabled dedup, there is a new set of considerations to look at. Each file is split into chunks, and each of these is evaluated, probably in the same way as above.
 

The second paragraph where it says: "To reduce the impact of repeated compression attempts if the compressed file is larger than the original, specify compressalways yes. "
That statement is correct. With compressalways yes, the client sends files compressed every time, so there are no repeated cycles of compressing a file, finding it larger, and resending it uncompressed. With compressalways no, the client attempts to compress each file, and if the file grows, it sends it uncompressed instead.
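A rough sketch of that decision flow, in illustrative Python (this is not the actual client code; zlib merely stands in for whatever algorithm the client really uses):

    import os
    import zlib

    def bytes_to_send(data: bytes, compressalways: bool) -> bytes:
        # One compression attempt per file, as described above.
        compressed = zlib.compress(data)
        if compressalways:
            return compressed              # send compressed, even if it grew
        if len(compressed) < len(data):
            return compressed              # compression helped: send it
        return data                        # grew: fall back to the original

    # Random bytes are effectively incompressible, so with
    # compressalways=False the original data gets sent instead.
    payload = os.urandom(1 << 20)          # 1 MiB
    assert bytes_to_send(payload, compressalways=False) == payload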
 

So if this is set to 'no' then how many times does it attempt to compress before it realizes it's a failed venture? If it compresses it, and the result is larger, why would it reattempt it?
 

Hi,
I think it will only try once. If the file grows during compression, i.e. when reading non-compressible data, it will stop and start over, sending all of the file's data uncompressed.
 

So if this is set to 'no' then how many times does it attempt to compress before it realizes it's a failed venture? If it compresses it, and the result is larger, why would it reattempt it?
Once per file; the repetition comes from processing thousands of files during a backup, and from re-backing up those same files day after day as they change.
 

Once per file; the repetition comes from processing thousands of files during a backup, and from re-backing up those same files day after day as they change.

I must be missing something major here. If compression is enabled and TSM decides that a file needs to be backed up, why would this burden not also occur with compressalways set to yes? If it's set to yes, the client must compress the file and send the result regardless of the compressed size; if it's set to no, it must also compress, since that's the only way to know whether the result is smaller than the original. Whether it then sends the file uncompressed obviously depends on the result, but either way, compression is attempted. I fail to see how the settings differ in terms of client workload, other than that with no the client must do a subsequent size comparison. All those milliseconds for those compares might add up. Is that what you're saying?

Otherwise, the only thing I could possibly see (or imagine) is that perhaps TSM keeps track of the setting (yes or no) for each backed-up file, and if it sees that the file was sent uncompressed last time, while this option was set to no, then as long as it is still set to no it won't bother trying to compress that file again on the next backup; it will just send it uncompressed, saving time. And if the file was sent compressed last time, when the option was set to yes, then it sends it compressed again. But if any of this were the case, it would seem that having it set to no would be more efficient. All of this seems unlikely to me based on what I read in the documentation, but then again, I haven't always found the documentation clear on a number of points.
 

All those milliseconds for those compares might add up. Is that what you're saying?
Essentially, yes. And if the files are large, it will take more than a few milliseconds to compress each one just to determine whether it grew.
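To get a rough feel for that cost, here is an illustrative snippet (zlib is only a stand-in for the client's actual algorithm, and the size is arbitrary):

    import os
    import time
    import zlib

    payload = os.urandom(64 * 1024 * 1024)    # 64 MiB of incompressible data

    start = time.perf_counter()
    compressed = zlib.compress(payload)
    elapsed = time.perf_counter() - start

    # The failed attempt still costs noticeable CPU time, and the result is
    # larger, so with compressalways no the file is then resent uncompressed.
    print(f"{len(payload)} -> {len(compressed)} bytes in {elapsed:.2f}s")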

Otherwise, the only thing I could possibly see (or imagine) is that perhaps TSM keeps track of the setting (yes or no) for each backed-up file?
No.
 