Opinions and questions on client side compression versus drive compression?

ldmwndletsm

We're using IBM LTO-6 drives with TSM 8.1.3 on Linux. I'm new to this environment; until now I've managed an EMC NetWorker environment at a different work site, also with IBM LTO-6 drives, where we used only drive compression, not client compression. In my TSM environment, however, client side compression has been the standard.

1. Is client side compression only set via the 'Compression' setting for the node definition on the TSM server? Nowhere else?

2. If you want to turn on/off compression for a specific node then does this require having to start/stop the client software on that machine, and/or the TSM server, after you update the setting for the node configuration?

3. Is there anything to be aware of when disabling/re-enabling this feature?

4. If the data was compressed on the client, and this option is later turned off for the node, then will it still un-compress it?

5. What if drive compression is enabled, and the data was already compressed first on the client?

Will the drive be smart enough to see that it's already compressed and just write to tape as is? Or will it try to further compress it, possibly wasting valuable time, particularly on very large files?

The reason I ask is that, as I understand it, drive compression is usually enabled by the manufacturer and remains so unless you explicitly turn it off. I've done this before, just as a temporary test, using the 'mt' command, but I don't recall now the command to probe the drive to report whether compression is enabled or not. Anyway, with EMC NetWorker, I've relied exclusively on drive compression since our network was reasonably fast, and I was of the opinion (maybe a false one) that drive/hardware compression was more efficient.

But I'm curious about TSM. Is it using the same compression algorithm for client side compression as the IBM drive compression? If so, then it would seem that no further compression would be eked out; otherwise, maybe it would squeeze out a little more, thus taking longer overall and creating two hoops to jump through to un-compress it when restoring?

6. With TSM, I'm concerned that if we use compression on the client side, and compression is also enabled on the drives, then it might slow things up? Is this a legit concern?

We have a large collection of data that does not compress well since it has a high percentage of binary data (but also some text). This lives on a client where we have two stanzas, or nodes. One stanza (node A) will handle this data, sending it to a different storage pool and management class, while the other stanza (node B) will handle the rest (a little bit of everything), a lot of which does compress well. That data will be managed by a separate management class and will be written to a different storage pool. Client side compression could be enabled for one node and disabled for the other. Anyway, with EMC, I've never worried about it, just sending all data regardless to the drives and letting them handle it. Who knows, maybe they expand the binary data a little.

7. Does anyone have any advice on VMs? Should we turn off client compression on those nodes?

It seems that a VM is going to be using a lot of CPU and memory, and if it then has to do a lot of compression on top of that, then that might create woes and slow things down? Might it be better to just send the data on those to the drives uncompressed, allowing the drive hardware compression to deal with it instead?
 
1. Is client side compression only set via the 'Compression' setting for the node definition on the TSM server? Nowhere else?
It's actually set on the client with the COMPRESSION option; see dsmc help compression for more info. Updating the compression setting for the node on the server just gives the node permission to compress or not; it doesn't enable it.
2. If you want to turn on/off compression for a specific node then does this require having to start/stop the client software on that machine, and/or the TSM server, after you update the setting for the node configuration?
Any changes to client options require a client restart unless the scheduler is managed by the client acceptor daemon (CAD).
4. If the data was compressed on the client, and this option is later turned off for the node, then will it still un-compress it?
The option affects whether data is compressed or not at the time of backup. Once backed up, it remains in that state until it expires.
5. What if drive compression is enabled, and the data was already compressed first on the client?
You can't compress compressed data; it's stored as-is. No different than when backing up .zip, .jpg, .mp3, etc.
6. With TSM, I'm concerned that if we use compression on the client side, and compression is also enabled on the drives, then it might slow things up? Is this a legit concern?
See answer to 5
7. Does anyone have any advice on VMs? Should we turn off client compression on those nodes?
It depends. Client side compression takes additional CPU cycles, but reduces the amount of data to transfer over the network. Nothing beats testing in your own environment with compression both enabled and disabled to see what works best for you.
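
To illustrate the point in answer 5, here is a minimal Python sketch. It uses zlib (DEFLATE) purely as a stand-in; it is not the algorithm used by the TSM client or by the LTO drive, and both data sets are synthetic assumptions (random words for "compressible" data, random bytes for "already compressed" data). It only shows the general behavior of running a second compression pass over output that has already been compressed.

```python
import os
import random
import zlib

# Synthetic "compressible" data: random words from a small, made-up vocabulary.
rng = random.Random(0)
vocab = [b"backup", b"client", b"server", b"tape", b"drive", b"node",
         b"stanza", b"storage", b"pool", b"restore", b"archive", b"policy"]
text = b" ".join(rng.choice(vocab) for _ in range(200_000))

# Synthetic "incompressible" data: random bytes, similar in character to
# encrypted or already-compressed files.
noise = os.urandom(1_000_000)


def passes(data: bytes):
    """Return (original, once-compressed, twice-compressed) sizes in bytes."""
    once = zlib.compress(data)
    twice = zlib.compress(once)  # second pass over already-compressed output
    return len(data), len(once), len(twice)


for label, data in (("text", text), ("random", noise)):
    original, once, twice = passes(data)
    print(f"{label}: original={original} first pass={once} second pass={twice}")
```

The first pass should shrink the word data noticeably, the second pass should gain little or nothing, and the random bytes should not shrink at all. Real results depend on your data and on the actual client and drive algorithms, so a test with your own backups is still the deciding factor.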
 
It's actually set on the client with the COMPRESSION option; see dsmc help compression for more info. Updating the compression setting for the node on the server just gives the node permission to compress or not; it doesn't enable it.

Well, I hunted around, and I see a number of nodes wherein the server has the value: 'Compression: Yes', but when I check the dsm.sys file for the given client, there's no line in any stanza to the effect of: 'compression yes' or even 'compressalways no' or 'yes' or anything with the string 'compression'. Moreover, there's no separate include-exclude file referenced either. Am I to believe, therefore, that compression is not being set on those clients? Or is it the case that it may have been activated from the command line? I see this in the dsmc help:

Options file:
compression yes

Command line:
-compression=no

This option is valid only on the initial command line. It is not valid in interactive mode.

Does this mean that you cannot set 'compression yes' and/or 'compressalways yes' in the dsmc command prompt? How do you set these on the command line?

Also, if you don't have these in the dsm.sys file, and you instead set them on the command line, then the next time you stop/restart the client and/or reboot the machine will you lose these settings and have to reset them? Or are these persistent between restarts/reboots?


It depends. Client side compression takes additional CPU cycles, but reduces the amount of data to transfer over the network. Nothing beats testing in your own environment with compression both enabled and disabled to see what works best for you.

Yes, we will try to test that.
 
Well, I hunted around, and I see a number of nodes wherein the server has the value: 'Compression: Yes',
That means the server has given permission to the client to use compression.
but when I check the dsm.sys file for the given client, there's no line in any stanza to the effect of: 'compression yes' or even 'compressalways no' or 'yes' or anything with the string 'compression'. Moreover, there's no separate include-exclude file referenced either. Am I to believe, therefore, that compression is not being set on those clients?
Correct, it's not set.
Does this mean that you cannot set 'compression yes' and/or 'compressalways yes' in the dsmc command prompt? How do you set these on the command line?
No, it means that when used at the command prompt you can only pass it when you invoke "dsmc -options" (initial), and not at the Protect> prompt (interactive): "Protect> inc -compression=no" would not work. And you only use options in the command line when you want to use something other than what's already set for just this run. You can also put it in the option file as you pasted above.
Also, if you don't have these in the dsm.sys file, and you instead set them on the command line, then the next time you stop/restart the client and/or reboot the machine will you lose these settings and have to reset them? Or are these persistent between restarts/reboots?
Only options in the option file are persistent, options passed at the command prompt are not.
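
If you have many clients to check, the manual search through dsm.sys described above can be scripted. The following is a rough Python sketch, not an IBM tool: the function name, the sample stanza names, and the sample file contents are all hypothetical. It assumes each stanza starts with a servername line, that comment lines start with an asterisk, and that option names are spelled out in full (abbreviated option names are not handled). It looks only at the client options file, so anything configured on the server side is outside its scope.

```python
def compression_settings(dsm_sys_text: str) -> dict:
    """Map each server stanza name to the compression options it sets."""
    settings = {}
    stanza = None
    for raw in dsm_sys_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("*"):  # skip blanks and comment lines
            continue
        parts = line.split(None, 1)
        key = parts[0].lower()  # option names are matched case-insensitively
        value = parts[1].strip() if len(parts) > 1 else ""
        if key == "servername":
            stanza = value
            settings[stanza] = {}
        elif stanza is not None and key in ("compression", "compressalways"):
            settings[stanza][key] = value.lower()
    return settings


# Hypothetical two-stanza file, for illustration only.
SAMPLE = """\
* example dsm.sys with two stanzas
SErvername nodeA_server
   NODename        nodeA
   COMPRESSIon     no

SErvername nodeB_server
   NODename        nodeB
   COMPRESSIon     yes
   COMPRESSAlways  no
"""

for name, opts in compression_settings(SAMPLE).items():
    print(name, opts or "no compression options set")
```

To run it against a real client, read the file contents (for example with open(path).read(), where path is wherever your dsm.sys lives) and pass the text to the function. A stanza that comes back with an empty dictionary has no compression lines in the options file.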
 
Correct, it's not set.

Okay, so unless I missed something then these clients are not compressing their data before sending it to the server. The only way I could see that this could be occurring would be if maybe there was some cron script that did it via the dsmc command, but I see no evidence of this.

No, it means that when used at the command prompt you can only pass it when you invoke "dsmc -options" (initial), and not at the Protect> prompt (interactive): "Protect> inc -compression=no" would not work. And you only use options in the command line when you want to use something other than what's already set for just this run. You can also put it in the option file as you pasted above.

Only options in the option file are persistent, options passed at the command prompt are not.

I've not used the command line. I've only used dsmc interactively, but that answers that and all makes sense.
 