
8.1.12 Dedup Node Tier (Block Size)

dietmar

Active Newcomer
Joined
Mar 12, 2019
Messages
5
Reaction score
1
Points
0
Hi,

Just a quick note, because I think this is a very important update that everyone should know about.

Container pools use dedup tier levels, and the minimum tier level is 50 KB. See the new UPDATE NODE parameter in 8.1.12:

MINIMUMExtentsize:

Specifies the extent size that is used during data deduplication operations for cloud-container storage pools and directory-container storage pools on this node. In most system environments, the default value of 50 KB is appropriate. However, if you plan to deduplicate data from an Oracle or SAP database, and the average extent size is less than 100 KB, you can help optimize performance by specifying a larger extent size. Data in Oracle and SAP databases is typically deduplicated with extent sizes that are much smaller than the default average size of 256 KB. Small extent sizes can negatively affect the performance of backup and expiration operations and can result in unnecessary growth of the IBM Spectrum Protect server database.
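To make the trade-off described above concrete, here is a small sketch (the 100 TB pool size is a hypothetical example, and it simplifies by assuming the average extent size equals the configured minimum) of how the extent count, and with it the number of entries the server database must track, scales with extent size:

```python
# Rough sketch: how many extents (and therefore chunk entries in the
# server database) a pool needs at different average extent sizes.
POOL_BYTES = 100 * 10**12  # 100 TB of stored data (hypothetical example)

def extent_count(pool_bytes: int, avg_extent_bytes: int) -> int:
    """Number of extents the server must track for a pool of this size."""
    return pool_bytes // avg_extent_bytes

small = extent_count(POOL_BYTES, 50 * 1024)    # 50 KB average extents
large = extent_count(POOL_BYTES, 250 * 1024)   # 250 KB average extents

print(f"50 KB extents : {small:,}")
print(f"250 KB extents: {large:,}")
print(f"ratio: {small / large:.1f}x fewer entries at 250 KB")
```

Five times fewer extents at 250 KB means roughly five times fewer chunk pointers for the server database to carry, which is the effect on DB2 page counts described later in this thread.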

We found out this is a big problem if you are going for TDP for VE, because all TDP data ends up at 50 KB (the server only sees blocks in 1 MB sizes, and the client developers apparently do not talk to the server developers).

So if you store something like 100 TB of VM backup data, it gets very ugly to copy/restore it from tape. Even replication is slow in that case. We found this because we use multiple containers and wondered why the performance was so different between them.

This is of course the case if you follow the Blueprint configs with big SATA drives, which give you around 70 IO/s each. In our case we had about 12,000 IOPS in total (if all disks are used in parallel across all LUNs), which does not work well because the software is optimized to fill used containers first (space vs. performance).

We changed the TDP DC nodes to 250 KB and see big improvements in protect and replication performance. We have more VMware storage pool usage now (+12%), and the DB usage decreased a lot, which is also cool. Keep in mind that all chunks will be "new" because of the size change, so the effect depends a lot on the copy group and, in the case of VMware, I think on the last FULL. I expect that the storage pool usage will come down a bit more, but it went up by +25% straight away as we did bulk full backups.

Also keep in mind the case where you have a copy container pool and need to restore it after a disaster. I have seen that the restore of a container spreads nicely across all attached LUNs, but that helps very little if it restores in 50 KB chunks. I have done some real and test disaster restores of containers, and they were always slow compared to what the tape drives could deliver; we thought we could just stream that stuff to disk, but no, it does not work that way... I have seen this on file-level (file server) nodes as well and increased the extent size there too.

There is a SQL query that gives the average size of the chunks for a container pool. With this info plus your backend IOPS, you can make a rough guess how long a restore will take. Of course with all the other optimizations in place, like DB2 on NVMe and a lot of cores with high frequency.

db2 "select avg(cast(length as bigint)) from sd_chunk_locations where poolid=XXXXX for read only with ur"
(takes some time to finish; output is in bytes)

poolid = XXX (from SHOW SDPOOL)
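Once the SQL above has given you an average chunk size, a back-of-the-envelope restore-time guess could look like the sketch below. It simplifies by assuming roughly one backend IO per chunk; the 100 TB pool size and 12,000 IOPS are just the example figures from earlier in this post, not measurements:

```python
def restore_hours(pool_bytes: float, avg_chunk_bytes: float, backend_iops: float) -> float:
    """Very rough restore-time guess: one backend IO per chunk,
    chunks retrieved at the backend's total IOPS."""
    chunks = pool_bytes / avg_chunk_bytes
    seconds = chunks / backend_iops
    return seconds / 3600

# Example: 100 TB pool on a backend delivering ~12,000 IOPS in total
for avg_kb in (50, 250):
    print(f"{avg_kb:>3} KB chunks: ~{restore_hours(100e12, avg_kb * 1024, 12_000):.0f} hours")
```

At 50 KB chunks that comes out around 45 hours, versus roughly 9 hours at 250 KB, which matches the order of magnitude of the slow container restores described in this thread.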

The bottom line, of course, is that dedup is not as good anymore and you therefore need more pool space. But the bigger chunks are much better to process and handle.

In our case, having different pools made it easy to compare. I have gotten no info from IBM Support on how to query the average extent size used per node.

This is also a good point if you have an extra-large DB2. Just changing the TDP DC nodes, which hold approximately 30% of all stored data, reduced the size from around 275,000,000 to 244,906,466 pages. (3 TB+ is no problem to handle; this is just an example.) The DB2 also temporarily grew to 282,000,000 pages.

Just imagine five times fewer pointers in DB2 for TDP data (250 vs. 50), and bigger backup data chunks to process. I think this is a must-have for a well-performing ISP installation.

Hope this helps others.

Br, Dietmar

D&C IT Consulting
 

Rigido

ADSM.ORG Senior Member
Joined
Apr 21, 2006
Messages
140
Reaction score
6
Points
0
Location
Rome, Italy
Hi Dietmar,
I'm going through a "VTL replacement with DC", SP 8.1.12 (maybe 100 soon) on a Power LPAR with AIX 7.2.
The first SP server I'm going to work with is dedicated to SAP HANA backup, and some databases are bigger than 1 TB. Should I increase the extent size? It is not clear to me whether I should set it to 256 K.
 

dietmar

Hi,

This is only the "minimum extent size". So if the source data already produces bigger extents, it would not change anything.

I asked IBM Support to make the minimum a global setting for all nodes, because I find it not useful at ALL to save data in 50 KB chunks in any case. But they did not recommend doing that in an existing install.

Currently I see only one "negative" point in changing it: less dedup and more space used in the containers. But we saw only a 12% increase, and that was where ALL data was 50 KB in size. So if there are mixed tier-level saves in different block sizes, I assume it will be less.

As described, there are a lot of positive things:

* smaller DB size
* much faster processing time for protect to network/tape (maybe even for classic backups/restores)

Maybe "someone" has a better idea of how SAP HANA (backint/prole) saves its files (size) and therefore how they end up in the container.

If I were doing a new installation, I would go for the 250 KB setting for the nodes.

I also find it a good idea to have multiple containers for different "sources", especially if you do a protect local to tape. That way you can at least decide what to restore first from tape. So all my installs at customers have a dedicated pool for VMware or databases.

Also, databases themselves do compression and dedup very little in the ISP container. So such a pool might also be a candidate for disabling dedup/compression, which helps as well....

If you already have a container pool, you can check your average chunk size with the SQL mentioned above... You might even start the new ISP server with the defaults, do some backup tests, and check with the DB2 SQL; you will see the block size used :) .....

Good luck and let us know :).

br, Dietmar
 

Rigido

Thank you for your quick response.
We just started with the activities and had the first backup to the DC pool last night, so I think I can change it with no issues.
Should I run the SELECT from the ISP prompt or from a DB2 command line?
(The ISP prompt tells me the "System" admin is not authorized...).

Ciao!
 

dietmar

This is why we decided to open an IBM SR. The protect performance was also notably different for this pool, which made it a much better candidate to work on. We cannot just repair/restore a pool again for testing just for this case, but it will be done again this year (with increased extent sizes on the nodes).

ANR4982I The repair storage pool process for XXXXXX on
server XXXXXXX from XXXXXX on server XXXXXXXX is
complete. Extents repaired: 119427858 of 119427858.
Extents failed: 0. Extents skipped: 0. Amount repaired:
6,983 GB of 6,983 GB. Amount failed: 0 bytes. Amount
skipped: 0 bytes. Elapsed time: 0 Days, 8 Hours, 3
Minutes. (SESSION: 30)
ANR0986I Process 1119 for Repair Stgpool running in the
BACKGROUND processed 119,427,858 items for a total of
7,498,253,774,403 bytes with a completion state of
SUCCESS at 01:08:42. (SESSION: 30)

LTO7, 6 drives. DB2 on NVMe. Intel Xeon Gold, 48 cores...

Without knowing this, nobody understands why it takes 8 hours to restore 7 TB...

I have done a restore of another pool with bigger chunks and better IOPS on the backend, which did 48 TB in 20 hours (a different customer; an old 5030 which supported RAID 10 on 6 TB SATA). That was good, I think.

You see the bytes and the items/chunks, so you can just divide bytes by items to get the average chunk size used... (Check your current protect stgpool output as well; it should give an idea of the size of the currently protected data, as it also shows items/chunks and GB.)
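The bytes-divided-by-items arithmetic applied to the ANR messages quoted above, as a quick sketch:

```python
# Average chunk size from the repair messages above: total bytes / extents.
total_bytes = 7_498_253_774_403   # "total of 7,498,253,774,403 bytes" (ANR0986I)
extents = 119_427_858             # "Extents repaired: 119427858" (ANR4982I)

avg_bytes = total_bytes / extents
print(f"average chunk size: {avg_bytes:,.0f} bytes (~{avg_bytes / 1024:.0f} KiB)")
```

That works out to roughly 63 KB per chunk, i.e. this pool effectively sits at the 50 KB tier, which fits the 8-hour restore time for 7 TB.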

During this disaster test we saw IOPS of up to 11-13K, which maxed out the total IO of the 192 x 6 TB SATA drives.

The repair/restore of the container does not stream. It seems to take a batch of data to process from DB2, work on it until finished, and then start again until all is done (audit table).

So a clean and reorganized DB2 helps here as well.....

As I have gone through hell, I can just tell you:

* Do DR tests (having new hardware is a good opportunity to DR-test the existing setup)
* Use tape

Br, Dietmar
 

dietmar

db2cmd (or, I guess, su - to the DB2 instance owner on Linux/AIX)

db2 connect to TSMDB1
db2 set schema TSMDB1
db2 "select ...."

...

I just checked the SAP HANA pool that I have access to (not big):

Deduplication Savings: 7,738 G (26,56%)
Compression Savings: 10,754 G (50,27%)
Total Space Saved: 18,492 G (63,48%)

The protect currently running today:

ANR4980I The protect storage pool process for STG_XXX on server XXXXXX to STG_XXX on server XXXXX is complete. Extents protected: 645816 of 645816. Extents failed to protect: 0. Extents deleted: 441030 of 441030. Amount protected: 193 GB of 193 GB. Amount failed: 0 bytes. Amount transferred: 188 GB. Elapsed time: 0 Days, 0 Hours, 19 Minutes. (SESSION: 2633535, PROCESS: 4987)

So the average size is approximately 320 KB here....
(193 GB = 207,232,172,032 bytes / 645,816 extents protected)

Therefore I guess there will be no or very little change if the setting is set to 250 KB....

br, Dietmar
 
