ADSM-L

Re: [ADSM-L] Client-side dedup speed

2017-05-22 13:58:13
Subject: Re: [ADSM-L] Client-side dedup speed
From: Del Hoobler <hoobler AT US.IBM DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Mon, 22 May 2017 13:55:16 -0400
The deduplication best practices paper calls out, in section 1.2.3.2,
that in some cases the client dedup cache will slow backups down. Then,
in section 4.3.1, it recommends setting the enablededupcache option to
'no' in all cases except for backups across high-latency networks, and
always setting it to 'no' for applications that use the Spectrum
Protect API.


Del
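
For readers who want to try the settings discussed in this thread, here
is a minimal sketch of a client option-file stanza. It assumes a
Linux/UNIX backup-archive client whose options live in dsm.sys; the
stanza name "tsmserver1" is hypothetical, and the RESOURCEUTILIZATION
value is simply the one used in the test quoted below. Check the client
documentation for your version before relying on it.

```
* dsm.sys, inside the server stanza (stanza name is hypothetical)
SErvername tsmserver1
* Deduplicate on the client before sending data to the server
   DEDUPLICATION        YES
* Do not consult the local dedup cache; query the server instead
   ENABLEDEDUPCACHE     NO
* Allow more parallel sessions (the test below raised this from 2 to 5)
   RESOURCEUTILIZATION  5
```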


> ----- Forwarded by Del Hoobler/Endicott/IBM on 05/21/2017 01:33 PM -----
> 
> From: "Loon, Eric van (ITOPT3) - KLM" <Eric-van.Loon AT KLM DOT COM>
> To: ADSM-L AT VM.MARIST DOT EDU
> Date: 05/19/2017 07:28 AM
> Subject: Re: Client-side dedup speed
> Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
> 
> Hi Stefan!
> Thanks for your reply!
> I have done some testing on this client and discovered that the 
> local deduplication cache is the bottleneck here. Apparently, going 
> through the local cache is only beneficial as long as the cache does 
> not grow too large. As soon as I disabled the local cache, backup 
> performance improved drastically. When I also raised 
> RESOURCEUTILIZATION (from 2 to 5), performance improved even more!
> I'm a happy guy again. For those of you who are interested, I have 
> included the statistics of all tests below.
> Kind regards,
> Eric van Loon
> Air France/KLM Storage Engineering
> 
> Server-side deduplication:
> Total number of objects backed up:    1,926,689
> Total number of bytes transferred:       227.43 GB
> Total data reduction ratio:                0.05%
> Elapsed processing time:               03:54:34
> 
> Client-side deduplication:
> Total number of objects backed up:    1,934,305
> Total objects deduplicated:             656,414
> Total bytes before deduplication:        229.95 GB
> Total bytes after deduplication:           1.31 GB
> Deduplication reduction:                  99.44%
> Elapsed processing time:               12:02:03
> 
> Client-side deduplication, ENABLEDEDUPCACHE NO:
> Total number of objects backed up:    1,956,018
> Total objects deduplicated:             661,502
> Total bytes before deduplication:        220.51 GB
> Total bytes after deduplication:              0  B
> Deduplication reduction:                 100.00%
> Total data reduction ratio:               99.67%
> Elapsed processing time:               04:20:33
> 
> Client-side deduplication, ENABLEDEDUPCACHE NO RESOURCEUTILIZATION 5: 
> Total number of objects backed up:    1,973,335
> Total objects deduplicated:             670,950
> Total bytes before deduplication:        226.77 GB
> Total bytes after deduplication:              0  B
> Deduplication reduction:                 100.00%
> Total data reduction ratio:               99.68%
> Elapsed processing time:               02:12:22
> 
> 
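
To make the four runs above easier to compare, the following Python
sketch converts each one into an effective rate in GB per hour. It is
only an illustration: it uses "bytes transferred" as the data volume for
the server-side run and "bytes before deduplication" for the client-side
runs, and the helper names are made up for this example.

```python
# Effective backup rate per test run, using the figures quoted above.

def to_seconds(hms: str) -> int:
    """Convert an 'HH:MM:SS' elapsed time into seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

# (label, GB of source data processed, elapsed processing time)
RUNS = [
    ("server-side dedup", 227.43, "03:54:34"),
    ("client-side dedup, cache enabled", 229.95, "12:02:03"),
    ("client-side dedup, ENABLEDEDUPCACHE NO", 220.51, "04:20:33"),
    ("client-side dedup, cache off, RESOURCEUTILIZATION 5", 226.77, "02:12:22"),
]

def gb_per_hour(gigabytes: float, elapsed: str) -> float:
    """Source data processed per hour of elapsed time."""
    return gigabytes / to_seconds(elapsed) * 3600

def summarize(runs=RUNS):
    """Return (label, GB/h rounded to one decimal) for each run."""
    return [(label, round(gb_per_hour(gb, t), 1)) for label, gb, t in runs]

if __name__ == "__main__":
    for label, rate in summarize():
        print(f"{label:52s} {rate:6.1f} GB/h")
```

By this measure the runs come out at roughly 58, 19, 51 and 103 GB/h, so
disabling the cache brings client-side dedup close to the server-side
rate, and the extra sessions roughly double it.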
> -----Original Message-----
> From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On 
> Behalf Of Stefan Folkerts
> Sent: Saturday, May 13, 2017 20:40
> To: ADSM-L AT VM.MARIST DOT EDU
> Subject: Re: Client-side dedup speed
> 
> Hi Eric,
> 
> I've seen client-side dedup be slower on 1Gb/s networks every time I
> test it, every single time.
> It doesn't surprise me at all. Checking whether a chunk is already in 
> Spectrum Protect means sending the query over the network (network 
> latency is the big one), reading from the database, and then going all 
> the way back to the client. That has got to be a LOT slower than doing 
> it all in the server, at least when the local cache isn't very 
> effective, and with two million files I doubt it is very effective 
> (but I don't know this; I haven't done enough testing with different 
> data source types).
> Network latency on a round trip alone probably adds a lot of time 
> (percentage-wise) compared with a local SSD database read: SSDs have a 
> latency of 0.0something ms, and everything stays in compute with 
> server-side dedup.
> 
> I only use client-side dedup over WAN networks, or when I test and 
> see that the speeds are comparable. I think that would be due 
> to a very low latency network, but I wonder if others here have done 
> more testing on this.
> I believe that under very heavy loads you can get more into 
> the server in a night of backups, because the clients do some of the 
> work and the load on the server is smaller. But again, I only use it 
> when sending data through WAN connections; for those scenarios it's 
> perfect, especially in combination with 8.1 B/A client-side compression.
> 
> 
> 
> 
> On Fri, May 12, 2017 at 1:52 PM, Loon, Eric van (ITOPT3) - KLM
> <Eric-van.Loon AT klm DOT com> wrote:
> 
> > Dear TSM-ers,
> > We are almost in production with our first directory container pool 
> > TSM server. We did a lot of testing with client-side vs. server-side 
> > dedup, and overall client-side was faster. But not on one of our 
> > Linux clients. This machine has a filesystem with nearly two million 
> > files; the total amount of data is 227 GB. The server backs up 
> > through 1Gb Ethernet, and with server-side dedup the backup takes 
> > 03:54. With client-side dedup enabled, the backup runs for 12:02! 
> > The client has 16 CPUs and average utilization is no more than 10%. 
> > There is no swapping during a backup.
> > I opened a PMR, but the support office is trying to convince me that 
> > client-side dedup is slower (twice as slow) by default. I really have 
> > a very hard time believing this. What is your experience with 
> > client-side dedup in combination with a large number of smaller 
> > (approx. 1 MB) files?
> > Thanks for any help in advance!
> > Kind regards,
> > Eric van Loon
> > Air France/KLM Storage Engineering
> > ********************************************************
> > For information, services and offers, please visit our web site:
> > http://www.klm.com. This e-mail and any attachment may contain 
> > confidential and privileged material intended for the addressee only. 
> > If you are not the addressee, you are notified that no part of the 
> > e-mail or any attachment may be disclosed, copied or distributed, and 
> > that any other action related to this e-mail or attachment is strictly 
> > prohibited, and may be unlawful. If you have received this e-mail by 
> > error, please notify the sender immediately by return e-mail, and 
> > delete this message.
> >
> > Koninklijke Luchtvaart Maatschappij NV (KLM), its subsidiaries and/or 
> > its employees shall not be liable for the incorrect or incomplete 
> > transmission of this e-mail or any attachments, nor responsible 
> > for any delay in receipt.
> > Koninklijke Luchtvaart Maatschappij N.V. (also known as KLM Royal 
> > Dutch Airlines) is registered in Amstelveen, The Netherlands, with 
> > registered number 33014286
> > ********************************************************
> >
