ADSM-L

Re: [ADSM-L] Client-side dedup speed

2017-05-27 18:07:49
Subject: Re: [ADSM-L] Client-side dedup speed
From: Del Hoobler <hoobler AT US.IBM DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Sat, 27 May 2017 18:06:31 -0400
Hi Robert,

These recommendations are for directory and cloud container pools, which 
always use deduplication.  They also happen to be our recommendations 
for legacy FILE deduplication. 


Del

----------------------------------------------------

"ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU> wrote on 05/23/2017 
12:52:07 AM:

> From: "rouzen AT univ.haifa.ac DOT il" <rouzen AT UNIV.HAIFA.AC DOT IL>
> To: ADSM-L AT VM.MARIST DOT EDU
> Date: 05/23/2017 12:52 AM
> Subject: Re: Client-side dedup speed
> Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
> 
> Hello Del
> 
> Does this also work with storage type DIRECTORY when Deduplicate 
> Data is set to YES? 
> 
> TSM server Version 8.1.1.0
> 
> Best Regards Robert
> 
> -----Original Message-----
> From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On 
> Behalf Of Del Hoobler
> Sent: Monday, May 22, 2017 8:55 PM
> To: ADSM-L AT VM.MARIST DOT EDU
> Subject: Re: [ADSM-L] Client-side dedup speed
> 
> In the deduplication best practices paper it is called out that in 
> some cases the client dedup cache will slow backups down. This is 
> mentioned in section 1.2.3.2.  Then in section 4.3.1, the 
> enablededupcache option is recommended to be set to 'no' in all 
> cases except for backups across high-latency networks, and is 
> recommended to always be set to 'no' for applications which use the 
> Spectrum Protect API.
> 
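As a rough illustration of the recommendation above, the following Python sketch checks the text of a client options file against it. The option names ENABLEDEDUPCACHE and RESOURCEUTILIZATION come from this thread; the sample file contents, the function names, and the parsing rules (one "NAME VALUE" pair per line, lines starting with "*" treated as comments) are assumptions made for the example, not a description of how the client itself reads its options.

```python
# Recommended value taken from the thread; everything else is illustrative.
RECOMMENDED = {"ENABLEDEDUPCACHE": "NO"}


def parse_options(text):
    """Parse 'NAME VALUE' lines; skip blank lines and '*' comment lines."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("*"):
            continue
        parts = line.split(None, 1)
        name = parts[0].upper()
        value = parts[1].strip() if len(parts) > 1 else ""
        options[name] = value
    return options


def check_options(text, recommended=RECOMMENDED):
    """Return (name, found, wanted) for each option that is missing or differs."""
    options = parse_options(text)
    problems = []
    for name, wanted in recommended.items():
        found = options.get(name)
        if found is None or found.upper() != wanted:
            problems.append((name, found, wanted))
    return problems


# Hypothetical options file contents, for demonstration only.
SAMPLE = """\
* hypothetical client options file
DEDUPLICATION       YES
ENABLEDEDUPCACHE    YES
RESOURCEUTILIZATION 5
"""

if __name__ == "__main__":
    for name, found, wanted in check_options(SAMPLE):
        print(f"{name}: found {found!r}, thread recommends {wanted}")
```

An option that is absent from the file is reported as well, since the sketch makes no assumption about what the client would default to.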
> 
> Del
> 
> 
> > ----- Forwarded by Del Hoobler/Endicott/IBM on 05/21/2017 01:33 PM 
> > -----
> > 
> > From: "Loon, Eric van (ITOPT3) - KLM" <Eric-van.Loon AT KLM DOT COM>
> > To: ADSM-L AT VM.MARIST DOT EDU
> > Date: 05/19/2017 07:28 AM
> > Subject: Re: Client-side dedup speed
> > Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
> > 
> > Hi Stefan!
> > Thanks for your reply!
> > I have done some testing on this client and I discovered that the 
> > local deduplication cache is the bottleneck here. Apparently going 
> > through the local cache is only beneficial as long as it does not get 
> > too large. As soon as I disabled the local cache, backup performance 
> > improved drastically. When RESOURCEUTILIZATION was raised 
> > (from 2 to 5), performance improved even more!
> > I'm a happy guy again. For those of you interested, I have included the 
> > statistics of all tests below.
> > Kind regards,
> > Eric van Loon
> > Air France/KLM Storage Engineering
> > 
> > Server-side deduplication:
> > Total number of objects backed up:    1,926,689
> > Total number of bytes transferred:       227.43 GB
> > Total data reduction ratio:                0.05%
> > Elapsed processing time:               03:54:34
> > 
> > Client-side deduplication:
> > Total number of objects backed up:    1,934,305
> > Total objects deduplicated:             656,414
> > Total bytes before deduplication:        229.95 GB
> > Total bytes after deduplication:           1.31 GB
> > Deduplication reduction:                  99.44%
> > Elapsed processing time:               12:02:03
> > 
> > Client-side deduplication, ENABLEDEDUPCACHE NO:
> > Total number of objects backed up:    1,956,018
> > Total objects deduplicated:             661,502
> > Total bytes before deduplication:        220.51 GB
> > Total bytes after deduplication:              0  B
> > Deduplication reduction:                 100.00%
> > Total data reduction ratio:               99.67%
> > Elapsed processing time:               04:20:33
> > 
> > Client-side deduplication, ENABLEDEDUPCACHE NO RESOURCEUTILIZATION 5: 
> > Total number of objects backed up:    1,973,335
> > Total objects deduplicated:             670,950
> > Total bytes before deduplication:        226.77 GB
> > Total bytes after deduplication:              0  B
> > Deduplication reduction:                 100.00%
> > Total data reduction ratio:               99.68%
> > Elapsed processing time:               02:12:22
> > 
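To make the four runs above easier to compare, here is a small Python sketch that turns each elapsed time and data volume into an approximate rate. The figures are copied from the statistics above; treating 1 GB as 1024 MB and using "bytes transferred" for the server-side run versus "bytes before deduplication" for the client-side runs are assumptions of this example. For the client-side runs the result is therefore a rate of data processed, not of data sent over the wire.

```python
# (label, gigabytes processed, elapsed hh:mm:ss), copied from the statistics above.
RUNS = [
    ("server-side dedup", 227.43, "03:54:34"),
    ("client-side dedup", 229.95, "12:02:03"),
    ("client-side, ENABLEDEDUPCACHE NO", 220.51, "04:20:33"),
    ("client-side, ENABLEDEDUPCACHE NO, RESOURCEUTILIZATION 5", 226.77, "02:12:22"),
]


def to_seconds(hms):
    """Convert an 'hh:mm:ss' string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def rate_mb_per_s(gigabytes, hms):
    """Average rate in MB/s, assuming 1 GB = 1024 MB."""
    return gigabytes * 1024 / to_seconds(hms)


if __name__ == "__main__":
    baseline = to_seconds(RUNS[0][2])
    for label, gigabytes, elapsed in RUNS:
        rate = rate_mb_per_s(gigabytes, elapsed)
        relative = to_seconds(elapsed) / baseline
        print(f"{label}: {rate:.2f} MB/s, {relative:.2f}x the server-side elapsed time")
```

Under those assumptions the server-side run works out to roughly 16.5 MB/s, the first client-side run to roughly 5.4 MB/s, and the final run to roughly 29.2 MB/s.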
> > 
> > -----Original Message-----
> > From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On 
> > Behalf Of Stefan Folkerts
> > Sent: Saturday, May 13, 2017 20:40
> > To: ADSM-L AT VM.MARIST DOT EDU
> > Subject: Re: Client-side dedup speed
> > 
> > Hi Eric,
> > 
> > I've seen client-side dedup be slower on 1 Gb/s networks every time I 
> > test it, every single time.
> > It doesn't surprise me at all, just looking at the added latency to 
> > check whether a chunk is already in Spectrum Protect or not: sending it 
> > via the network, network latency (a big one), reading from the database 
> > and then all the way back to the client has got to be a LOT slower than 
> > doing it all in the server, at least when the local cache isn't very 
> > effective, and with two million files I doubt it is very effective (but 
> > I don't know this; I haven't done enough testing with different data 
> > source types).
> > Network latency on a round trip alone probably adds a lot of time 
> > (percentage-wise) compared to just a local SSD database read. SSDs have 
> > a latency of 0.0something ms and everything stays in compute with 
> > server-side dedup.
> > 
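The latency argument above can be put into a back-of-envelope model. The Python sketch below assumes, purely for illustration, that every lookup costs one full network round trip and that lookups happen strictly one after another. The lookup count and the round-trip times are hypothetical values, not measurements from this thread, and the real client may batch or parallelize its queries.

```python
def serial_lookup_overhead_s(lookups, round_trip_ms):
    """Seconds spent waiting if each lookup costs one round trip, back to back."""
    return lookups * round_trip_ms / 1000.0


# Hypothetical workload: one lookup per file for about two million files.
HYPOTHETICAL_LOOKUPS = 2_000_000

if __name__ == "__main__":
    # Hypothetical round-trip times in milliseconds, from very fast to WAN-like.
    for round_trip_ms in (0.05, 0.5, 5.0):
        seconds = serial_lookup_overhead_s(HYPOTHETICAL_LOOKUPS, round_trip_ms)
        print(f"{round_trip_ms} ms per round trip: {seconds / 60:.1f} minutes of waiting")
```

The point of the model is only that the waiting time scales linearly with both the number of lookups and the round-trip time, so a workload with many small files is the most exposed to it.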
> > I only use client-side dedup with WAN networks, or when I test and 
> > see that the speeds are comparable. I think that would be due to 
> > a very low-latency network, but I wonder if others here have done more 
> > testing on this.
> > I believe that when you have very heavy loads you can put more into the 
> > server in a night of backups, because there is a smaller load on the 
> > server since the clients do some of the work. But again, I only use it 
> > when sending data through WAN connections; for those scenarios it's 
> > perfect, especially in combination with 8.1 B/A client-side compression.
> > 
> > 
> > 
> > 
> > On Fri, May 12, 2017 at 1:52 PM, Loon, Eric van (ITOPT3) - KLM < 
> > Eric-van.Loon AT klm DOT com> wrote:
> > 
> > > Dear TSM-ers,
> > > We are almost in production with our first directory container pool 
> > > TSM server. We did a lot of testing with client vs server-side dedup 
> > > and overall client-side was faster. But not on one of our Linux clients.
> > > This machine has a filesystem with nearly two million files; the total 
> > > amount of data is 227 GB. The server backs up through 1Gb Ethernet, and 
> > > with server-side dedup the backup takes 03:54. With client-side 
> > > dedup enabled, the backup runs for 12:02! The client contains 16 
> > > CPUs and average utilization is no more than 10%. No swapping 
> > > during a backup.
> > > I opened a PMR, but the support office tries to convince me that 
> > > client-side dedup is slower (twice as slow) by default. I really 
> > > have a very hard time believing this. What is your experience with 
> > > client-side dedup in combination with a large amount of smaller 
> > > (approx. 1 Mb) files?
> > > Thanks for any help in advance!
> > > Kind regards,
> > > Eric van Loon
> > > Air France/KLM Storage Engineering
> > > ********************************************************
> > > For information, services and offers, please visit our web site:
> > > http://www.klm.com. This e-mail and any attachment may contain 
> > > confidential and privileged material intended for the addressee 
> > > only. If you are not the addressee, you are notified that no part of 
> > > the e-mail or any attachment may be disclosed, copied or distributed, 
> > > and that any other action related to this e-mail or attachment is 
> > > strictly prohibited, and may be unlawful. If you have received this 
> > > e-mail by error, please notify the sender immediately by return 
> > > e-mail, and delete this message.
> > >
> > > Koninklijke Luchtvaart Maatschappij NV (KLM), its subsidiaries 
> > > and/or its employees shall not be liable for the incorrect or 
> > > incomplete transmission of this e-mail or any attachments, nor 
> > > responsible for any delay in receipt.
> > > Koninklijke Luchtvaart Maatschappij N.V. (also known as KLM Royal 
> > > Dutch Airlines) is registered in Amstelveen, The Netherlands, with 
> > > registered number 33014286
> > > ********************************************************
> > >
> 
