ADSM-L

Re: [ADSM-L] Client-side dedup speed

2017-05-28 03:20:30
Subject: Re: [ADSM-L] Client-side dedup speed
From: adsm consulting <adsmcons AT GMAIL DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Sun, 28 May 2017 09:18:03 +0200
Hi,

But in a VE backup, for example (with a legacy dedup storage pool, not a
container pool), I use client-side dedup, which also combines with
compression for the best overall space savings.
With server-side deduplication I completely lose the benefits of
compression.

On Sun, May 28, 2017 at 12:06 AM, Del Hoobler <hoobler AT us.ibm DOT com> wrote:

> Hi Robert,
>
> These recommendations are for directory and cloud container pools, which
> always use deduplication.  They also happen to be our recommendations
> for legacy FILE deduplication.
>
>
> Del
>
> ----------------------------------------------------
>
> "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU> wrote on 05/23/2017
> 12:52:07 AM:
>
> > From: "rouzen AT univ.haifa.ac DOT il" <rouzen AT UNIV.HAIFA.AC DOT IL>
> > To: ADSM-L AT VM.MARIST DOT EDU
> > Date: 05/23/2017 12:52 AM
> > Subject: Re: Client-side dedup speed
> > Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
> >
> > Hello Del
> >
> > Does this also apply to a storage pool of type DIRECTORY on which
> > "deduplicate data" is set to YES?
> >
> > TSM server Version 8.1.1.0
> >
> > Best Regards Robert
> >
> > -----Original Message-----
> > From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On
> > Behalf Of Del Hoobler
> > Sent: Monday, May 22, 2017 8:55 PM
> > To: ADSM-L AT VM.MARIST DOT EDU
> > Subject: Re: [ADSM-L] Client-side dedup speed
> >
> > The deduplication best practices paper calls out that in
> > some cases the client dedup cache will slow backups down. This is
> > mentioned in section 1.2.3.2.  Then in section 4.3.1, the
> > enablededupcache option is recommended to be set to 'no' in all
> > cases except for backups across high latency networks, and is
> > recommended to always be set to 'no' for applications which use the
> > Spectrum Protect API.
> >
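
For anyone who wants to try this, a minimal sketch of how the options discussed in this thread might look in a Linux client's dsm.sys. The stanza name is hypothetical, the RESOURCEUTILIZATION value is simply the one from Eric's last test run quoted below, and none of this is taken from the best practices paper itself, so check the values against your own environment:

```
* dsm.sys (Linux backup-archive client), options inside the server stanza
* "tsmserver1" is a hypothetical stanza name used for illustration only
SErvername  tsmserver1
   DEDUPLICATION        YES
   ENABLEDEDUPCACHE     NO
   RESOURCEUTILIZATION  5
```
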
> >
> > Del
> >
> >
> > > ----- Forwarded by Del Hoobler/Endicott/IBM on 05/21/2017 01:33 PM -----
> > >
> > > From: "Loon, Eric van (ITOPT3) - KLM" <Eric-van.Loon AT KLM DOT COM>
> > > To: ADSM-L AT VM.MARIST DOT EDU
> > > Date: 05/19/2017 07:28 AM
> > > Subject: Re: Client-side dedup speed
> > > Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
> > >
> > > Hi Stefan!
> > > Thanks for your reply!
> > > I have done some testing on this client and I discovered that the
> > > local deduplication cache is the bottleneck here. Apparently going
> > > through the local cache is only beneficial as long as it does not
> > > get too large. As soon as I disabled the local cache, backup
> > > performance improved drastically. When RESOURCEUTILIZATION was
> > > raised (from 2 to 5), performance improved even more!
> > > I'm a happy guy again. For those of you who are interested, I have
> > > included the statistics of all tests below.
> > > Kind regards,
> > > Eric van Loon
> > > Air France/KLM Storage Engineering
> > >
> > > Server-side deduplication:
> > > Total number of objects backed up:    1,926,689
> > > Total number of bytes transferred:       227.43 GB
> > > Total data reduction ratio:                0.05%
> > > Elapsed processing time:               03:54:34
> > >
> > > Client-side deduplication:
> > > Total number of objects backed up:    1,934,305
> > > Total objects deduplicated:             656,414
> > > Total bytes before deduplication:        229.95 GB
> > > Total bytes after deduplication:           1.31 GB
> > > Deduplication reduction:                  99.44%
> > > Elapsed processing time:               12:02:03
> > >
> > > Client-side deduplication, ENABLEDEDUPCACHE NO:
> > > Total number of objects backed up:    1,956,018
> > > Total objects deduplicated:             661,502
> > > Total bytes before deduplication:        220.51 GB
> > > Total bytes after deduplication:              0  B
> > > Deduplication reduction:                 100.00%
> > > Total data reduction ratio:               99.67%
> > > Elapsed processing time:               04:20:33
> > >
> > > Client-side deduplication, ENABLEDEDUPCACHE NO RESOURCEUTILIZATION 5:
> > > Total number of objects backed up:    1,973,335
> > > Total objects deduplicated:             670,950
> > > Total bytes before deduplication:        226.77 GB
> > > Total bytes after deduplication:              0  B
> > > Deduplication reduction:                 100.00%
> > > Total data reduction ratio:               99.68%
> > > Elapsed processing time:               02:12:22
> > >
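
To make the four runs above easier to compare, here is a small Python sketch that turns each run's data volume and elapsed time into an effective throughput. The figures are copied from the statistics above. The assumption that "GB" means 1024 MB is mine, and for the server-side run the volume is "bytes transferred" while for the client-side runs it is "bytes before deduplication", so this is source data processed per second, not network traffic.

```python
# Figures copied from the four test runs quoted above.
# Assumption (not stated in the thread): 1 GB = 1024 MB.
RUNS = [
    ("server-side dedup", 227.43, "03:54:34"),
    ("client-side dedup, cache on", 229.95, "12:02:03"),
    ("client-side dedup, ENABLEDEDUPCACHE NO", 220.51, "04:20:33"),
    ("client-side dedup, cache off, RESOURCEUTILIZATION 5", 226.77, "02:12:22"),
]


def elapsed_seconds(hhmmss: str) -> int:
    """Convert an 'HH:MM:SS' elapsed time into seconds."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def throughput_mb_per_s(gigabytes: float, hhmmss: str) -> float:
    """Source data processed per second of elapsed time, in MB/s."""
    return gigabytes * 1024 / elapsed_seconds(hhmmss)


if __name__ == "__main__":
    for label, gigabytes, elapsed in RUNS:
        rate = throughput_mb_per_s(gigabytes, elapsed)
        print(f"{label:55s} {rate:6.1f} MB/s")
```

Under those assumptions the runs work out to roughly 16.5, 5.4, 14.4 and 29.2 MB/s respectively, which matches the picture in the text: the cached client-side run is about three times slower than server-side, and the uncached run with RESOURCEUTILIZATION 5 is the fastest.
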
> > >
> > > -----Original Message-----
> > > From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On
> > > Behalf Of Stefan Folkerts
> > > Sent: Saturday, May 13, 2017 20:40
> > > To: ADSM-L AT VM.MARIST DOT EDU
> > > Subject: Re: Client-side dedup speed
> > >
> > > Hi Eric,
> > >
> > > I've seen client-side dedup be slower on 1Gb/s networks every time I
> > > test it, every single time.
> > > It doesn't surprise me at all. Just look at the added latency of
> > > checking whether a chunk is already in Spectrum Protect: sending the
> > > query over the network, network latency (the big one), reading from
> > > the database, and then all the way back to the client. That has got
> > > to be a LOT slower than doing it all in the server, at least when
> > > the local cache isn't very effective, and with two million files I
> > > doubt it is very effective (but I don't know this; I haven't done
> > > enough testing with different data source types).
> > > Network latency on a round trip alone probably adds a lot of time
> > > (percentage-wise) compared to a local SSD database read. SSDs have a
> > > latency of 0.0something ms, and everything stays in compute with
> > > server-side dedup.
> > >
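
To put rough numbers on the round-trip argument above, here is a back-of-the-envelope sketch in Python. It assumes one synchronous lookup per chunk, which is a simplification and not a description of how the client actually batches its queries. The chunk count and the latencies are hypothetical values picked for illustration, and `lookup_overhead_hours` is a name made up for this example.

```python
def lookup_overhead_hours(chunks: int, latency_ms: float, streams: int = 1) -> float:
    """Hours spent waiting on chunk lookups.

    Simplified model: every chunk costs one synchronous lookup of
    `latency_ms`, and the lookups are spread evenly over `streams`
    parallel streams.
    """
    if chunks < 0 or latency_ms < 0 or streams < 1:
        raise ValueError("chunks and latency_ms must be >= 0, streams >= 1")
    return chunks * latency_ms / 1000 / 3600 / streams


if __name__ == "__main__":
    chunks = 2_000_000  # hypothetical: at least one chunk per file
    # Hypothetical latencies: local SSD read vs. LAN and WAN round trips.
    for label, latency_ms in [("local SSD", 0.05), ("LAN", 1.0), ("WAN", 20.0)]:
        hours = lookup_overhead_hours(chunks, latency_ms)
        print(f"{label:10s} {latency_ms:6.2f} ms per lookup -> {hours:6.2f} h")
```

The point of the model is only that lookup wait time scales linearly with the per-lookup latency, so a round trip that is 20 times slower than a local read costs 20 times as much waiting unless lookups are cached, batched, or run in parallel.
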
> > > I only use client-side dedup over WAN networks, or when I test and
> > > see that the speeds are comparable. I think that would be due to a
> > > very low latency network, but I wonder if others here have done more
> > > testing on this.
> > > I believe that under very heavy loads you can fit more into the
> > > server in a night of backups, because the clients do some of the
> > > work and the load on the server is smaller. But again, I only use it
> > > when sending data through WAN connections; for those scenarios it's
> > > perfect, especially in combination with 8.1 B/A client-side
> > > compression.
> > >
> > >
> > >
> > >
> > > On Fri, May 12, 2017 at 1:52 PM, Loon, Eric van (ITOPT3) - KLM <
> > > Eric-van.Loon AT klm DOT com> wrote:
> > >
> > > > Dear TSM-ers,
> > > > We are almost in production with our first directory container
> > > > pool TSM server. We did a lot of testing with client-side vs.
> > > > server-side dedup, and overall client-side was faster. But not on
> > > > one of our Linux clients.
> > > > This machine has a filesystem with nearly two million files; the
> > > > total amount of data is 227 GB. The server backs up through 1Gb
> > > > Ethernet, and with server-side dedup the backup takes 03:54. With
> > > > client-side dedup enabled, the backup runs for 12:02! The client
> > > > has 16 CPUs and average utilization is no more than 10%. No
> > > > swapping during a backup.
> > > > I opened a PMR, but the support office is trying to convince me
> > > > that client-side dedup is slower (twice as slow) by default. I
> > > > really have a very hard time believing this. What is your
> > > > experience with client-side dedup in combination with a large
> > > > number of smaller (approx. 1 Mb) files?
> > > > Thanks for any help in advance!
> > > > Kind regards,
> > > > Eric van Loon
> > > > Air France/KLM Storage Engineering
> > > > ********************************************************
> > > > For information, services and offers, please visit our web site:
> > > > http://www.klm.com. This e-mail and any attachment may contain
> > > > confidential and privileged material intended for the addressee
> > > > only. If you are not the addressee, you are notified that no part
> > > > of the e-mail or any attachment may be disclosed, copied or
> > > > distributed, and that any other action related to this e-mail or
> > > > attachment is strictly prohibited, and may be unlawful. If you
> > > > have received this e-mail by error, please notify the sender
> > > > immediately by return e-mail, and delete this message.
> > > >
> > > > Koninklijke Luchtvaart Maatschappij NV (KLM), its subsidiaries
> > > > and/or its employees shall not be liable for the incorrect or
> > > > incomplete transmission of this e-mail or any attachments, nor
> > > > responsible for any delay in receipt.
> > > > Koninklijke Luchtvaart Maatschappij N.V. (also known as KLM Royal
> > > > Dutch Airlines) is registered in Amstelveen, The Netherlands, with
> > > > registered number 33014286
> > > > ********************************************************
> > > >
> >
>
