ADSM-L

Re: [ADSM-L] protect pool plus replicate node equals poor replication efficiencies

2016-09-15 15:21:26
Subject: Re: [ADSM-L] protect pool plus replicate node equals poor replication efficiencies
From: Stefan Folkerts <stefan.folkerts AT GMAIL DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Thu, 15 Sep 2016 21:17:13 +0200
There is a relatively new command called "protect stgpool" that handles the
data part of replication, which the replicate node command used to do.
After that you only run replicate node to replicate the metadata.
So you basically split the replication of data and metadata into two
processes, and the overall time is shorter because protect stgpool
transfers data faster than the replicate node command did (and does;
you can still use replicate node for both data and metadata).
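
To put the statistics from the ANR0327I message quoted further down in this
thread into perspective, here is a minimal Python sketch that pulls the
amounts and the elapsed time out of that message and works out what fraction
of the "replicated" amount was reported as transferred, and the average rate
that implies. The function names are made up for illustration, and it assumes
both amounts are reported in GB, as they are in that one message; other
messages may use other units.

```python
import re

# ANR0327I text as quoted in this thread, with the mail line wraps removed.
MESSAGE = (
    "ANR0327I Replication of node NODENAME completed. "
    "Files current: 70,341. Files replicated: 752 of 752. "
    "Files updated: 602 of 602. Files deleted: 692 of 692. "
    "Amount replicated: 12,487 GB of 12,487 GB. "
    "Amount transferred: 846 GB. "
    "Elapsed time: 0 Days, 4 Hours, 28 Minutes."
)


def to_int(text):
    """Convert a number with thousands separators, e.g. '12,487', to int."""
    return int(text.replace(",", ""))


def summarize(message):
    """Return (fraction_on_wire, average_mb_per_s) for one ANR0327I message.

    Assumption: both amounts are reported in GB, as in the message above.
    """
    replicated = to_int(
        re.search(r"Amount replicated: ([\d,]+) GB", message).group(1)
    )
    transferred = to_int(
        re.search(r"Amount transferred: ([\d,]+) GB", message).group(1)
    )
    days, hours, minutes = map(
        int,
        re.search(
            r"Elapsed time: (\d+) Days, (\d+) Hours, (\d+) Minutes", message
        ).groups(),
    )
    seconds = ((days * 24 + hours) * 60 + minutes) * 60
    fraction_on_wire = transferred / replicated
    # Decimal units: 1 GB = 1000 MB.
    average_mb_per_s = transferred * 1000 / seconds
    return fraction_on_wire, average_mb_per_s


if __name__ == "__main__":
    fraction, rate = summarize(MESSAGE)
    print(f"on the wire: {fraction:.1%} of the replicated amount")
    print(f"average rate: {rate:.1f} MB/s")
```

For the message in this thread that comes out at a bit under 7% of the
replicated amount, at an average of roughly 53 MB/s over the 4 hours and 28
minutes.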

On Thursday, 15 September 2016, Ryder, Michael S <michael_s.ryder AT roche DOT com>
wrote:

> The protected storage pool can be on a different server?
>
> On Thursday, September 15, 2016, Stefan Folkerts <stefan.folkerts AT gmail DOT com>
> wrote:
>
> > Not if you run protect stgpool before you run replnode.
> > Then the protect stgpool will send the data and the replnode will transmit
> > only metadata of the nodes in that storage pool.
> > If you replicate nodes that have data in other storage pools, then yes, it
> > will replicate that data.
> > Replicating metadata also puts data on the line of course, but it's not
> > backup data, it's backup metadata.
> >
> > On Thu, Sep 15, 2016 at 6:50 PM, Ryder, Michael S <michael_s.ryder AT roche DOT com>
> > wrote:
> >
> > > If you replnode from one server to another... it *has* to send the data
> > > that changed, no?
> > >
> > > Best regards,
> > >
> > > Mike <http://rbbuswiki.bbg.roche.com/wiki/ryderm_page:start>, x7942
> > > RMD IT Client Services
> > > <http://na.intranet.roche.com/sites/RMD/content/Departments/IT/Pages/default.aspx>
> > >
> > > On Thu, Sep 15, 2016 at 12:41 PM, Stefan Folkerts <stefan.folkerts AT gmail DOT com>
> > > wrote:
> > >
> > > > >Do I have this right so far?
> > > >
> > > > No, I think he is under the impression the data is sent twice. It looks
> > > > that way a little because of how Spectrum Protect reports on the
> > > > replication process, but it's representing the data, not actually
> > > > sending it; it is only sending metadata of that data.
> > > >
> > > >
> > > > On Thu, Sep 15, 2016 at 6:24 PM, Ryder, Michael S <michael_s.ryder AT roche DOT com>
> > > > wrote:
> > > >
> > > > > Something doesn't make sense.
> > > > >
> > > > > You run a backup - node's data is stored in a pool on server A.
> > > > >
> > > > > Then, you protect pool, and a copy of the de-duped data is sent to the
> > > > > protect pool storage, also on server A.
> > > > >
> > > > > Then, you replnode, and a node is replicated to server B.  You are
> > > > > surprised to find that data is being sent from server A to server B.
> > > > >
> > > > > Do I have this right so far?
> > > > >
> > > > > Until you issue the replnode command, how is server B supposed to know
> > > > > about the data in server A's storage pools?
> > > > >
> > > > > Don't you still need to copy the data at least once from server A to
> > > > > server B?  Isn't this normal?
> > > > >
> > > > > Best regards,
> > > > >
> > > > > Mike, x7942
> > > > > RMD IT Client Services
> > > > >
> > > > > On Thu, Sep 15, 2016 at 12:02 PM, Stefan Folkerts <stefan.folkerts AT gmail DOT com>
> > > > > wrote:
> > > > >
> > > > > > Do you have a fast Spectrum Protect database / active log?
> > > > > > We run 2.4TB of metadata per hour with replication (note, this is not
> > > > > > actual data, this is metadata representing 2.4TB of data).
> > > > > > But that system has SSDs and runs in excess of 140,000 IOPS in
> > > > > > Spectrum Protect database benchmarks.
> > > > > > I would think this is very much database (and active log) performance
> > > > > > bound (on both sides).
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, Sep 15, 2016 at 5:50 PM, Nixon, Charles D. (David) <cdnixon AT carilionclinic DOT org>
> > > > > > wrote:
> > > > > >
> > > > > > > Best I can tell, it is transferring the data over the wire and support
> > > > > > > stated as much.  We are currently using the replnode for that single
> > > > > > > node so it's getting the default of 10 sessions and appears to be using
> > > > > > > all of them, for the four hours or so that it's sending data.
> > > > > > >
> > > > > > > I don't have a good way to see the server's bandwidth, but the network IO
> > > > > > > chart implies that it's not sending a great amount of data, though that
> > > > > > > may be due to the 846GB being spread over 4.5 hours.
> > > > > > >
> > > > > > > 09/15/16   10:58:08      ANR0327I Replication of node NODENAME completed.
> > > > > > >                           Files current: 70,341. Files replicated: 752 of 752.
> > > > > > >                           Files updated: 602 of 602. Files deleted: 692 of 692.
> > > > > > >                           Amount replicated: 12,487 GB of 12,487 GB.
> > > > > > >                           Amount transferred: 846 GB.
> > > > > > >                           Elapsed time: 0 Days, 4 Hours, 28 Minutes.
> > > > > > >                           (SESSION: 414242, PROCESS: 539)
> > > > > > > ---------------------------------------------------
> > > > > > > David Nixon
> > > > > > > Storage Engineer II
> > > > > > > Technology Services Group
> > > > > > > Carilion Clinic
> > > > > > > 451 Kimball Ave.
> > > > > > > Roanoke, VA 24015
> > > > > > > Phone: 540-224-3903
> > > > > > > cdnixon AT carilionclinic DOT org
> > > > > > >
> > > > > > > Our mission: Improve the health of the communities we serve.
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > ________________________________________
> > > > > > > From: ADSM: Dist Stor Manager [ADSM-L AT VM.MARIST DOT EDU] on behalf of
> > > > > > > Stefan Folkerts [stefan.folkerts AT GMAIL DOT COM]
> > > > > > > Sent: Thursday, September 15, 2016 10:45 AM
> > > > > > > To: ADSM-L AT VM.MARIST DOT EDU
> > > > > > > Subject: Re: [ADSM-L] protect pool plus replicate node equals poor
> > > > > > > replication efficiencies
> > > > > > >
> > > > > > > >Support confirmed that the amount of data replicated in a replnode
> > > > > > > > command is the same, regardless of the protect pool command status.
> > > > > > >
> > > > > > > I think that this is only in the statistics, not in the actual transfer
> > > > > > > on the wire.
> > > > > > > The replnode should not transmit actual data if the data was sent by the
> > > > > > > protect storagepool command.
> > > > > > > Are you running enough (but not too many) parallel processes for the
> > > > > > > replicate node command so it can perform optimally?
> > > > > > >
> > > > > > > I'm using this setup for multiple customers and it has worked fine for us
> > > > > > > so far.
> > > > > > >
> > > > > > > http://imgur.com/a/mT6ux
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Sep 15, 2016 at 4:29 PM, Nixon, Charles D. (David) <cdnixon AT carilionclinic DOT org>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > We opened a ticket related to long replication times in a container pool
> > > > > > > > after replication takes place, and got an answer that 'we can recreate
> > > > > > > > your problem but it is likely working as designed' even though it's
> > > > > > > > contrary to documentation.  Any ideas would be appreciated.
> > > > > > > >
> > > > > > > > -Two TSM servers at 7.1.5
> > > > > > > > -Single client going to a single container.  Client backs up 12TB a
> > > > > > > > night and after dedupe/compression, we see a 1TB change rate
> > > > > > > > (approximately).
> > > > > > > > -Once the backup is complete, we run a protect pool.  It's expected that
> > > > > > > > this process will ship 1TB to the DR site.
> > > > > > > > -Protect completes successfully.
> > > > > > > > -a replnode is issued against the node and TSM spends the next 4 hours
> > > > > > > > replicating data to the DR site
> > > > > > > >
> > > > > > > > Support confirmed that the amount of data replicated in a replnode
> > > > > > > > command is the same, regardless of the protect pool command status.
> > > > > > > > However, the documentation leads me to believe that if you have already
> > > > > > > > protected the pool, the replnode should be a metadata only transfer.
> > > > > > > >
> > > > > > > > So while we are able to transfer and complete the processes, it seems to
> > > > > > > > 'cost' us quite a bit in both IO and WAN usage to do so using containers,
> > > > > > > > defeating the point of using containers to reduce replication costs.
> > > > > > > > Any ideas as to what is going on?
> > > > > > > >
> > > > > > > > ---------------------------------------------------
> > > > > > > > David Nixon
> > > > > > > > Storage Engineer II
> > > > > > > > Technology Services Group
> > > > > > > > Carilion Clinic
> > > > > > > > 451 Kimball Ave.
> > > > > > > > Roanoke, VA 24015
> > > > > > > > Phone: 540-224-3903
> > > > > > > > cdnixon AT carilionclinic DOT org
> > > > > > > >
> > > > > > > > Our mission: Improve the health of the communities we serve.
> > > > > > > >
> > > > > > > > ________________________________
> > > > > > > >
> > > > > > > > Notice: The information and attachment(s) contained in this communication
> > > > > > > > are intended for the addressee only, and may be confidential and/or
> > > > > > > > legally privileged. If you have received this communication in error,
> > > > > > > > please contact the sender immediately, and delete this communication from
> > > > > > > > any computer or network system. Any interception, review, printing,
> > > > > > > > copying, re-transmission, dissemination, or other use of, or taking of
> > > > > > > > any action upon this information by persons or entities other than the
> > > > > > > > intended recipient is strictly prohibited by law and may subject them to
> > > > > > > > criminal or civil liability. Carilion Clinic shall not be liable for the
> > > > > > > > improper and/or incomplete transmission of the information contained in
> > > > > > > > this communication or for any delay in its receipt.
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
>
> --
>
> Best regards,
>
> Mike <http://rbbuswiki.bbg.roche.com/wiki/ryderm_page:start>, x7942
> RMD IT Client Services
> <http://na.intranet.roche.com/sites/RMD/content/Departments/IT/Pages/default.aspx>
>
