ADSM-L

Re: [ADSM-L] Sloooow deletion of objects on Replication target server

2017-07-26 15:42:01
From: Stefan Folkerts <stefan.folkerts AT GMAIL DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Wed, 26 Jul 2017 21:40:30 +0200
Yes, a 300GB archive log is tiny. That won't work for anything but the
smallest of environments; I believe a medium-sized server has a 2TB archive
log.
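
To put rough numbers on the sizing point, here is a minimal Python sketch
based on the figures in the quoted message further down (300GB archive log,
DB backup triggered at 80%, backup taking more than 5 hours). The 5-hour fill
time and the growth rate derived from it are assumptions for illustration
only, not measured values, and the function names are made up for this
example.

```python
def headroom_gb(log_size_gb, trigger_pct=80):
    """Space left between the DB backup trigger point and a full filesystem."""
    return log_size_gb * (100 - trigger_pct) / 100


def hours_until_full(log_size_gb, growth_gb_per_hour, trigger_pct=80):
    """Hours a backup may run after the trigger before the log hits 100%."""
    return headroom_gb(log_size_gb, trigger_pct) / growth_gb_per_hour


# ASSUMPTION: the last 20% of the 300GB log filled in about 5 hours.
ASSUMED_GROWTH_GB_PER_HOUR = headroom_gb(300) / 5

for size_gb in (300, 1000, 2000):
    hours = hours_until_full(size_gb, ASSUMED_GROWTH_GB_PER_HOUR)
    print(f"{size_gb:>5} GB log: {headroom_gb(size_gb):.0f} GB headroom, "
          f"about {hours:.1f} h before it fills")
```

Under that assumed growth rate, a larger log only buys time. If the backup
itself keeps getting slower, the extra headroom is eventually used up as well.
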
Database backups take a lot of extra time when reorgs and/or (for example)
dereference processes are running on 15K database disks; the system simply
doesn't have the time on the drives to create a speedy database backup
anymore.
Database backups achieve a more consistent and shorter duration when the
database is on SSDs, because there is so much performance headroom that
doing multiple things at once no longer bothers the system as much.
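
As a rough sanity check on the backup figures quoted further down (1.5TB DB
backup, 8+ hours, NFS over 10G), the following Python sketch estimates the
effective throughput. It assumes decimal units (1 TB = 1,000,000 MB) and
exactly 8 hours, so the result is an upper bound; the function names are
hypothetical.

```python
def throughput_mb_per_s(size_tb, hours):
    """Average throughput in MB/s for a backup of size_tb taking `hours`."""
    return size_tb * 1_000_000 / (hours * 3600)


def link_utilisation(mb_per_s, link_gbit=10):
    """Fraction of a link (in Gbit/s) used by a stream of mb_per_s MB/s."""
    return mb_per_s * 8 / (link_gbit * 1000)


rate = throughput_mb_per_s(1.5, 8)
print(f"average backup rate: {rate:.1f} MB/s")
print(f"share of a 10G link: {link_utilisation(rate):.1%}")
```

A stream that small uses only a few percent of a 10G link, which fits the
idea that the database disks, rather than the network, limit the backup.
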

It would surprise me a lot if reducing the memory in the server fixed
the problems. I've never seen anything like that with Spectrum Protect, but
I guess there is a first time for everything. :-)



On Wed, Jul 26, 2017 at 4:04 PM, Zoltan Forray <zforray AT vcu DOT edu> wrote:

> Another point of interest is the archlog filesystem.  We originally had it
> at 300GB but kept constantly overflowing & crashing since the DB backups
> that trigger at 80% wouldn't finish (>5-hours) before it reached 100%.  So
> we recently increased it to 1TB.  Now, the last DBbackup has been running
> for >24-hours and I have been sitting here watching the archlog filesystem
> %used go from 80% to now 38%.  It is taking a long, long time to empty it,
> even with nothing running but the DBBackup. With nothing but the DBBackup
> (and archlog flushing) running, the load average is still >25.
>
> I really think the additional memory is killing this box.  It was never
> this slow or overloaded before!
>
> On Wed, Jul 26, 2017 at 8:26 AM, Stefan Folkerts <
> stefan.folkerts AT gmail DOT com>
> wrote:
>
> > Oh, I just now read the 16 threads correctly, I was thinking you wrote 16
> > cores!
> > 8 cores is far below specification if you're running M-size blueprint
> > ingest figures.
> > I've seen 16-core Intel servers (2016-spec Xeon CPUs) go up to 70%
> > utilization, so that kind of load would never work on 8 cores, but again,
> > I don't know how much managed data you have and what your ingest figures
> > are.
> >
> >
> > On Wed, Jul 26, 2017 at 2:02 PM, Zoltan Forray <zforray AT vcu DOT edu> 
> > wrote:
> >
> > > I kinda feel the same way since my networking folks say it isn't the
> > > 10G links (Xymon shows peaks of 2Gb), even though at its peak processing
> > > load it would be handling 5 TSM servers sending replications across the
> > > same 10G links also used for the NFS.
> > >
> > > If the current processes ever finish (delete of 9M objects is now into
> > > 48 hours), I will let the server sit for a day or two to see if it
> > > improves.  I have noticed that even with the server idle (no processes
> > > or sessions), the CPU load average was still higher than the 16 threads
> > > available.  I am seriously thinking about going back to the original
> > > 96GB of RAM since it seems a lot of this slowdown started after bumping
> > > to 192GB.
> > >
> > > On Wed, Jul 26, 2017 at 3:16 AM, Stefan Folkerts <
> > > stefan.folkerts AT gmail DOT com>
> > > wrote:
> > >
> > > > Interesting, why would NFS be the problem if the deletion of objects
> > > > doesn't really touch the storage pools?
> > > >
> > > > I would wager that a straight-up dd on the system to create a large
> > > > file via 10Gb/s on NFS would be blazing fast, but the database backup
> > > > is slow because it's almost never idle; it's always behind on its
> > > > internal processes such as reorgs.
> > > >
> > > > place your bets! :-)
> > > >
> > > > http://www.strawpoll.me/13536369
> > > >
> > > >
> > > > On Mon, Jul 24, 2017 at 3:55 PM, Sasa Drnjevic <Sasa.Drnjevic AT srce DOT hr>
> > > > wrote:
> > > >
> > > > > Not sure of course...But, I would blame NFS
> > > > >
> > > > > Did you check the negotiated speed of your NFS eth 10G ifaces?
> > > > > And that network?
> > > > >
> > > > > Regards,
> > > > >
> > > > > --
> > > > > Sasa Drnjevic
> > > > > www.srce.unizg.hr
> > > > >
> > > > >
> > > > > On 24.7.2017. 15:49, Zoltan Forray wrote:
> > > > > > > 8-cores/16-threads.  It wasn't bad when it was replicating from
> > > > > > > 4-SP/TSM servers.  We had to stop all replication due to running
> > > > > > > out of space and until I finish this cleanup, I have been holding
> > > > > > > off replication.  So, the deletion has been running standalone.
> > > > > > >
> > > > > > > I forgot to mention that DB backups are also running very long.
> > > > > > > 1.5TB DB backup runs 8+hours to NFS storage.  These are connected
> > > > > > > via 10G.
> > > > > >
> > > > > > > On Mon, Jul 24, 2017 at 9:41 AM, Sasa Drnjevic <Sasa.Drnjevic AT srce DOT hr>
> > > > > > > wrote:
> > > > > >
> > > > > >> On 24.7.2017. 15:25, Zoltan Forray wrote:
> > > > > > >>> Due to lack of resources, we have had to stop replication on
> > > > > > >>> one of our SP servers. The replication target server is
> > > > > > >>> 7.1.6.3 RHEL 7, Dell T710 with 192GB RAM.  NFS/ISILON storage.
> > > > > > >>>
> > > > > > >>> After removing replication from the nodes on source server, I
> > > > > > >>> have been cleaning up the replication server by deleting the
> > > > > > >>> filespaces for the nodes we are no longer replicating.
> > > > > > >>>
> > > > > > >>> My issue is the delete filespaces on the replication server is
> > > > > > >>> taking forever.  It took over a week to delete one filespace
> > > > > > >>> with 31-million objects?
> > > > > >>
> > > > > >>
> > > > > >> That is definitely tooooo loooong :-(
> > > > > >>
> > > > > > >> It would take 6-8 hrs max, in my environment even under
> > > > > > >> "standard" load...
> > > > > >>
> > > > > >> How many CPU cores does it have?
> > > > > >>
> > > > > >> And how is/was it performing the role of a target repl. server
> > > > > >> performance wise?
> > > > > >>
> > > > > >> Regards,
> > > > > >>
> > > > > >> --
> > > > > >> Sasa Drnjevic
> > > > > >> www.srce.unizg.hr
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >>
> > > > > >>>
> > > > > > >>> To me it is highly unusual to take this long. Your thoughts on
> > > > > > >>> this?
> > > > > >>>
> > > > > >>> --
> > > > > >>> *Zoltan Forray*
> > > > > >>> Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> > > > > >>> Xymon Monitor Administrator
> > > > > >>> VMware Administrator
> > > > > >>> Virginia Commonwealth University
> > > > > >>> UCC/Office of Technology Services
> > > > > >>> www.ucc.vcu.edu
> > > > > >>> zforray AT vcu DOT edu - 804-828-4807
> > > > > > >>> Don't be a phishing victim - VCU and other reputable
> > > > > > >>> organizations will never use email to request that you reply
> > > > > > >>> with your password, social security number or confidential
> > > > > > >>> personal information. For more details visit
> > > > > > >>> http://infosecurity.vcu.edu/phishing.html
> > > > > >>>
> > > > > >>
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > *Zoltan Forray*
> > > > > > Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> > > > > > Xymon Monitor Administrator
> > > > > > VMware Administrator
> > > > > > Virginia Commonwealth University
> > > > > > UCC/Office of Technology Services
> > > > > > www.ucc.vcu.edu
> > > > > > zforray AT vcu DOT edu - 804-828-4807
> > > > > > Don't be a phishing victim - VCU and other reputable
> organizations
> > > will
> > > > > > never use email to request that you reply with your password,
> > social
> > > > > > security number or confidential personal information. For more
> > > details
> > > > > > visit http://infosecurity.vcu.edu/phishing.html
> > > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > *Zoltan Forray*
> > > Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> > > Xymon Monitor Administrator
> > > VMware Administrator
> > > Virginia Commonwealth University
> > > UCC/Office of Technology Services
> > > www.ucc.vcu.edu
> > > zforray AT vcu DOT edu - 804-828-4807
> > > Don't be a phishing victim - VCU and other reputable organizations will
> > > never use email to request that you reply with your password, social
> > > security number or confidential personal information. For more details
> > > visit http://infosecurity.vcu.edu/phishing.html
> > >
> >
>
