[Veritas-bu] NDMP backup delays - solved!

2002-09-10 09:25:52
From: jlkennedy AT amcc DOT com (Jeff Kennedy)
Date: Tue, 10 Sep 2002 06:25:52 -0700
Yep, this has been a serious problem for me in the past; sorry I didn't
catch it in your original post.  One weekend I found my filer backups
had been hanging for 30+ hours.  I ran iostat on the master (Solaris)
and found that the internal drive was 100% busy.  It took me a bit to
track down, but the culprit was /usr/openv/netbackup/BPFSMAP_TMPDIR,
which is where the inode maps are created and then checked.  Once I
found the problem I linked that directory to a RAID 1+0 array, and the
backups dropped from 30+ hours to just under 16.  With 4.5 I have seen
the time drop even more.
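
For anyone wanting to try the same workaround, here is a minimal Python
sketch of relocating a directory onto faster storage and leaving a
symlink at the old path.  The function name and the "raid10" target are
hypothetical, the demo only touches a scratch directory, and on a real
master server NetBackup would presumably need to be idle while
BPFSMAP_TMPDIR is moved.  This is not a Veritas-supplied procedure.

```python
import os
import shutil
import tempfile


def relocate_with_symlink(src_dir: str, dest_parent: str) -> str:
    """Move src_dir under dest_parent and leave a symlink at the old path.

    Returns the new real location of the directory.
    """
    src_dir = os.path.normpath(src_dir)
    if os.path.islink(src_dir):
        raise ValueError(f"{src_dir} is already a symlink")
    if not os.path.isdir(src_dir):
        raise ValueError(f"{src_dir} is not a directory")

    new_location = os.path.join(dest_parent, os.path.basename(src_dir))
    if os.path.exists(new_location):
        raise FileExistsError(new_location)

    # shutil.move falls back to copy-and-delete when the destination is
    # on a different filesystem and a plain rename is not possible.
    shutil.move(src_dir, new_location)
    # The old path keeps working because it now points at the new location.
    os.symlink(new_location, src_dir)
    return new_location


if __name__ == "__main__":
    # Demo in scratch directories so nothing real is touched.
    with tempfile.TemporaryDirectory() as root:
        old = os.path.join(root, "netbackup", "BPFSMAP_TMPDIR")
        fast = os.path.join(root, "raid10")  # hypothetical faster array
        os.makedirs(old)
        os.makedirs(fast)
        with open(os.path.join(old, "map.tmp"), "w") as handle:
            handle.write("scratch")
        print(relocate_with_symlink(old, fast))
        print(os.path.islink(old), os.listdir(old))
```

The checks before the move are there so a second run does not stack a
symlink on top of a symlink or overwrite an existing directory.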

~JK

Moshe Linzer wrote:
> 
> You guys are right.  I suspected the catalog processing, but couldn't
> be sure.  An email from Karen Shoenbauer of Veritas clued me in to
> checking the bptm logs.  There I saw that after the backup completed
> for the first volume,  NBU kicks off ndmp_bpfsmap_create, which runs
> for 17 hours!  Only after this completes does the backup of the next
> volume begin.
> Apparently 4.5 eliminates this catalog process (as Steve mentions), so
> it should speed up our total backup time significantly!
> 
> As far as the dump filesystem walking goes, this has been improved
> with subsequent releases of Netapp software, and is now much quicker
> than it was.
> 
> Moshe
> 
> Steve Kappel wrote:
> 
> > There are two things going on here.  One is the catalog post-
> > processing that NetBackup NDMP must do to convert the inode-based
> > information provided by the NetApp into the native path-based
> > catalog.  This overhead depends on the number of objects in
> > the backup.  Starting with NetBackup 4.5 there is no longer any
> > catalog post-processing so this overhead is eliminated.
> > Secondly, dump walks the filesystem before it starts sending
> > backup data.  A quick google search comes up with this
> > short overview of dump:
> > http://www.usenix.org/publications/library/proceedings/osdi99/full_papers/hutchinson/hutchinson_html/node6.html
> > Jeff Kennedy wrote:
> >
> >> I believe the problem is *before* the dump even begins.  I have seen
> >> this problem on volumes that have a tremendous number of small files
> >> or are terribly fragmented.  But I've never seen the problem just
> >> start out of the blue; it's usually a gradual process that gets
> >> worse with time.
> >> ~JK
> >> John D Stephens wrote:
> >>
> >> > Moshe -
> >> > I have seen huge delays between volumes as well.  What is going
> >> > on is this.  After NBU starts the NDMP backup, it steps out
> >> > of the way and idles until the filer signals NBU that the dump
> >> > has completed.  Then the filer sends all the meta data to NBU
> >> > and NBU then catalogs that data.  Depending on the size of your
> >> > backup and the number of files, it takes a little while for NBU
> >> > to chug through the catalogs.  Hopefully, 4.5 is faster at this.
> >> > Check your iostat -x and look for the activity on your catalogs
> >> > after the filer finishes sending the NDMP job, but NBU still
> >> > shows it to be active.  The only way around this is to use
> >> > multiple streams.  Be careful: multiple streaming does use the
> >> > filer's CPU more than normal.
> >> > Hope this helps.
> >> > John
> >> > Moshe Linzer wrote:
> >> >
> >> >>  Lately we have started seeing delays between volumes when backing
> >> >>  up our Netapp filer.  I have a single AIT-2 tape attached to my
> >> >>  F760, and I run a full backup of the filer, which contains 3
> >> >>  volumes.  The performance on the first volume is as expected, but
> >> >>  then there is a delay of sometimes up to 12 hours before the
> >> >>  second volume begins to be backed up!  The same thing happens
> >> >>  before the third volume.  Volume sizes are between 375 and 480GB.
> >> >>  I think this began after we applied some patches lately, but the
> >> >>  patch descriptions don't seem to have any bearing on this
> >> >>  problem.  Has anyone seen this type of behaviour?
> >> >>  Thanks,
> >> >>  Moshe
> >> >>  _______________________________________________
> >> >>  Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
> >> >>  http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
> >> >>
> >> > --
> >> >   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
> >> >  + John D Stephens    ITS Design Systems   +
> >> > + Texas Instruments  12500 TI BLVD, Dallas  +
> >> >  + jstephens AT ti DOT com     214-480-6229       +
> >> >   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
> >> >
> >> --
> >> =====================
> >> Jeff Kennedy
> >> Unix Administrator
> >> AMCC
> >> jlkennedy AT amcc DOT com
> >>
> > __________________________________________________________________________
> > Steve Kappel                       steve.kappel AT veritas DOT com
> > VERITAS Software (Engineering)
> >
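
Steve's description of the pre-4.5 post-processing, turning inode-based
information into path-based catalog entries, can be illustrated with a
small Python sketch.  The table layout (inode mapped to parent inode and
entry name) and the function name build_paths are assumptions made for
illustration only; they are not NetBackup's or NetApp's actual format or
code.  The sketch shows why the work grows with the number of objects:
every entry has to be resolved through its chain of parent directories.

```python
from typing import Dict, Optional, Tuple

# Hypothetical layout: inode -> (parent inode, entry name).
# The root directory has parent None and an empty name.
InodeTable = Dict[int, Tuple[Optional[int], str]]


def build_paths(entries: InodeTable) -> Dict[int, str]:
    """Resolve every inode to a full path by following parent links.

    Assumes the table is complete and contains no cycles.
    """
    paths: Dict[int, str] = {}
    for inode in entries:
        # Walk upward until we reach the root or an already resolved inode.
        chain = []
        current: Optional[int] = inode
        while current is not None and current not in paths:
            parent, name = entries[current]
            chain.append((current, name))
            current = parent
        prefix = paths[current] if current is not None else ""
        # Unwind from the top of the chain down, recording each path once.
        for node, name in reversed(chain):
            prefix = (prefix.rstrip("/") + "/" + name) if name else "/"
            paths[node] = prefix
    return paths


if __name__ == "__main__":
    # Made-up sample data, not taken from a real filer.
    sample: InodeTable = {
        12: (11, "notes.txt"),
        11: (10, "home"),
        10: (2, "vol"),
        13: (10, "tmp"),
        2: (None, ""),
    }
    for inode, path in sorted(build_paths(sample).items()):
        print(inode, path)
```

Caching resolved directories in paths keeps each inode from being walked
more than once, but the table still has to be read and written in full,
which fits the disk-bound behaviour described above.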

-- 
=====================
Jeff Kennedy
Unix Administrator
AMCC
jlkennedy AT amcc DOT com
