ADSM-L

Re: MAXNUMMP

2006-09-15 09:31:43
Subject: Re: MAXNUMMP
From: Robin Sharpe <Robin_Sharpe AT BERLEX DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Fri, 15 Sep 2006 09:29:42 -0400
Well, I'd say that seems to be what is causing your intermittent failures,
then.  Unfortunately, there is no "magic bullet" fix for this situation;
it requires cooperation among all the admins involved (TSM, DBA, Unix,
applications), and the TSM admin has the responsibility to educate all
parties about the interactions.  For example, the DBAs must be made aware
that if they set their parallelism (is that the right term?) higher than
MAXNUMMP, some channels will not be able to get a mount point, and RMAN
jobs will fail.

Do NOT set MAXNUMMP higher than the number of installed drives; that
will almost guarantee failures.  If you set it equal to the number of
installed drives, then all of those drives must be available for that node
when it wants them, or there will be failures.  That requires coordinated
scheduling.  The approach I would take is to set MAXNUMMP only as high as
that client needs to get its backup done in the time allotted.  If a
particular node MUST back up 100GB in ten minutes (as an absurd example), it
will need several drives, but if it has four hours to complete its
backup, then one drive is plenty.
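
To put rough numbers on that sizing rule, here is a minimal Python sketch.
It is only back-of-the-envelope arithmetic: the function name is made up
for illustration, and the 30 MB/sec sustained rate per drive is an assumed
figure, so substitute whatever your own drives actually deliver.

```python
import math


def drives_needed(backup_gb: float, window_hours: float,
                  drive_mb_per_sec: float) -> int:
    """Estimate how many tape drives a node needs to finish in its window.

    drive_mb_per_sec is the sustained rate ONE drive delivers for this
    client; it is an assumption you have to measure, not a TSM setting.
    """
    if backup_gb <= 0 or window_hours <= 0 or drive_mb_per_sec <= 0:
        raise ValueError("all inputs must be positive")
    # GB that a single drive can write during the whole window.
    gb_per_drive = drive_mb_per_sec * 3600 * window_hours / 1024
    return max(1, math.ceil(backup_gb / gb_per_drive))


# The two cases from the text, at an ASSUMED 30 MB/sec per drive:
four_hour_window = drives_needed(100, 4, 30)        # relaxed window
ten_minute_window = drives_needed(100, 10 / 60, 30)  # the absurd example
print(four_hour_window, ten_minute_window)
```

If the result comes out higher than the number of installed drives, the
window cannot be met from tape alone, and raising MAXNUMMP will not fix it.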

Another approach, if you have it available, is to use an external scheduler
(such as Control-M) rather than the TSM scheduler.  Most enterprise-class
schedulers can manage the drives as a resource pool, and will only start a
backup that needs four drives if four drives are actually available.  This
is a labor-intensive approach (initially), and it still is not foolproof.
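
The resource-pool idea can be sketched in a few lines of Python.  This is
not how Control-M or any particular scheduler is implemented; DrivePool
and its methods are hypothetical names, and the only point is the
all-or-nothing check before a job is allowed to start.

```python
import threading


class DrivePool:
    """Toy model of a scheduler treating tape drives as a counted resource."""

    def __init__(self, total_drives: int):
        self._free = total_drives
        self._lock = threading.Lock()

    def try_acquire(self, needed: int) -> bool:
        """Reserve `needed` drives only if ALL of them are free right now."""
        with self._lock:
            if needed <= self._free:
                self._free -= needed
                return True
            return False  # job stays queued; nothing is partially reserved

    def release(self, count: int) -> None:
        """Return drives to the pool when a backup job ends."""
        with self._lock:
            self._free += count


pool = DrivePool(4)
first_job = pool.try_acquire(3)    # a three-drive backup starts
second_job = pool.try_acquire(4)   # a four-drive backup has to wait
pool.release(3)                    # first backup finishes
retry = pool.try_acquire(4)        # now the four-drive backup can start
print(first_job, second_job, retry)
```

A pool like this only counts the jobs the scheduler itself starts, so
anything else that grabs a drive is invisible to it.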

The approach we use is that ALL backups go to a disk pool initially.
Nothing goes directly to tape.  Disk pools do not use the MAXNUMMP value,
so you can run as many channels and sessions as your hardware/OS/TSM can
handle.  This also eliminates the shoe-shining problem with streaming tape
drives such as LTO and DLT (or at least postpones it).  However, this does
introduce another problem, at least for TDP Oracle clients: if the disk
pool fills up, TDPO will not go to the next pool in the hierarchy like the
BA client does; it will fail.  In practice this means keeping the
migration threshold low enough on those disk pools that there will
always be enough space available for TDPO.  Again, this requires some
careful analysis and cooperation between the TSM admin and the DBA team.
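
As a starting point for that analysis, the arithmetic might look like the
Python sketch below.  The function name and the sample sizes are made up
for illustration, and it assumes the largest TDPO backup has to fit in
whatever space is still free when migration kicks in.  It ignores how
fast migration drains the pool compared to how fast backups fill it, so
treat the result as an upper bound for the high migration threshold, not
a recommendation.

```python
import math


def max_highmig_percent(pool_gb: float, tdpo_backup_gb: float) -> int:
    """Highest migration threshold (percent full) that still leaves room
    for one TDPO backup of tdpo_backup_gb in a pool of pool_gb."""
    if pool_gb <= 0 or tdpo_backup_gb < 0:
        raise ValueError("pool size must be positive, backup size non-negative")
    if tdpo_backup_gb > pool_gb:
        raise ValueError("backup is larger than the whole disk pool")
    return math.floor(100 * (pool_gb - tdpo_backup_gb) / pool_gb)


# Hypothetical sizes: a 500 GB disk pool and a 150 GB Oracle backup.
threshold = max_highmig_percent(500, 150)
print(threshold)
```

In practice you would want some margin below that number, since other
clients are usually writing to the same pool at the same time.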

Sorry for the long-winded response, I was on a roll!  Hope it helps...

Robin Sharpe
Berlex Labs

"ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU> wrote on 09/14/2006
04:29:48 PM:

> > Is it possible that there are two or more sessions running for the TDP
> > client simultaneously?
>
> Yes, absolutely.  The Oracle DBAs have observed that this happens most
> frequently when there's media wait, in which case multiple log backups
> could be running simultaneously.
>
> How do others handle this?  Does it make sense to set MAXNUMMP to the
> number of drives...or even higher?  I remember being very unhappy when I
> set the value of MAXNUMMP to the number of drives, but I can't remember
> what happened.  Maybe I was running into some other problem.
>
> Thanks again for all the ideas,
> anker
