
Re: [ADSM-L] comm/idle/resource timeout values - take 2

2011-10-27 15:27:17
From: Richard Rhodes <rrhodes AT FIRSTENERGYCORP DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Thu, 27 Oct 2011 15:20:41 -0400
We are:
   all tsm servers are v5.5.2
   storage agents are v5.5.1 and v5.4.1 (being upgraded to v5.5.1)

We do not have any NAS/NDMP backups.

We found one possible cause of the reservation conflicts: one of the TSM
instances had a device class with "mount wait 0", while every other instance
has "mount wait 1".  I hate to think how long it has been that way, but no one
noticed the excessive mounts until two days ago!  IBM indicated that TSM
instances with different mount wait settings could cause SCSI reservation
conflicts.
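
A mismatch like this is easy to miss when each instance is checked by hand.
Below is a minimal sketch of a consistency check. It assumes the mount wait
value for each instance has already been collected into a dictionary; the
instance names are hypothetical, and the script does not query TSM itself.

```python
from collections import defaultdict

def find_outliers(settings):
    """Group instances by setting value; return the groups outside the majority."""
    groups = defaultdict(list)
    for instance, value in settings.items():
        groups[value].append(instance)
    if len(groups) <= 1:
        return {}  # every instance agrees
    # The value shared by the most instances is treated as the expected one.
    majority = max(groups, key=lambda v: len(groups[v]))
    return {v: sorted(names) for v, names in groups.items() if v != majority}

# Hypothetical instance names; values are mount wait in minutes.
mount_wait = {"tsm1": 1, "tsm2": 1, "tsm3": 0, "tsm4": 1}
print(find_outliers(mount_wait))
```

Treating the most common value as the expected one is only a convenience for
spotting the odd instance out; it says nothing about which value is correct.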


Thanks!

Rick




From:   Remco Post <r.post AT PLCS DOT NL>
To:     ADSM-L AT VM.MARIST DOT EDU
Date:   10/27/2011 02:50 PM
Subject:        Re: comm/idle/resource timeout values - take 2
Sent by:        "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>



Hi Richard,

Which version of TSM are you running? In some version (6.3 ???) of TSM the
SCSI reservation method changed. So if you mix various levels, or if you
don't set the SCSI reservation key correctly, you might find yourself in
trouble.

Also, there is a bug in TSM 5.5.2 and lower for NDMP where the NAS filer
might report SCSI reservation conflicts because TSM doesn't track the NDMP
session properly.

Also, I've seen SCSI reservation conflicts reported when the server-to-server
communication from (IIRC) the library manager (LM) to the library client (LC)
doesn't work. Check if you can route commands in both directions properly...


On 27 Oct 2011, at 14:58, Richard Rhodes wrote:

> (I had the values for commtimeout and idletimeout backwards! Fixed
> below.)
>
> Hi Everyone,
>
> In working with support on a couple of issues we've realized that we have
> different values for commtimeout, idletimeout, and resourcetimeout.
>
> We have:   2 dedicated library manager instances
>           7 tsm instances for BA client file backups
>           2 tsm instances for BIG LanFree Oracle backups (tdpo/lanfree,
>             db's > 1TB; all the big lanfree nodes are in these instances)
>          32 nodes with tdpo/lanfree setups
>
> All instances share the same tape drives via the dedicated library managers.
>
> The dedicated library managers, TSM instances for big lanfree nodes,
> and the storage agents are all defined with the following parms:
>  commtimeout       240    (fixed)
>  idletimeout     14400    (fixed)
>  resourcetimeout    60
>
> The seven tsm instances for normal BA client backups (these tsm servers
> include the problem-child Windows nodes with millions of small files)
> have the following parms:
>  commtimeout      150     (fixed)
>  idletimeout     3600     (fixed)
>  resourcetimeout   60
>
>
> IBM support indicated that ALL instances in this environment should use
> the same values for these parms. If they are not the same, it can be a
> cause of one of the problems we are fighting (scsi reservation errors).
> I'm not sure if the values above are good/bad/ugly, or what values
> should be used.  I'm not finding many specific recommendations.
>
> Any suggestions would be greatly appreciated!
>
> Rick
>
>
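
To see at a glance which of the three options differ between the two groups
of servers quoted above, a minimal sketch along these lines may help. The
values are taken from the quoted message; the group labels are shorthand
invented for this example, not TSM names, and nothing here talks to a TSM
server.

```python
# Values as quoted in the message; group labels are illustrative shorthand.
options = {
    "libmgr/lanfree/stgagent": {
        "commtimeout": 240, "idletimeout": 14400, "resourcetimeout": 60,
    },
    "ba-client instances": {
        "commtimeout": 150, "idletimeout": 3600, "resourcetimeout": 60,
    },
}

def differing_options(options):
    """Return {option: {group: value}} for options that are not identical everywhere."""
    names = set()
    for opts in options.values():
        names.update(opts)
    diffs = {}
    for name in sorted(names):
        values = {group: opts.get(name) for group, opts in options.items()}
        if len(set(values.values())) > 1:
            diffs[name] = values
    return diffs

for name, values in differing_options(options).items():
    print(name, values)
```

With the quoted values, commtimeout and idletimeout are reported as differing
and resourcetimeout is not. The sketch only flags the differences; it does not
suggest which values should be used.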

--
Met vriendelijke groeten/Kind Regards,

Remco Post
r.post AT plcs DOT nl
+31 6 248 21 622
