ADSM-L

Re: Problem with HSM dsmreconcile

2003-09-29 10:02:50
Subject: Re: Problem with HSM dsmreconcile
From: Richard Sims <rbs AT BU DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Mon, 29 Sep 2003 10:01:05 -0400
>I have the following problem with HSM client 5.1.5.15 on AIX.
>
>I start a dsmreconcile. This command first opens a connection to the TSM
>Server (5.1.6.2). Then it issues the message "Reading the current
>premigration data base". This obviously is on the local disk. I can see
>local I/O activity. Nothing is done with the server. Using "q se" I can see
>the status is idle and the wait time increases. After 20 minutes the session
>is cancelled because the idle timeout is set to 20 minutes in the
>dsmserv.opt.
>
>Some time later reading the premig db finishes. Dsmreconcile then tries to
>reconnect to the session and fails. Dsmreconcile then fails with ANS1005E
>error reading socket.
>
>Increasing the idle timeout would most likely help. However, I do not want
>to increase it indefinitely, because I want inactive sessions to be cancelled.
>As far as I know, normal backup sessions can reconnect in such a case. Why
>doesn't dsmreconcile do this?
>
>Should I open a PMR or is this working as designed?

Gerhard - You'll need to determine if the amount of time taken by this premig
          db reading is typical of each execution on your system, and adjust
server timeout values to accommodate it.  Due to the time needs of the various
clients I've dealt with, I use a server COMMTimeout value of 3600 (one hour).
Unless you have noted unproductive sessions chronically taking up session slots,
I would not be concerned about the increase.  In my opinion, all *SM clients
should be designed to employ a "heartbeat" talk-back to the server every so many
minutes, to avoid timeouts where the client session is actually viable, and it
is incumbent upon the client to judge its session continuance as viable.  (I
believe that something like this is in development.)  As you have found, not all
clients are that observant of session needs.  HSM has always been the "odd duck"
in the family, so unconventional behavior is common with it.

As to actual session times: It may be that your manually-started dsmreconcile
conflicted with an automatic dsmreconcile (per RECOncileinterval value, or in
concert with dsmautomig and high threshold processing), and hence yours was
blocked for a while.  Other HSM activity might also have gotten in the way.  The
number of files in the file system and activity will influence the volume of
data in that phase.  If your dsmreconcile is not doing logging, you may want to
rig a wrapper script, as I did, to assure capturing that data, and be able to
historically see what it has been doing, when.  I can provide such a script if
you need - write to me directly.
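
For illustration only, here is a minimal sketch of what such a logging wrapper
could look like. It is not the script offered above; the function name
run_logged and the log path are hypothetical. It runs whatever command it is
given and appends every output line to a log file with a timestamp, so the
duration of each phase can be read back later. It assumes nothing about
dsmreconcile options; the command and its arguments are passed through as-is.

```python
import subprocess
import sys
from datetime import datetime


def run_logged(cmd, log_path):
    """Run cmd, append its timestamped output to log_path, return its exit code.

    cmd is a list such as ["dsmreconcile", "/some/hsm/fs"] (hypothetical
    arguments; pass whatever you normally run).
    """
    with open(log_path, "a", encoding="utf-8") as log:

        def stamp(text):
            # Prefix each record with local time so phase durations are visible.
            now = datetime.now().isoformat(timespec="seconds")
            log.write(f"{now} {text}\n")
            log.flush()

        stamp("START " + " ".join(cmd))
        proc = subprocess.Popen(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # fold error messages into the same log
            text=True,
        )
        # Read line by line so timestamps reflect when each message appeared.
        for line in proc.stdout:
            stamp(line.rstrip("\n"))
        rc = proc.wait()
        stamp(f"END rc={rc}")
    return rc
```

A call such as run_logged(["dsmreconcile", "/some/hsm/fs"], "/tmp/reconcile.log")
(both paths are placeholders) would leave START and END records around the
command's own messages, which is enough to see how long the premigration
database read took on each run.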

  Richard Sims, BU
