Thomas,
Yes, I have seen this also. We are using the latest version of the ADSM
Server, MVS v3.12.15.
Exactly the same symptoms, and no error messages. The scheduler appears to be
hung in memory; only recycling the ADSM server makes the problem go away.
I opened a problem with IBM but have not had any response yet. It appears to
be workload related. I spread the load among other ADSM servers and
altered my client schedules so that not too many of them start within a
short period. This has helped, but there's definitely a problem here.
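For anyone who wants to check their own schedules for this kind of bunching, here is a rough sketch in Python. It assumes the schedule start times have been collected by hand as HH:MM strings; it does not read anything from ADSM. The function name, the 15-minute window, and the limit of 3 starts are arbitrary illustrative choices, not values IBM recommends.

```python
from datetime import datetime, timedelta


def crowded_windows(start_times, window_minutes=15, max_starts=3):
    """Return (window_start, count) pairs for every start time that is
    followed by more than max_starts starts (itself included) within
    window_minutes. Times are HH:MM strings within a single day; windows
    that wrap past midnight are not handled in this sketch."""
    times = sorted(datetime.strptime(t, "%H:%M") for t in start_times)
    window = timedelta(minutes=window_minutes)
    crowded = []
    for i, first in enumerate(times):
        # Count the starts that fall inside the window opening at `first`.
        count = sum(1 for t in times[i:] if t - first < window)
        if count > max_starts:
            crowded.append((first.strftime("%H:%M"), count))
    return crowded


# Hypothetical start times, typed in by hand for illustration.
starts = ["01:00", "01:02", "01:05", "01:10", "03:00"]
print(crowded_windows(starts))
```

Any window it reports is a candidate for moving some clients to a later start time or to another server.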
Nathan
-----Original Message-----
From: Thomas Denier [SMTP:Thomas.Denier AT MAIL.TJU DOT EDU]
Sent: Friday, March 26, 1999 9:35 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Central scheduling failures
In the last few weeks my site has had two instances of the central scheduling
mechanism failing without evident cause. We have an MVS server at 3.1.2.1.
Both of the clients involved were at 3.1.0.6. One was an AIX system and the
other was an HP-UX 10.20 system. Both use TCP/IP communications. Both have
'schedmode prompted' in the dsm.sys file. A 'query status' command reports
that the server supports any scheduling mode. In each case the server log
showed a message reporting that a client event had missed its start-up window.
When I checked the client, the 'dsmc sched' process was still running in each
case. When I checked the dsmsched.log file, I found the following at the end of
the file in each case:
Messages reporting execution of the last successful event
Messages showing the results of querying the server for the next scheduled event
A message reporting that the scheduler process was waiting to be contacted by the server
All of the messages mentioned above had time stamps within a few seconds of
each other. In each case I stopped and restarted the scheduler process and
subsequent events were carried out on schedule. In the HP-UX case, I checked
the dotted decimal addresses used for client sessions before and after the
failed event. They were the same. In the HP-UX case, I updated the schedule,
creating a sequence of events like the following:
Successful event
Query for next event
Schedule change
Event created by schedule change (which failed)
Event reported in response to the query
I don't remember whether the AIX case involved a similar schedule change.
Does anyone recognize this as a known problem? Failing that, does anyone have
any suggestions for tracking down the cause?