[no subject]


All,

And yet another installment from me today.  We are running MS Cluster Server on
our node that was the subject of my previous postings.  Here's the deal...
We have two servers, Krypton and Vulcan, both running NT Server 4.0.  They function
as a cluster with our Clarion array.  Our virtual server's name is DUNE.
When all is well, Krypton will "own" the R: drive and Vulcan will "own" S:, T:,
and X:.  Drives R, S, T, and X together make up our cluster server, DUNE.
When either Krypton or Vulcan goes down,  the drives that are owned by that
machine will fail over to the other machine, resulting in minimal downtime for
our highly valued users.  Therein lies our problem.  Since we effectively have
no idea who will own what drive on any given day, we are running into scheduling
problems with ADSM. I have included some examples of files for your viewing
enjoyment.

Krypton.txt = a snip of the dsmsched.log on Krypton.  (The one pertaining to the
cluster)
Vulcan.txt = a snip of the dsmsched.log on Vulcan. (ditto)
Clusterk.opt = the options file on Krypton pertaining to the cluster.
Clusterv.opt = the options file on Vulcan pertaining to the cluster.
actlog.txt = activity log from our ADSM server during the time in question
pertaining to the cluster.

I have two scheduler services installed on each of Krypton and Vulcan: one for the
local machine, and one for DUNE.  Each scheduler uses its own options file and
schedlog.  I can run a manual incremental on either of these machines with no
problem (not including the performance issues).

It is a given that one of these machines, Krypton or Vulcan, will hit our ADSM
server first.  In the example I have included, all drives were on Krypton at
the time of the backup.  Vulcan hit the server first, reported that the drives
were invalid, as they were not "owned" by Vulcan, and failed the job.  Six minutes
later, Krypton came along and tried to do its backup.  It was told rather
rudely that "Either the window has elapsed or the schedule has been deleted."
This session was well within the time allowed, and these are two of the first
machines to hit the server each night.  I have them start at 17:55 as opposed to
18:00 to ensure this.  I also have maxscheduledsessions set way above what
will ever hit the server at once.  It seems that since one of the machines has
already hit the server and completed, when the next machine tries to come
in with the same node name, ADSM thinks it has already backed up for the night
and won't let it do any more.  From what I have read, this is installed according
to IBM's specs for working with clusters.  Yet this is not a solid solution,
as I am not guaranteed a slot for each machine to back up whichever drives it
owns.
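
The sequence described above can be sketched as a toy model in Python.  This is
only an illustration of the behaviour as observed, not ADSM's actual scheduling
logic.  It assumes that the server tracks one scheduled event per node name per
night, and that the cluster node name is DUNE; the function name and the result
labels are made up for the example.

```python
# Toy model (hypothetical): the first machine to contact the server under a
# given node name uses up that night's scheduled event, whether or not its
# backup succeeds.

CLUSTER_DRIVES = ["R:", "S:", "T:", "X:"]


def simulate_night(contact_order, ownership, node_name="DUNE"):
    """Return a per-machine result for one night's scheduled backup.

    contact_order: machine names in the order they reach the server.
    ownership: dict mapping machine name to the cluster drives it owns.
    node_name: assumed shared node name used by both cluster schedulers.
    """
    consumed = set()  # node names whose event has already been used tonight
    results = {}
    for machine in contact_order:
        if node_name in consumed:
            # Matches the "window has elapsed or schedule deleted" symptom.
            results[machine] = "rejected"
            continue
        consumed.add(node_name)
        owned = ownership.get(machine, [])
        missing = [d for d in CLUSTER_DRIVES if d not in owned]
        # Any drive this machine does not own is reported invalid.
        results[machine] = "failed" if missing else "completed"
    return results


# The situation from the attached logs: all drives were on Krypton,
# but Vulcan reached the server first.
ownership = {"Krypton": ["R:", "S:", "T:", "X:"], "Vulcan": []}
outcome = simulate_night(["Vulcan", "Krypton"], ownership)
print(outcome)  # {'Vulcan': 'failed', 'Krypton': 'rejected'}
```

In this model the backup only succeeds when the machine that owns every drive
also happens to contact the server first, which is the ordering problem
described above.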

Does anyone successfully back up clusters?  If you do, how can I?  ANY ideas or
hints would be helpful.  Thanks for listening.

Sincerely,

Sean M. Stecker
stecker.sean AT orbital-lsg DOT com

(See attached file: actlog.txt)(See attached file: clusterk.opt)(See attached
file: Krypton.txt)(See attached file: clusterv.opt)(See attached file:
Vulcan.txt)