[no subject]


All,

And yet another installment from me today.  We are running MS Cluster
Server on our node that was the subject of my previous postings.  Here's
the deal...  We have two servers, Krypton and Vulcan, both NT Server 4.0.
They function as a cluster server with our Clarion array.  Our virtual
server's name is DUNE.  When all is well, Krypton will "own" the R: drive
and Vulcan will "own" S:, T:, and X:.  The drives R, S, T, and X are what
make up our cluster server, DUNE.  When either Krypton or Vulcan goes
down, the drives owned by that machine will fail over to the other
machine, resulting in minimal downtime for our highly valued users.
Therein lies our problem.  Since we effectively have no idea who will own
what drive on any given day, we are running into scheduling problems with
ADSM.  I have included some examples of files for your viewing enjoyment.

Krypton.txt  = a snip of the dsmsched.log on Krypton (the one pertaining
               to the cluster).
Vulcan.txt   = a snip of the dsmsched.log on Vulcan (ditto).
Clusterk.opt = the options file on Krypton pertaining to the cluster.
Clusterv.opt = the options file on Vulcan pertaining to the cluster.
actlog.txt   = activity log from our ADSM server during the time in
               question, pertaining to the cluster.

I have two scheduler services installed on each of Krypton and Vulcan:
one for the local machine, and one for DUNE.  Each scheduler uses its own
options file and schedlog.  I can run a manual incremental on either of
these machines with no problem (not including the performance issues).

It is a given that one of these machines, Krypton or Vulcan, will hit our
ADSM server first.  In the example I have included, all drives were on
Krypton at the time of the backup.  Vulcan hit the server first, reported
that the drives were invalid, as they were not "owned" by Vulcan, and
failed the job.  Six minutes later, Krypton came along and tried to do
its backup.  It was told rather rudely that "Either the window has
elapsed or the schedule has been deleted".  This session is well within
the time allowed, and they are two of the first machines to hit the
server each night.  I have them start at 17:55 as opposed to 18:00 to
ensure this.  I also have maxscheduledsessions set way above what will
ever hit the server at once.

It seems that since one of the machines has already hit the server and
completed, when the next machine tries to come in with the same node
name, ADSM thinks it has already backed up for the night and won't let it
do any more.  From what I have read, this is installed according to IBM's
specs for working with clusters.  Yet this is not a solid solution, as I
am not guaranteed a slot for each machine to back up whichever drives it
owns.
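
The suspected behavior can be sketched as a toy model in Python.  This is
not ADSM code and says nothing about how the server is really
implemented; ToyScheduler, contact(), simulate_night(), and the result
strings are all hypothetical names made up for illustration.  The only
assumption modeled is the guess above: the server keeps one scheduled
event per node name per window, and the first machine to check in claims
it whether or not its backup succeeds.

```python
from dataclasses import dataclass, field

# Drives that make up the virtual server DUNE (from the description above).
CLUSTER_DRIVES = ["R:", "S:", "T:", "X:"]


@dataclass
class ToyScheduler:
    """Hypothetical model: one scheduled event per node name per window."""
    claimed: set = field(default_factory=set)

    def contact(self, node_name, owned_drives):
        # A second check-in under the same node name finds the event used up.
        if node_name in self.claimed:
            return "refused: window elapsed or schedule deleted"
        # The first check-in claims the event, even if the backup then fails.
        self.claimed.add(node_name)
        missing = [d for d in CLUSTER_DRIVES if d not in owned_drives]
        if missing:
            return "failed: invalid drives " + " ".join(missing)
        return "completed"


def simulate_night():
    """Replay the night in question: Krypton owns every drive, Vulcan is first."""
    server = ToyScheduler()
    return [
        ("Vulcan", server.contact("DUNE", owned_drives=set())),
        ("Krypton", server.contact("DUNE", owned_drives=set(CLUSTER_DRIVES))),
    ]


if __name__ == "__main__":
    for machine, outcome in simulate_night():
        print(machine, "->", outcome)
```

Under that assumption, the machine that owns no drives uses up the only
slot for DUNE, and the machine that actually holds the drives is turned
away, which matches what the logs show.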

Does anyone successfully back up clusters?  If you do, how can I?  ANY
ideas or hints would be helpful.  Thanks for listening.

Sincerely,

Sean M. Stecker
stecker.sean AT orbital-lsg DOT com

(See attached file: actlog.txt)
(See attached file: clusterk.opt)
(See attached file: Krypton.txt)
(See attached file: clusterv.opt)
(See attached file: Vulcan.txt)