ADSM-L

Subject: Re: NT Cluster issues
From: Nathan King <nathan.king AT USAA DOT COM>
Date: Tue, 9 Mar 1999 07:19:58 -0600
We do a similar thing.
Each machine has a standard ADSM service with its dsm.opt set to back up
the local C and D drives.
We have also installed a second ADSM service with its dsm.opt set to the
clustered drives, using the cluster alias to refer to them,
e.g. \\cluster\f$ \\cluster\g$.

We then made the second ADSM service cluster-aware (our resident cluster
guru did this part). The second ADSM service is set to manual startup on
each server; it is started on one and stopped on the other. In the event
of a failover, the second service restarts on the surviving machine and
restarts the backup.
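
As a sketch, a second scheduler service with its own options file can be
installed with the client's dsmcutil utility, something like this (the
service name, node name, password, and path are examples; it is all one
command line, and the exact flags depend on your client level):

    dsmcutil install /name:"ADSM Cluster Scheduler" /node:CLUSTER1
        /password:secret /optfile:c:\adsm\cluster\dsm.opt /autostart:no

The /autostart:no leaves the service set to manual startup, so the
cluster software rather than NT decides where it runs.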

This works quite well, except that our backup does take a long time,
although I think that is more down to the size of the drives and the raw
speed of ADSM on MVS (sarcasm intended).
Its only other fault is that if a failover occurs after the schedule
window has expired, the failed-over service will not restart the backup.

Regards,

Nathan



        -----Original Message-----
        From:   Michael Bartl [SMTP:michael.bartl AT ZENTRALE.ADAC DOT DE]
        Sent:   Tuesday, March 09, 1999 5:43 AM
        To:     ADSM-L AT VM.MARIST DOT EDU
        Subject:        Re: NT Cluster issues

        Sean,
        we're also testing with clustering. I tried to solve the problem
        this way (I'm sure it's not optimal, but it seems to work):

        On both machines the ADSM client is installed with the scheduler
        service running.
        Server1 and server2 each have a local C drive, and either one can
        own drives R, S, and T.

        On server1 I have a dsm.opt file with
        domain c: \\cluster\r$ \\cluster\s$ \\cluster\t$
        On server2 only
        domain c:

        So both C drives are backed up, and all the cluster drives too.
        When server1 owns a drive, its backup is a bit faster than when
        server2 has it.
        In testing I didn't find great differences when backing up over a
        TR16 (16 Mbit/s token-ring) network.
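
        Sketched out in full, the two option files might look something
        like this (the nodename and passwordaccess lines are my
        assumptions; the domain lines are the ones above):

            server1 dsm.opt:
                nodename        SERVER1
                passwordaccess  generate
                domain          c: \\cluster\r$ \\cluster\s$ \\cluster\t$

            server2 dsm.opt:
                nodename        SERVER2
                passwordaccess  generate
                domain          c:

        Because the cluster drives are addressed by their UNC alias, their
        filespace names stay the same no matter which machine backs them
        up.
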
        Be sure to use a domain user as the scheduler service account;
        otherwise you won't get the "network drives".
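
        The service account can be set when the scheduler service is
        installed with dsmcutil, something like this (the account, domain,
        and passwords are placeholders, it is all one command line, and
        the exact flags vary by client level; the Services control panel
        works as well):

            dsmcutil install /name:"ADSM Scheduler" /node:SERVER1
                /password:secret /optfile:c:\adsm\dsm.opt
                /ntaccount:adsmsched /ntdomain:OURDOMAIN /ntpassword:secret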

        Good luck,
        regards, Michael

        -----Original Message-----
        From: Sean Stecker [mailto:Stecker.Sean AT ORBITAL-LSG DOT COM]
        Sent: Tuesday, March 9, 1999 01:03
        To: ADSM-L AT VM.MARIST DOT EDU
        Subject: NT Cluster issues



        All,

        And yet another installment from me today.  We are running MS
        Cluster Server on the node that was the subject of my previous
        postings.  Here's the deal...
        We have two servers, Krypton and Vulcan, both NT Server 4.0.  They
        function as a cluster server with our Clarion array.  Our virtual
        server's name is DUNE.
        When all is well, Krypton will "own" the R: drive, and Vulcan will
        "own" S:, T:, and X:; the drives R, S, T, and X are what comprise
        our cluster server, DUNE.  When either Krypton or Vulcan goes
        down, the drives owned by that machine will fail over to the other
        machine, resulting in minimal downtime for our highly valued
        users.  Therein lies our problem.  Since we effectively have no
        idea who will own which drive on any given day, we are running
        into scheduling problems with ADSM.  I have included some examples
        of files for your viewing enjoyment.

        Krypton.txt  = a snip of the dsmsched.log on Krypton (the one
                       pertaining to the cluster)
        Vulcan.txt   = a snip of the dsmsched.log on Vulcan (ditto)
        Clusterk.opt = the options file on Krypton pertaining to the
                       cluster
        Clusterv.opt = the options file on Vulcan pertaining to the
                       cluster
        actlog.txt   = activity log from our ADSM server during the time
                       in question, pertaining to the cluster

        I have two scheduler services installed on each of Krypton and
        Vulcan: one for the local machine, and one for DUNE.  Each
        scheduler uses its own options file and schedlog.  I can run a
        manual incremental on either of these machines with no problem
        (not counting the performance issues).

        It is a given that one of these machines, Krypton or Vulcan, will
        hit our ADSM server first.  In the example I have included, all
        drives were on Krypton at the time of the backup.  Vulcan hit the
        server first, reported that the drives were invalid, as they were
        not "owned" by Vulcan, and failed the job.  Six minutes later,
        Krypton comes along and tries to do its backup.  It is told rather
        rudely that "Either the window has elapsed or the schedule has
        been deleted".  This session is well within the time allowed, and
        these are two of the first machines to hit the server each night;
        I have them start at 17:55 as opposed to 18:00 to ensure this.  I
        also have maxschedsessions set way above what will ever hit the
        server at once.  It seems that once one of the machines has hit
        the server and completed, when the next machine tries to come in
        with the same node name, ADSM thinks it has already backed up for
        the night and won't let it do any more.  From what I have read,
        this is installed according to IBM's specs for working with
        clusters.  Yet this is not a solid solution, as I am not
        guaranteed a slot for each machine to back up whichever drives it
        owns.
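
        For reference, the server-side definitions behind all this look
        roughly like the following (the domain name, schedule name, and
        password are placeholders; the 17:55 start time is ours, and the
        exact syntax may vary by server level):

            register node DUNE secret domain=NTCLUSTER
            define schedule NTCLUSTER NIGHTLY action=incremental
                starttime=17:55 duration=2 durunits=hours
            define association NTCLUSTER NIGHTLY DUNE

        With both cluster scheduler services contacting the server under
        the one node name DUNE, the first contact appears to mark the
        schedule as executed and lock the second machine out.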

        Does anyone successfully back up clusters?  If you do, how can I?
        ANY ideas or hints would be helpful.  Thanks for listening.

        Sincerely,

        Sean M. Stecker
        stecker.sean AT orbital-lsg DOT com

        (See attached file: actlog.txt) (See attached file: clusterk.opt)
        (See attached file: Krypton.txt) (See attached file: clusterv.opt)
        (See attached file: Vulcan.txt)