ADSM-L

Subject: Re: Backup an NT Cluster
From: "France, Don (Pace)" <don.france-eds AT EDS DOT COM>
Date: Fri, 31 Dec 1999 13:49:28 -0600
HAS ANYONE GOTTEN THIS TO WORK?!?

I have recently been trying to do the same thing.  The readme files are
(mostly) clear on creating the services with dsmcutil, etc.  The problem is:
it still didn't work.  I have specified the IP-name of the cluster for the
/group: parm, and the /clusternode:yes parm, on the dsmcutil cmd for the
cluster services created on each node.  The b/a clients work fine for the
local drives on their associated nodes... it's one of the clusters that
hasn't worked (one did, one didn't;  the one that did worked *before* I
re-established the cluster node services specifying the /group: and
/clusternode: parms;  last night neither of them worked!)

TWO separate problems, the first severe/fatal:
1.  The owning node got an unexpected schedule failure indicating the
schedule had been deleted (not true) or window had elapsed (also not
true)... this is fatal, since the cluster did not get backed up, and can
only be backed up from 00:00-06:00 as it's a cold database backup.
2.  The non-owning node should not get any failure;  in fact, its service
should be stopped when the cluster fails over to the other node, so that
failure alerts are not sent.  As it stands, the failure is bad 'cuz it shows
up as a "failed" event, generating ops alerts.

What's puzzling is that the messages are flip-flopped for the other group
(we have two groups:  FTP and DB), as if one node owned FTP and the other
owned DB.  Regardless, FTP worked and DB failed *before* I did the fix per
APAR IC24196 (adding parms to the "dsmcutil install" command);  after doing
that "fix" I get one "failed" and one "missed".  So, it's been one step
forward and *two* steps back... negative progress!

We are running ADSM client 3.1.0.7 f2 (latest fixtest package), ADSM server
on AIX is 3.1.2.40... all other clients are fine.

The only thing I can think of to do now is to stop the service on the
non-owning node --- and hope the service on the owning node will "work"
tonight... of all nights to have the problem!!!

BTW, the messages appear bass-ackwards;  the owning node should run, the
non-owning node should get the "schedule deleted.." message (or, better,
have its service stopped!)... so, will try again... and will report on
Monday (unless I get called back in).

===========================================================
Associated logs follow...
===========================================================

The owning node (since this file was on the cluster-disk F:) got the
UNEXPECTED failure - ref. ADP1-FTPgroup.dsmsched.log file...

12/30/1999 20:05:45 --- SCHEDULEREC QUERY BEGIN
12/30/1999 20:05:45 --- SCHEDULEREC QUERY END
12/30/1999 20:05:45 Next operation scheduled:
12/30/1999 20:05:45
------------------------------------------------------------
12/30/1999 20:05:45 Schedule Name:         DAILY-NT-CLUSTER
12/30/1999 20:05:45 Action:                Incremental
12/30/1999 20:05:45 Objects:
12/30/1999 20:05:45 Options:
12/30/1999 20:05:45 Server Window Start:   00:00:00 on 12/31/1999
12/30/1999 20:05:45
------------------------------------------------------------
12/30/1999 20:05:45 Command will be executed in 4 hours and 10 minutes.
12/31/1999 00:53:34
Executing scheduled command now.
12/31/1999 00:53:34 Node Name: ADP1-FTPGROUP
12/31/1999 00:53:34 Session established with server GARGOYLE.CSAA.COM:
AIX-RS/6000
12/31/1999 00:53:34   Server Version 3, Release 1, Level 2.40
12/31/1999 00:53:34   Server date/time: 12/31/1999 00:58:52  Last access:
12/31/1999 00:26:37

12/31/1999 00:53:34 ANS1814E Unable to start scheduled event
'DAILY-NT-CLUSTER'
12/31/1999 00:53:34 ANS1815E Either the window has elapsed or the schedule
has been deleted
12/31/1999 00:53:34 ANS1483I Schedule log pruning started.
12/31/1999 00:53:34 Schedule Log Prune: 1110 lines processed.  0 lines
pruned.
12/31/1999 00:53:34 ANS1484I Schedule log pruning finished successfully.
12/31/1999 00:53:34 Querying server for next scheduled event.
12/31/1999 00:53:34 Node Name: ADP1-FTPGROUP
12/31/1999 00:53:34 Session established with server GARGOYLE.CSAA.COM:
AIX-RS/6000
12/31/1999 00:53:34   Server Version 3, Release 1, Level 2.40
12/31/1999 00:53:34   Server date/time: 12/31/1999 00:58:52  Last access:
12/31/1999 00:58:52

12/31/1999 00:53:34 --- SCHEDULEREC QUERY BEGIN
12/31/1999 00:53:34 --- SCHEDULEREC QUERY END
12/31/1999 00:53:34 Next operation scheduled:
12/31/1999 00:53:34
------------------------------------------------------------
12/31/1999 00:53:34 Schedule Name:         DAILY-NT-CLUSTER
12/31/1999 00:53:34 Action:                Incremental
12/31/1999 00:53:34 Objects:
12/31/1999 00:53:34 Options:
12/31/1999 00:53:34 Server Window Start:   00:00:00 on 01/01/2000
12/31/1999 00:53:34
------------------------------------------------------------
12/31/1999 00:53:34 Schedule will be refreshed in 6 hours.
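
One detail worth checking in the excerpt above: each session header prints
the server's clock next to the client's own log timestamp, and the two are
several minutes apart.  Here is a minimal Python sketch (not part of the ADSM
client) that measures that offset from one header line copied from the log.
The regular expression and the function name clock_offset_seconds are
assumptions made for illustration only.

```python
import re
from datetime import datetime

# Timestamp layout as it appears in the dsmsched.log excerpt: MM/DD/YYYY HH:MM:SS
STAMP = r"\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}"
HEADER = re.compile(rf"^({STAMP})\s+Server date/time: ({STAMP})")
FORMAT = "%m/%d/%Y %H:%M:%S"


def clock_offset_seconds(line):
    """Return server clock minus client clock in seconds, or None if no match."""
    match = HEADER.match(line)
    if match is None:
        return None
    client = datetime.strptime(match.group(1), FORMAT)
    server = datetime.strptime(match.group(2), FORMAT)
    return int((server - client).total_seconds())


# Header line copied from the owning node's log.
line = "12/31/1999 00:53:34   Server date/time: 12/31/1999 00:58:52  Last access:"
offset = clock_offset_seconds(line)
print(offset)
```

For that header line the result is 318 seconds, i.e. the server clock is a
little over five minutes ahead of the client.  Whether that skew has anything
to do with the ANS1814E/ANS1815E failure is not established here.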


================================================================
The non-owning node (found the ADP1-FTPgroup.dsmsched.log file on its
C-drive, since its F: drive was owned by its partner node) got the following
messages...

12/31/1999 00:07:31
Executing scheduled command now.
12/31/1999 00:07:31 Node Name: ADP1-FTPGROUP
12/31/1999 00:07:32 Session established with server GARGOYLE.CSAA.COM:
AIX-RS/6000
12/31/1999 00:07:32   Server Version 3, Release 1, Level 2.40
12/31/1999 00:07:32   Server date/time: 12/31/1999 00:12:49  Last access:
12/30/1999 20:33:36

12/31/1999 00:07:32 --- SCHEDULEREC OBJECT BEGIN DAILY-NT-CLUSTER 12/31/1999
00:00:00
12/31/1999 00:07:32 Incremental backup of volume '\\ADPCLUSTER01\F$'
12/31/1999 00:07:32 ANS1134E Drive \\adpcluster01\f$ is an invalid drive
specification
12/31/1999 00:07:32 --- SCHEDULEREC OBJECT END DAILY-NT-CLUSTER 12/31/1999
00:00:00
12/31/1999 00:07:32 --- SCHEDULEREC STATUS BEGIN DAILY-NT-CLUSTER 12/31/1999
00:00:00
12/31/1999 00:07:32 --- SCHEDULEREC STATUS END DAILY-NT-CLUSTER 12/31/1999
00:00:00
12/31/1999 00:07:32 ANS1512E Scheduled event 'DAILY-NT-CLUSTER' failed.
Return code = 4.
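
To compare the two nodes' logs quickly, the error messages can be pulled out
with a few lines of Python.  This is only a sketch: it assumes the message
IDs always look like "ANS" plus four digits plus one severity letter (E for
error, I for informational), which is what the excerpts show.  The function
name error_codes is made up for this example.

```python
import re

# Assumed message-ID shape, based on the excerpts: ANS + 4 digits + severity letter.
ANS_MESSAGE = re.compile(r"\bANS\d{4}[A-Z]\b")


def error_codes(log_text):
    """Return the ANS message IDs ending in 'E', in order of appearance."""
    return [code for code in ANS_MESSAGE.findall(log_text) if code.endswith("E")]


# Lines copied from the two log excerpts; wrapped lines are left as posted.
excerpt = r"""
12/31/1999 00:07:32 Incremental backup of volume '\\ADPCLUSTER01\F$'
12/31/1999 00:07:32 ANS1134E Drive \\adpcluster01\f$ is an invalid drive
specification
12/31/1999 00:07:32 ANS1512E Scheduled event 'DAILY-NT-CLUSTER' failed.
Return code = 4.
12/31/1999 00:53:34 ANS1483I Schedule log pruning started.
"""

codes = error_codes(excerpt)
print(codes)
```

Run against the full dsmsched.log from each node, this gives a short list to
set side by side; the informational ANS1483I/ANS1484I pruning messages are
filtered out.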


Don France

Technical Architect, P.A.C.E.
San Jose, CA
mailto:dfrance AT pacbell DOT net
PACE - http://www.pacepros.com
Bus-Ph:   (408) 257-3037


> -----Original Message-----
> From: Etienne Brachel [mailto:hc_dmsr AT HOTMAIL DOT COM]
> Sent: Wednesday, November 03, 1999 3:47 AM
> To: ADSM-L AT VM.MARIST DOT EDU
> Subject: Re: Backup an NT Cluster
>
>
> Hi Jorge,
>
> Make sure your default node (machinename) of the backup/archive client
> does not have the option clusternode=yes in it. Make different schedule
> services for the non-cluster volumes and for the clustered volumes.
> Make sure to have a different option file for each of the cluster
> volumes, with the clusternode option set to yes. Make sure that the
> option files exist on the cluster volume for a possible takeover. The
> option file should be readable from both sides. So to install it
> correctly you have to take over the cluster volume (or group) and
> install the same schedule on the other cluster node, to make sure that
> the scheduler service exists there as well.
>
> It's hard to explain it this way, but I hope it rings a bell.
> The redbook is not very clear in its examples. You have to make sure to
> have option files on the clustered volumes and an option file for the
> local drive, and for every option file a separate scheduler service on
> both cluster nodes.
>
> Hope this helps...
>
> regards,
>
> Etienne Brachel
> Touch The Progress services b.v.
> email: e_brachel AT ttp-int DOT com
>
>
> >From: Jorge Carvalho <jorge.carvalho AT BTA DOT PT>
> >Reply-To: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
> >To: ADSM-L AT VM.MARIST DOT EDU
> >Subject: Backup an NT Cluster
> >Date: Tue, 2 Nov 1999 18:15:03 -0000
> >
> >Hi all,
> >
> >I'm having some trouble setting up (errr...) TSM on an NT Cluster. I
> >followed the instructions in the ADSM on NT Cluster Redbook and every
> >time I try to run dsmc I get the error "ANS1155E CLUSTERNODE option is
> >set to YES but cluster is not enabled".
> >
> >I'm sure the cluster server is installed and running. Has anyone seen
> >this message before? Any clues on how to get over it?
> >
> >--
> >Jorge Carvalho
> >Grupo Mundial-Confiança
> >Av. Miguel Bombarda nº 4 2º Andar 1049-058 Lisboa
> >Tel Dir: 7922537 Geral :7922200 Ext:117751
> >Portugal
> >E-mail Jorge.Carvalho AT bta DOT pt
> >
>