ADSM-L

Re: URGENT:ANS9455E dsmwatchd: Unable to join the local failover group with rc=3!

2002-12-27 08:35:11
From: "Cook, Dwight E" <DWIGHT.E.COOK AT SAIC DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Fri, 27 Dec 2002 05:29:35 -0800
Hmmm...
It might help to provide some more configuration information.
I'm getting the picture of SP nodes, two of them set up in a "high
availability" configuration?

For example purposes, let's say there are 3 boxes: box1, box2, box3.
box1's hostname is "TSM"
        it has only one schedule (or does it have more?) with its TSM server
        (which is itself)
        it always gets a "Missed" for its scheduled event
box2 always has "Completed" event(s)
box3 always has "Completed" event(s)

Is box2 or box3 a failover node of box1?

I'm not familiar with "High Availability" configurations but it seems that
maybe the TSM Software either thinks you are running HA but you're not OR
the TSM Software is trying to link up to the failover node and can't.

Might be that TSM is trying to tell you something elsewhere isn't quite
right...

AND are you running TSM's HSM on box1? (The message hints that HSM is running.)
Just my own thoughts, but IF you have a node that is a TSM client of itself,
it IS NOT WISE to have HSM active on it!
What if HSM migrates your DEVCONFIG file? Or your DSMSERV.OPT file? Just to
name a couple...
        (Mistakes are possible when configuring HSM, and someone somewhere
        might include something you don't want included.)

I think this problem might be out of my ballpark but I'll always offer my 2
cents worth...

Dwight


-----Original Message-----
From: rachida elouaraini [mailto:rachida.elouaraini AT CARAMAIL DOT COM]
Sent: Friday, December 27, 2002 5:59 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: URGENT:ANS9455E dsmwatchd: Unable to join the local failover
group with rc=3!


Hi all,
Your help is badly needed.
The status of the "CLIENT.BACKUP" schedule is Missed for ONLY ONE TSM client:
the TSM server itself (the status for the other TSM clients is Completed).
I have other schedules associated with this node (the TSM server), and their
status is always "MISSED".
I have done what Dwight told me (verified that the scheduler is running,
stopped it and restarted it), but in vain.
The following message is written EVERY 6 minutes to the dsmerror.log file:

ANS9455E dsmwatchd: Unable to join the local failover group with rc=3!

I also find messages like:

12/26/02   20:23:08 TcpRead(): recv(): errno = 73
12/26/02   20:23:08 sessRecvVerb: Error -50 from call to 'readRtn'.
12/26/02   20:23:08 ANS1809E Session is lost; initializing session reopen procedure.
12/26/02   20:23:09 ANS1809E Session is lost; initializing session reopen procedure.
12/26/02   20:23:23 ANS1810E TSM session has been reestablished.
12/27/02   09:51:47 ANS9433E dsmmigfs: dm_send_msg failed with errno 22.
12/27/02   09:51:47 ANS9402E dsmmigfs: Cannot notify dsmwatchd to recover HSM operations on a failure node.
12/27/02   09:51:47 ANS9425E dsmmigfs: It was not possible to notify the dsmwatchd in order to distribute a message within the failover group. The data of the current operation may get lost.

What do these messages mean?
Any help is much appreciated.
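
[Editor's sketch, not part of the original message.] When sorting out a dsmerror.log like the one quoted above, it can help to tally entries by message ID and check how regularly a given message repeats (for instance, the "every 6 minutes" ANS9455E). The Python below is a minimal sketch under stated assumptions: the "MM/DD/YY   HH:MM:SS text" layout and the "ANSnnnnX" message-ID shape are inferred only from the excerpt in this post, and the names parse_log, count_ids, and gaps_seconds are made up for illustration. It is not a TSM tool.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed layout, inferred from the excerpt: "MM/DD/YY   HH:MM:SS text..."
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{2})\s+(\d{2}:\d{2}:\d{2})\s+(.*)$")
# Assumed message-ID shape, as seen in the excerpt (ANS1809E, ANS9455E, ...)
MSG_ID_RE = re.compile(r"\b(ANS\d{4}[A-Z])\b")


def parse_log(lines):
    """Return a list of (timestamp, message_id_or_None, text) tuples.

    Lines without a leading timestamp are treated as continuations of the
    previous entry, since long entries may be wrapped.
    """
    raw_entries = []
    for raw in lines:
        line = raw.rstrip("\n")
        m = LINE_RE.match(line)
        if m:
            ts = datetime.strptime(f"{m.group(1)} {m.group(2)}",
                                   "%m/%d/%y %H:%M:%S")
            raw_entries.append([ts, m.group(3)])
        elif raw_entries and line.strip():
            raw_entries[-1][1] += " " + line.strip()

    entries = []
    for ts, text in raw_entries:
        found = MSG_ID_RE.search(text)
        entries.append((ts, found.group(1) if found else None, text))
    return entries


def count_ids(entries):
    """Count how often each message ID occurs."""
    return Counter(mid for _, mid, _ in entries if mid)


def gaps_seconds(entries, msg_id):
    """Seconds between consecutive occurrences of one message ID."""
    times = [ts for ts, mid, _ in entries if mid == msg_id]
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


# Sample input: the lines quoted in this post, wrapped as they were mailed.
SAMPLE = """\
12/26/02   20:23:08 TcpRead(): recv(): errno = 73
12/26/02   20:23:08 sessRecvVerb: Error -50 from call to 'readRtn'.
12/26/02   20:23:08 ANS1809E Session is lost; initializing session reopen
procedure.
12/26/02   20:23:09 ANS1809E Session is lost; initializing session reopen
procedure.
12/26/02   20:23:23 ANS1810E TSM session has been reestablished.
12/27/02   09:51:47 ANS9433E dsmmigfs: dm_send_msg failed with errno 22.
12/27/02   09:51:47 ANS9402E dsmmigfs: Cannot notify dsmwatchd to recover
HSM
operations on a failure node.
"""

entries = parse_log(SAMPLE.splitlines())
counts = count_ids(entries)

for msg_id, n in sorted(counts.items()):
    print(msg_id, n)
print("ANS1809E gaps (s):", gaps_seconds(entries, "ANS1809E"))
```

To run it against a real log, pass the open file instead of SAMPLE.splitlines(), then call gaps_seconds(entries, "ANS9455E") to see whether the failover-group message really recurs at a fixed interval. That assumes the real log timestamps those entries the same way as the lines quoted here.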


