ADSM-L

Re: Authentication problems :^(

2005-12-22 13:48:39
Subject: Re: Authentication problems :^(
From: "Bell, Charles (Chip)" <Chip.Bell AT BHSALA DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Thu, 22 Dec 2005 12:48:00 -0600
Oh, I should've also said that we have had many downtimes with many
failovers (not TOO many), and I haven't seen this problem. Thanks for
the help though...  :^)

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
John Monahan
Sent: Thursday, December 22, 2005 12:31 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] Authentication problems :^(

How many times have you failed over in those 2 years?  It always works
great if you never fail over :-)


______________________________
John Monahan
Senior Consultant Enterprise Solutions
Computech Resources, Inc.
Office: 952-833-0930 ext 109
Cell: 952-221-6938
http://www.computechresources.com

From: "Bell, Charles (Chip)" <Chip.Bell AT BHSALA DOT COM>
Sent by: "ADSM: Dist Stor Manager"
To: ADSM-L AT VM.MARIST DOT EDU
Date: 12/22/2005 12:21 PM
Subject: Re: Authentication problems  :^(
Please respond to: "ADSM: Dist Stor Manager"

Well, I figure they were set up properly, because they have been working
properly before this most recent failover. By the way...

TSM server, AIX 5.2, TSM v5.3.1.2
TSM client v5.3.0, W2K

I mean, if it was not done right from the start, why would it have
worked this long (2+ years)?

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
John Monahan
Sent: Thursday, December 22, 2005 12:12 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] Authentication problems :^(

"ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU> wrote on 12/22/2005
11:53:11 AM:

> In a MSCS cluster, an admin of one of our higher profile client machines
> failed over from one machine (OLALPHA) back to the other (OLBRAVO) after
> BRAVO crashed this morning.
>
> Since then I've been having a devil of a time with a MSCS cluster
> resource that serves as the scheduler for the cluster drive on BRAVO not
> coming up. To begin with, it posted ANS1835E, ANS1025E, ANS1570E, all of
> which point to authentication problems. I updated the node password,
> issued a 'q ses -optfile...', and it would authenticate fine. When I try
> to bring the cluster resource back online, it stays up for a few seconds,
> fails, and when I check the registry, the password has disappeared! What
> in the world? It has also posted ANS1029E and ANS2050E since I've been
> playing around trying to get the cluster resource to work, and also the
> base client (to back up C/D/system state) has been issuing ANS1977E with
> the "ccCreateTimerFile: Unable to create timer file" and "errno=13
> error: Permission denied".
>

It sounds like the services weren't set up properly from the start or the
service password somehow got out of sync.  When setting up the services in
the cluster, it is very important to fully set them up on each node of the
cluster and be sure they are working BEFORE setting up the service in the
cluster manager.  I think your only solution is to remove the service from
the cluster configuration, then remove and set up the services again on one
node, restart the service several times and make sure it works OK.  Then
fail over to the other node and repeat.  Once you are sure both work, add
the service back in to the cluster, and make sure you get the right registry
key set up to replicate during failover.  Fail back and forth a couple of
times to make sure all is working properly.

The big drawback here is that you will need to do this during downtime when
you can fail over nodes quite a few times.  That is why it is so important
to ensure it is done right from the start.

Every time I have seen the disappearing password in a cluster it was
because the services weren't set up right initially or fully before
configuring them in the cluster.  In one rare case, special characters in
the node password also caused a problem and the password wouldn't replicate
properly.  For this reason I always use only letters or numbers in cluster
node passwords (no underscores, dashes, etc.).
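
For anyone who wants to check a candidate password against the
letters-and-numbers rule before setting it, here is a minimal sketch.
It is not part of TSM or any IBM tooling; the function name
is_cluster_safe_password is hypothetical, and it assumes "letters or
numbers" means plain ASCII A-Z, a-z and 0-9.

```python
import string

# Assumption: only plain ASCII letters and digits are considered safe.
_ALLOWED = set(string.ascii_letters + string.digits)


def is_cluster_safe_password(password: str) -> bool:
    """Return True if the password is non-empty and purely alphanumeric.

    Hypothetical helper for illustration only; it does not talk to TSM.
    """
    return bool(password) and all(ch in _ALLOWED for ch in password)


if __name__ == "__main__":
    # Example candidates (made up for this sketch).
    for candidate in ("Backup2005", "back_up-2005"):
        print(candidate, is_cluster_safe_password(candidate))
```

A password that fails this check is not necessarily broken in TSM; the
check only enforces the conservative habit described above.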
