Cluster schedule problem

TomLa (ADSM.ORG Member, joined Apr 10, 2007)
Hi,
I am new to TSM.
I have a problem with the two schedules on my machine:
only one is running.
I can run a backup manually with dsmc inc.
I start two dsmcad daemons with different environments.
q sched in dsmc looks OK.
I can't find any errors in the logs.
Any ideas or tests that I can try?



Here are my two dsm.opt files:
SErvername NSC_TSM1_node
and
SErvername NSC_TSM1
And here is my dsm.sys:
SErvername NSC_TSM1_node
COMMMethod TCPip
TCPPort 1500
TCPServeraddress 130.236.101.104

SCHEDMODe POlling

ERRORLOGname /var/log/dsmerror.log
ERRORLOGRetention 60 S

SCHEDLOGName /var/log/dsmsched.log
SCHEDLOGRetention 14 S
PASSWORDACCESS generate
PASSWORDDIR /opt/tivoli/tsm/client/ba/password/

QUERYSCHedperiod 1
MANAGEDServices Schedule

INCLEXCL /opt/tivoli/tsm/client/ba/bin/include_exclude
RESOURceutilization 5
TCPWindowsize 256
DISKBuffsize 256
NODEname lxserv72.smhi.se

SErvername NSC_TSM1
COMMMethod TCPip
TCPPort 1500
TCPServeraddress 130.236.101.104

SCHEDMODe POlling
SCHEDLOGName /data/arkiv11/nsc/tivoli/tsm/log/dsmsched.log
SCHEDLOGRetention 14 S

ERRORLOGname /data/arkiv11/nsc/tivoli/tsm/log/dsmerror.log
ERRORLOGRetention 60 S

PASSWORDACCESS generate
PASSWORDDIR /data/arkiv11/nsc/tivoli/tsm/password/

QUERYSCHedperiod 1
MANAGEDServices Schedule

domain /data/arkivu11 /data/arkiv11 /data/arkiv12 /data/arkiv13 /data/arkiv14
RESOURceutilization 5
TCPWindowsize 256
DISKBuffsize 256
NODEname fs6.smhi.se



regards
tom
 
Can you be more specific? For example:

- What OS are you running your nodes on?
- You mentioned "cluster"; please elaborate further.
 
OK:
Two Linux Red Hat ES4 machines with SAN disk.
The SAN disk switches between the two nodes, and there is an IP number for the cluster name.
The local name is lxserv72 and the IP-alias name is fs6.
So the machine has two IP numbers.

/tom
 
OK. I can't really make heads or tails of the dsm.opt and dsm.sys listings above.

Let's give it a go:

1. This is a two-node Linux cluster in an active-passive configuration, right?
2. From what I gather, each node has two schedulers, is this correct? Is one scheduler for the local file systems and the other for the shared environment?
3. The two IP addresses of each machine: is one (IP 1) for the LAN, and the other (IP 2) for the shared-resource communication?

If all of the above assumptions are correct, and following the usual clustered-environment configuration, the active node backs up all of the shared resources while the passive node stays idle. Is this what you are trying to achieve? If so, the shared-resource backup scheduler should not run while the node is passive; when the node is the active one, both schedulers should run.
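On Linux, that start/stop decision could be sketched roughly like this; the mount-point test and the paths are my assumptions, not something from this thread:

```shell
#!/bin/sh
# Sketch: start the shared-data scheduler only on the node that currently
# owns the SAN file system (i.e. the active node). Paths are assumptions.

# True only on the node where the shared resource is actually mounted.
is_active_node() {
    mountpoint -q "$1"
}

SHARED_MOUNT=/data/arkiv11
if is_active_node "$SHARED_MOUNT"; then
    # Active node: also run the shared-data scheduler.
    DSM_CONFIG="$SHARED_MOUNT/nsc/tivoli/tsm/dsm.opt" dsmcad
fi
# The local-file-system scheduler runs on every node regardless.
```

A cluster start/stop script along these lines would then be hooked into the failover framework so the shared scheduler follows the SAN resource.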

I know this is very vague. Could you outline, per node, what has been defined? Detail the dsm.opt/dsm.sys per node clearly if you can.
 
You may also take a look at the proxynode functionality. It makes handling the schedulers in a cluster a lot easier, especially if you don't want to fumble around with start/stop scripts for schedulers and their possible exceptions in failover situations.
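For reference, proxynode is granted on the TSM server and used from the client with the asnodename option; a sketch using the node names from this thread:

```
/* On the TSM server: allow the physical node to act on behalf */
/* of the cluster node */
GRANT PROXYNODE TARGET=fs6.smhi.se AGENT=lxserv72.smhi.se

/* In the client's dsm.sys stanza: back up as the cluster node */
ASNODENAME fs6.smhi.se
```

With this in place, one client scheduler on the active node can store the shared data under the cluster node name, regardless of which physical node runs it.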

PJ
 
Hi,
Thanks for the response.
moon-buddy, you got it right.
It's a failover cluster with two schedulers on the active node.
When I start both dsmcad daemons, I only get a backup of the local disk,
not of the shared data.
Last night I started just the dsmcad for the shared data,
and I got a backup of the shared data, not of the local disk.
I don't know what proxynode is, so I have to look that up.

To clarify a little more:
Node fs6.smhi.se is the alias node.
Node lxserv72.smhi.se is the name of the local machine.
I use servername NSC_TSM1 for the shared data.
I use servername NSC_TSM1_node for the local machine.
I start the two dsmcad daemons with different DSM_CONFIG environments.
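The two-daemon startup might look like this; the .opt paths here are placeholders I made up, substitute the paths of your two dsm.opt files:

```shell
# Scheduler for the local file systems (stanza SErvername NSC_TSM1_node):
DSM_CONFIG=/path/to/dsm_local.opt dsmcad

# Scheduler for the shared SAN data (stanza SErvername NSC_TSM1):
DSM_CONFIG=/path/to/dsm_shared.opt dsmcad
```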

/tom
 
Tom,

I have had a similar experience when starting two instances of dsmcad: starting the second would bomb out both TSM schedulers and the already-running dsmcad process.

From what I see, it looks like you have two dsm.sys stanzas (I hope I am reading it right): one for the local data, the other for the shared data.

Have you tried specifying the HTTP port for the dsm agent? By default it is 1581 (the local system); the second instance should be 1582 (the shared resource). This change has to be spelled out in the dsm.sys stanzas.
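In other words, each stanza would carry its own HTTPPORT line, something like this sketch based on the stanzas quoted earlier (the ... stands for the rest of each stanza):

```
SErvername NSC_TSM1_node
...
HTTPPORT 1581

SErvername NSC_TSM1
...
HTTPPORT 1582
```

That way the two dsmcad instances don't fight over the same acceptor port.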
 