Migration from AIX to Linux using Node replication

Hello,

We are doing a migration from AIX to Linux using node replication.
We managed to accomplish this in a DEV environment (TSM source 6.3.4 to TSM target 7.1.9).

However, in the TEST environment, which runs the same versions with just different node names and storage pools, we get an error when we execute "replicate node Node01":

[Attached screenshot: errorrepnode.png — the error returned by "replicate node Node01"]

I tried searching for this error, but found nothing similar or clear.
Today we also tried enabling tracing for the REPL class, but nothing in the output was useful:

17:34:47.612 [8714128][output.c][7531][PutConsoleMsg]:ANR0406I Session 5395365 started for node NODE01 (ICMRM) (Tcp/Ip hostname(47330)).~
17:34:47.644 [8714128][admrepl.c][2510][admIsNodeReplica]:Entry, node=NODE01
17:34:47.644 [8714128][admrepl.c][3563][FetchExtendedAttr]:Starting, node=NODE01, attrName=REPL_STATE
17:34:47.645 [8714128][admrepl.c][3646][FetchExtendedAttr]:Return rc=0
17:34:47.645 [8714128][admrepl.c][3563][FetchExtendedAttr]:Starting, node=NODE01, attrName=REPL_MODE
17:34:47.645 [8714128][admrepl.c][3646][FetchExtendedAttr]:Return rc=0
17:34:47.645 [8714128][admrepl.c][2539][admIsNodeReplica]:Exit, rc=0, *isReplicaP=0
17:34:47.647 [8714128][smnode.c][12280][SmEndVbTxn]:Entering for txn 0:1.1867088022, txnRc 0
17:34:47.647 [8714128][smnode.c][12340][SmEndVbTxn]:Exiting, txnRc 0
17:34:47.651 [8714128][smnode.c][12280][SmEndVbTxn]:Entering for txn 0:1.1867088024, txnRc 0
17:34:47.651 [8714128][smnode.c][12340][SmEndVbTxn]:Exiting, txnRc 0
17:34:47.653 [8714128][smnode.c][12280][SmEndVbTxn]:Entering for txn 0:1.1867088026, txnRc 0
17:34:47.653 [8714128][smnode.c][12340][SmEndVbTxn]:Exiting, txnRc 0
17:34:47.667 [8714128][smnode.c][12280][SmEndVbTxn]:Entering for txn 0:1.1867088027, txnRc 0
17:34:47.667 [8714128][smnode.c][12340][SmEndVbTxn]:Exiting, txnRc 0
17:34:47.668 [8714128][output.c][7531][PutConsoleMsg]:ANR0403I Session 5395365 ended for node NODE01 (ICMRM).

Does anyone have any idea what the problem could be, or where we could get more information to debug it?

Thanks a lot in advance!
 
Just to add, I also tried running the command "validate replication Node01" and got the same error message.
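
For reference, a quick sketch of these checks from the dsmadmc admin CLI on the source server (verifyconnection and query replnode are additional suggestions here, not commands already shown in this thread, so treat them as optional extras):

/* the command that fails with the same message as replicate node */
validate replication Node01
/* optionally also verify connectivity to the target replication server */
validate replication Node01 verifyconnection=yes
/* show what the source and target servers currently report for this node */
query replnode Node01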
 
The error message says there are no eligible filespaces for that node.

Check q node {nodename} f=d and q filespace {nodename} f=d and make sure that replication is enabled for the node and filespaces.
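
Roughly, the checks and the possible fix look like this (a sketch run from dsmadmc on the source server; Node01 and /od/HAA are taken from this thread, the update commands are only needed if something shows as disabled, and the update filespace syntax is from memory, so confirm it with "help update filespace"):

/* check the node's replication state, mode and default rules */
query node Node01 format=detailed
/* check the per-filespace replication rules and their state */
query filespace Node01 format=detailed
/* enable the node for replication if its state is not Enabled */
update node Node01 replstate=enabled
/* (re)set a filespace rule for backup data, e.g. on /od/HAA */
update filespace Node01 /od/HAA datatype=backup replrule=default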
 
Hello,

Thanks for the quick reply.
I had already checked this and everything seems to be enabled. From q node:


Replication State: Enabled
Replication Mode: Send
Backup Replication Rule: DEFAULT
Archive Replication Rule: DEFAULT
Space Management Replication Rule: DEFAULT
Client OS Name: AIX:AIX

Same for the filespaces. I even tried issuing "replicate node Node01 /od/HAA" for just that one filespace and got the same error message, even though the filespace shows:

Node Name: Node01
Filespace Name: /od/HAA
Hexadecimal Filespace Name:
FSID: 1
Platform: CMOD
Filespace Type: API:IBM OnDemand
Is Filespace Unicode?: No
Capacity: 0 KB
Pct Util: 0.0
Last Backup Start Date/Time:
Days Since Last Backup Started:
Last Backup Completion Date/Time:
Days Since Last Backup Completed:
Last Full NAS Image Backup Completion Date/Time:
Days Since Last Full NAS Image Backup Completed:
Last Replication Start Date/Time:
Days Since Last Replication Started:
Last Replication Completion Date/Time:
Days Since Last Replication Completed:
Backup Replication Rule Name: DEFAULT
Backup Replication Rule State: Enabled
Archive Replication Rule Name: DEFAULT
Archive Replication Rule State: Enabled
Space Management Replication Rule Name: DEFAULT
Space Management Replication Rule State: Enabled

Thanks for your help
 
In case it's a defect, you could try upgrading the source server to 7.1.9 to match the target. It's not a requirement that they match, but 6.3.4 is ancient and a lot has been fixed since. If the OS/hardware doesn't support 7.1.9, I'd recommend 6.3.6.100, which is the most current 6.3 level.
 
That could indeed be an option in the future.
We are using AIX 7.1 as source.

However, what we find strange is that we have the same versions on another server, also AIX 7.1 to RHEL 7, and there the replication worked and we never had this issue. So we don't think it is really a defect, even if upgrading might solve it.

Do you know any other way we could trace this error message besides trace enable with the REPL class?

Thank you
 
We are using AIX 7.1 as source.
Ok, you said 6.3.4 in your original post.

Do you know any other way we could trace this error message besides trace enable with the REPL class?
Tracing is the easy part; analyzing the trace is where it gets tricky, unless it contains an error in plain English.
A couple of odd things about the trace: ANR2032E and ANR9999D don't show up in it, even though those messages are the source of the failure.

Also, since an internal server error was detected, that points to a problem either in the code or in the data stored in the DB. Personally, I'd upgrade first to rule out a software defect before spending time troubleshooting something that may well turn out to be one.
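
If you do want to dig further, the usual pattern is to write the trace to a file around the failing command and then pull the ANR9999D detail out of the activity log (a sketch; the trace file path is just an example, and REPL is simply the class already mentioned above):

/* on the source server, from dsmadmc */
trace enable repl
trace begin /tmp/repl_trace.out
replicate node Node01
trace end
trace disable
/* the ANR2032E/ANR9999D text itself should be in the activity log rather than the trace */
query actlog begindate=today search=ANR9999D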
 
Hello,

Just to let you know that the root cause of this issue was that the REPLICATION_RULES table in the DB was empty.
Re-importing the original values from another TSM instance fixed the issue.
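
For anyone who hits the same thing, a quick way to spot it (a sketch; the select assumes the table is visible through the server's SQL interface, otherwise it has to be checked directly in DB2):

/* should list the server's predefined replication rules */
query replrule
/* or look at the underlying table; an empty result matches the symptom above */
select * from replication_rules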

Thanks
 