
ISP - Node Replication hangs

Discussion in 'TSM Operation' started by kslztc, Aug 30, 2017.

  1. kslztc

    Hi,

    We have a TSM server running ISP 8.1.1.0 on WS2016, with approximately 500-600 nodes - Exchange, File, SQL, etc.

    Each day we replicate all the nodes' filespaces from the primary to a secondary server.
    Within the last week, one node's filespace started stalling the TSM server, blocking our daily script from continuing.

    The node is an Exchange server containing approx 4TB of data, with a 1 month retention.
    I've queried replication for the node, and from what I can see, the replication should have finished - all the files are replicated, but the filespace replication is still in "Incomplete" state.

    I've attached 2 pictures of the q replication command output - hopefully someone can tell me what is going wrong :)

    Any advice would be greatly appreciated!
     

  3. inthesun

    Hi,

    Since only one node appears to fail or skip objects during REPLICATE NODE, you may want to remove it from the node group and run it by itself. If you post the actlog messages from the end of this node's replication, we may be able to see exactly what is happening.
     
  4. marclant

    Inthesun's suggestion to isolate that node is a good idea as well. Is that node your largest node? If so, how much larger is it than the 2nd largest node? If unsure, you can use this query to see your top 5 largest nodes:
    Code:
    select node_name,sum(reporting_mb) as MB from occupancy where node_name!='' group by node_name order by sum(reporting_mb) desc fetch first 5 rows only
    Sometimes what appears to be a hang is actually a performance problem: processing is slow enough that it looks hung. To tell the two apart, run QUERY PROCESS and QUERY SESSION every 20 minutes for a few hours and watch the numbers for the replication process and the session to the target server. If the bytes or objects change, it's not hung and you're probably looking at a performance issue.
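
    To make the hung-versus-slow check concrete, here is a minimal Python sketch. It assumes you note down the byte and object counts yourself from each QUERY PROCESS reading; the `Sample` class, its field names, and the labels returned by `classify` are illustrative only and are not part of any Spectrum Protect tooling.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One manual reading of the replication process (hypothetical structure)."""
    minutes: float    # minutes elapsed since the first reading
    bytes_done: int   # bytes replicated so far, as noted from QUERY PROCESS
    files_done: int   # objects replicated so far, as noted from QUERY PROCESS


def classify(samples, min_samples=3):
    """Return 'hung', 'slow-or-running', or 'unknown' for a series of readings."""
    if len(samples) < min_samples:
        return "unknown"  # too few readings to draw a conclusion
    first, last = samples[0], samples[-1]
    moved = (last.bytes_done > first.bytes_done
             or last.files_done > first.files_done)
    return "slow-or-running" if moved else "hung"


def throughput_mb_per_hour(samples):
    """Average MB per hour between the first and last reading."""
    elapsed_hours = (samples[-1].minutes - samples[0].minutes) / 60
    if elapsed_hours <= 0:
        return 0.0
    delta_bytes = samples[-1].bytes_done - samples[0].bytes_done
    return delta_bytes / (1024 * 1024) / elapsed_hours
```

    If the counters move at all between readings, the process is working and the throughput figure gives a rough idea of how long the remaining data would take.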

    Is the data in a container pool or a traditional pool? If it is in a container pool, you should run PROTECT STGPOOL before REPLICATE NODE. Protect is more efficient at copying the data to the target, so replicate then only needs to replicate the metadata. Ignore this paragraph if you are not using a container pool.
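
    As a rough illustration of that ordering, the Python sketch below only builds the command strings for a daily run; it does not talk to the server. The pool and node group names are placeholders, and WAIT=YES is added on the assumption that the commands run from a script where each step should finish before the next one starts.

```python
def daily_replication_commands(container_pools, node_group):
    """Build admin commands in order: protect every container pool, then replicate."""
    # Protect each container pool first so the bulk data is already on the target.
    commands = [f"PROTECT STGPOOL {pool} WAIT=YES" for pool in container_pools]
    # Replicate last; for protected container pools this is mostly metadata.
    commands.append(f"REPLICATE NODE {node_group} WAIT=YES")
    return commands


if __name__ == "__main__":
    # CONTPOOL1 and ALLNODES are hypothetical names, not taken from the thread.
    for command in daily_replication_commands(["CONTPOOL1"], "ALLNODES"):
        print(command)
```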
     
