Groups of missed backups

Discussion in 'Backup / Archive Discussion' started by c.j.hund, Jan 8, 2018.

  1. c.j.hund

    c.j.hund ADSM.ORG Member

    Joined:
    Jun 22, 2005
    Messages:
    244
    Likes Received:
    1
    Occupation:
    Tivoli Admin
    Hi all,

    Information about this environment:
    TSM Servers - AIX v7.1, TSM v7.1.7.100
    TSM Clients - Linux RH - 2.6.32-696.16.1.el6, TSM 7.1.4.1

    Every so often, seemingly at random intervals, large groups of my Linux clients will miss their scheduled backup. The backups might run great for two weeks straight, then we'll have a day with 60 misses. This only happens with the Linux clients. I have Windows clients in this environment as well, and they do not seem to be affected. It's not always the same group of clients, but in order to get them operating again we are forced to restart the scheduler service. Sometimes, we'll restart the scheduler service on 60 Linux clients, then the next day a different group of 60 will miss, and the clients which we restarted scheduler services on the day before run just fine.

    Some of the more important points:
    • All these Linux clients are using the scheduler service, not the CAD.
    • When the misses occur, it always seems to happen for a group of clients in the same schedule, at the same time. Often the Linux clients themselves are in the same subnet with similar IPs.
    • There's not much information in the client dsmerror.log file - all we see are messages like these:
      12/21/17 09:17:33 ANS5216E Could not establish a TCP/IP connection with address 'X.X.X.X:X'. The TCP/IP error is 'Connection timed out' (errno = 110).
      12/21/17 09:17:33 ANS9020E A session could not be established with a TSM server or client agent. The TSM return code is -50.
      12/21/17 09:17:33 ANS2106I Connection to primary TSM server XXX failed
    • We are not running out of sessions on the TSM server.
    • There doesn't appear to be anything in the TSM server's error log which would indicate a problem.
    Is this why it's always recommended to use the CAD? The problem comes up randomly, so it's hard to nail down. Something is preventing the TCP/IP connection, however, and it feels like a network issue.
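
    Since the misses seem to cluster by subnet, one way to test the network theory is to bucket the addresses of the clients that missed by network and see whether a few subnets dominate. The following is only a rough sketch: the function name `group_by_subnet`, the /24 prefix length, and the sample addresses are assumptions for illustration, not details from this environment.

```python
import ipaddress
from collections import defaultdict


def group_by_subnet(addresses, prefix_len=24):
    """Group IPv4 address strings by their enclosing network.

    prefix_len=24 is an assumption; use the real subnet mask of the site.
    """
    groups = defaultdict(list)
    for addr in addresses:
        # strict=False accepts a host address and returns its network
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        groups[str(net)].append(addr)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical addresses of clients that missed a schedule
    missed = ["10.10.20.11", "10.10.20.12", "10.10.30.5", "10.10.20.40"]
    by_net = group_by_subnet(missed)
    # Largest groups first
    for net, hosts in sorted(by_net.items(), key=lambda kv: -len(kv[1])):
        print(f"{net}: {len(hosts)} missed ({', '.join(hosts)})")
```

    If the same one or two networks show up at the top on every bad day, that would point at a switch, firewall, or route serving those subnets rather than at the scheduler itself.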

    Any insights on what might be causing this would be welcomed.

    Thank you,
    C.J.
     
  3. c.j.hund

    Here's the exact message recorded in my Linux client's error log file:


    01/07/18 01:34:13 ANS9020E A session could not be established with a TSM server or client agent. The TSM return code is -53.
    01/07/18 01:34:13 ANS2106I Connection to primary TSM server XXX failed
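
    To compare many clients at once, the dsmerror.log lines can be tallied by hour, message ID, and return code. This is a minimal sketch that assumes every line follows the "MM/DD/YY HH:MM:SS ANSnnnnX text" layout of the messages quoted in this thread; `parse_line` and `summarize` are hypothetical helper names, and real logs may contain lines in other layouts, which this sketch simply skips.

```python
import re
from collections import Counter
from datetime import datetime

# Layout assumed from the messages quoted in this thread
LINE_RE = re.compile(
    r"^\s*(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (ANS\d{4}[A-Z]) (.*)$"
)
RC_RE = re.compile(r"return code is (-?\d+)")


def parse_line(line):
    """Return (timestamp, message_id, return_code or None), or None if no match."""
    m = LINE_RE.match(line)
    if not m:
        return None
    ts = datetime.strptime(m.group(1), "%m/%d/%y %H:%M:%S")
    rc = RC_RE.search(m.group(3))
    return ts, m.group(2), int(rc.group(1)) if rc else None


def summarize(lines):
    """Count log entries per (hour, message_id, return_code)."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed:
            ts, msg_id, rc = parsed
            counts[(ts.strftime("%Y-%m-%d %H:00"), msg_id, rc)] += 1
    return counts
```

    Feeding it the error logs collected from several clients, for example `summarize(open(path))` per file, makes it easier to see whether the failures on a bad night share the same hour and return code.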
     
  4. RajeshR

    RajeshR ADSM.ORG Member

    Joined:
    May 11, 2016
    Messages:
    60
    Likes Received:
    2
    Client side, in dsm.sys:
    1. Check whether TCPSERVERADDRESS is set to a host name or an IP address (better to use an IP address).
    2. Set both TCPPORT and TCPCLIENTPORT, using different ports.
    For example:
    TCPPort 1500
    TCPCLIENTPort 1501
    TCPServeraddress 10.10.10.10
    TCPCLIENTAddress 10.10.10.101

    Check SCHEDMODE as well.
    Before using specific ports, check for any port restrictions.
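
    To check for port restrictions from an affected client, a plain TCP connect test against the server address and port can help separate a network problem from a scheduler problem. The sketch below is only illustrative: `probe` is a hypothetical helper, the 5-second timeout is an arbitrary choice, and the address and port in the usage note are the example values from the settings above, not real ones.

```python
import errno
import socket


def probe(host, port, timeout=5.0):
    """Attempt one TCP connection and return (ok, detail)."""
    try:
        # create_connection resolves the host and connects with a timeout
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except socket.timeout:
        return False, "timed out"
    except OSError as exc:
        # Map the numeric errno (for example 110 or 111 on Linux) to its name
        name = errno.errorcode.get(exc.errno, "unknown") if exc.errno else "unknown"
        return False, f"{name}: {exc}"
```

    For example, `probe("10.10.10.10", 1500)` run from a client at the time its schedule is missed shows whether the server port is reachable at all. A timeout there, with the scheduler not involved, would suggest a firewall or routing problem on the path.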

    On the TSM server side:
    1. Check the Schedule Randomization Percentage.
    2. Check Maximum Sessions and Maximum Scheduled Sessions.
    3. Check each schedule's priority and update it as required.
     
