c.j.hund
ADSM.ORG Senior Member
Hi all,
Information about this environment:
TSM Servers - AIX v7.1, TSM v7.1.7.100
TSM Clients - Linux RH - 2.6.32-696.16.1.el6, TSM 7.1.4.1
Every so often, at seemingly random intervals, large groups of my Linux clients miss their scheduled backups. The backups might run fine for two weeks straight, then we'll have a day with 60 misses. This only happens with the Linux clients; the Windows clients in this environment do not seem to be affected. It's not always the same group of clients, but to get them working again we have to restart the scheduler service. Sometimes we'll restart the scheduler service on 60 Linux clients, and the next day a different group of 60 will miss, while the clients we restarted the day before run just fine.
Some of the more important points:
- All these Linux clients are using the scheduler service, not the CAD.
- When the misses occur, it always seems to happen for a group of clients in the same schedule, at the same time. Often the Linux clients themselves are in the same subnet with similar IPs.
- There's not much information in the client dsmerror.log file - all we see are messages like these:
12/21/17 09:17:33 ANS5216E Could not establish a TCP/IP connection with address 'X.X.X.X:X'. The TCP/IP error is 'Connection timed out' (errno = 110).
12/21/17 09:17:33 ANS9020E A session could not be established with a TSM server or client agent. The TSM return code is -50.
12/21/17 09:17:33 ANS2106I Connection to primary TSM server XXX failed
- We are not running out of sessions on the TSM server.
- There doesn't appear to be anything in the TSM server's error log which would indicate a problem.
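In case it helps anyone line up the failures across clients, here is a minimal Python sketch that counts ANS5216E entries per minute in a dsmerror.log. It is only an illustration, not something we run in production: the function names `parse_line` and `count_by_minute` are made up for this example, and it assumes the log lines look like the ones quoted above (MM/DD/YY date, 24-hour time, then the message code).

```python
import re
from collections import Counter
from datetime import datetime

# Assumed line layout, based on the excerpts above:
#   12/21/17 09:17:33 ANS5216E Could not establish a TCP/IP connection ...
LINE_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (ANS\d{4}[A-Z]) (.*)$"
)


def parse_line(line):
    """Return (timestamp, code, message), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    # Assumption: the client writes dates as MM/DD/YY.
    stamp = datetime.strptime(match.group(1), "%m/%d/%y %H:%M:%S")
    return stamp, match.group(2), match.group(3)


def count_by_minute(lines, code="ANS5216E"):
    """Count occurrences of one message code, bucketed by minute."""
    counts = Counter()
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None and parsed[1] == code:
            counts[parsed[0].replace(second=0)] += 1
    return counts
```

Passing an open dsmerror.log file object to `count_by_minute` on each affected client and comparing the resulting minutes should show whether the timeouts start at the same moment across a subnet, which is what the pattern described above suggests.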
Any insights on what might be causing this would be welcomed.
Thank you,
C.J.