ANS1017E Session rejected: TCP/IP connection failure

biscuitman

:) Hi - hopefully someone can help me ...
I'm a relative newbie to TSM and have encountered an intermittent problem where a server/client connection fails with the following dsmerror.log messages:

04/03/2009 20:30:17 ANS5216E Could not establish a TCP/IP connection with address '10.135.65.227:1500'. The TCP/IP error is 'Unknown error' (errno = 10055).
04/03/2009 20:30:17 ANS4039E Could not establish a session with a TSM server or client agent. The TSM return code is -50.
04/03/2009 20:30:17 ANS1017E Session rejected: TCP/IP connection failure
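
A note on the errno in the ANS5216E line: on Windows, 10055 is the Winsock code WSAENOBUFS ("no buffer space available"), which the client prints here as 'Unknown error'. That suggests a resource shortage on the client itself rather than a problem on the wire. Below is a minimal Python sketch for pulling the address and errno out of dsmerror.log lines. The regex, the function name `decode_line` and the small lookup table are illustrative assumptions, not part of TSM, and the table is deliberately partial.

```python
import re

# Partial lookup of Winsock error codes (values as defined in the Windows SDK).
WINSOCK_ERRORS = {
    10053: "WSAECONNABORTED",
    10054: "WSAECONNRESET",
    10055: "WSAENOBUFS",
    10060: "WSAETIMEDOUT",
    10061: "WSAECONNREFUSED",
}

# Assumed line layout, based only on the ANS5216E message quoted in this thread.
ANS5216E = re.compile(
    r"ANS5216E .*?address '(?P<addr>[^']+)'.*?\(errno = (?P<errno>\d+)\)"
)


def decode_line(line):
    """Return (address, errno, symbolic name) for an ANS5216E line, else None."""
    match = ANS5216E.search(line)
    if match is None:
        return None
    code = int(match.group("errno"))
    return match.group("addr"), code, WINSOCK_ERRORS.get(code, "unknown")


sample = (
    "04/03/2009 20:30:17 ANS5216E Could not establish a TCP/IP connection "
    "with address '10.135.65.227:1500'. The TCP/IP error is 'Unknown error' "
    "(errno = 10055)."
)
print(decode_line(sample))
```

Running the same function over every line of dsmerror.log on each affected client would show whether they all report the same errno.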

The clients and the server (AIX 5.3, TSM 5.4.0.3) are on the same LAN with no firewall between them. The only clients we have a problem with are Windows (OK, I lie: one incident was a Netware 6.5 server). The problem has occurred mainly on IBM x3650 servers, across 12-14 different client machines, and also on 1x IBM x345 and 1x IBM x346. I have checked all the usual (ports 1500 / 1501; can ping / telnet etc. between the clients and the server). When we experience this issue we cannot open a command-line session from the client to the server. It all smacks of the clients dropping out of the domain, but they haven't, or of a network issue, but that isn't the case either, as other clients are being backed up at the same time. A reboot of the client resolves it, but naturally this isn't the best solution for production machines.
The clients affected are a mixture of TSM 5234 & 5413.

Our total environment consists of 100+ clients, a mixture of AIX / Windows / Netware / Linux, plus TDP clients for SAP R3 / Linux and Oracle. We back up on average 65-70 clients per night.

Typical Windows dsm.opt file ...
PASSWORDACCESS GENERATE
DOMAIN C:
DOMAIN D:
TCPSERVERADDRESS 10.135.xx.xxx
LANG AMENG
Nodename @@@@@@
TCPPORT 1500
TCPBUFFSIZE 32
TCPWINDOWSIZE 32
TXNByteLimit 2048
SchedLogR 7
ErrorLogR 14
SchedMode Prompted
RESOURCEUTILIZATION 5
MANAGEDSERVICES WEBCLIENT SCHEDULE

plus the usual incls/excls ...

dsmsched.log ...
04/02/2009 20:31:54 Server date/time: 04/02/2009 20:31:53 Last access: 04/02/2009 20:31:53
04/02/2009 20:31:54 --- SCHEDULEREC QUERY BEGIN
04/02/2009 20:31:54 --- SCHEDULEREC QUERY END
04/02/2009 20:31:54 Next operation scheduled:
04/02/2009 20:31:54 ------------------------------------------------------------
04/02/2009 20:31:54 Schedule Name: XX_DAILY_1
04/02/2009 20:31:54 Action: Incremental
04/02/2009 20:31:54 Objects:
04/02/2009 20:31:54 Options:
04/02/2009 20:31:54 Server Window Start: 20:30:00 on 04/03/2009
04/02/2009 20:31:54 ------------------------------------------------------------
04/02/2009 20:31:54 Scheduler has been stopped.
04/03/2009 20:30:17 Scheduler has been started by Dsmcad.
04/03/2009 20:30:17 Querying server for next scheduled event.
04/03/2009 20:30:17 Node Name: @@@@@
04/03/2009 20:30:17 ANS1017E Session rejected: TCP/IP connection failure
04/03/2009 20:30:17 Will attempt to get schedule from server again in 20 minutes.
04/03/2009 20:50:17 Querying server for next scheduled event.
04/03/2009 20:50:17 Node Name: @@@@@

This continues until the client server is rebooted and the next schedule is picked up ...
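
To see how long a client has been stuck in this retry loop, the dsmsched.log lines can be scanned for ANS1017E. A rough Python sketch follows; it assumes every log line starts with a 19-character MM/DD/YYYY HH:MM:SS timestamp as in the excerpt above, and the function name `failure_times` is made up for illustration.

```python
from datetime import datetime

STAMP_FORMAT = "%m/%d/%Y %H:%M:%S"  # assumed from the excerpt in this thread


def failure_times(lines, code="ANS1017E"):
    """Return the timestamps of all log lines that contain the given message code."""
    times = []
    for line in lines:
        if code not in line:
            continue
        try:
            times.append(datetime.strptime(line[:19], STAMP_FORMAT))
        except ValueError:
            # Line has no leading timestamp in the assumed format; skip it.
            continue
    return times


sample = [
    "04/03/2009 20:30:17 Querying server for next scheduled event.",
    "04/03/2009 20:30:17 ANS1017E Session rejected: TCP/IP connection failure",
    "04/03/2009 20:30:17 Will attempt to get schedule from server again in 20 minutes.",
]
for stamp in failure_times(sample):
    print(stamp.isoformat())
```

Feeding it the whole file, for example `failure_times(open("dsmsched.log"))`, gives the first and last failure, which can then be compared against the time of the last reboot.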

Hopefully one of you kind people has experienced something similar and can give me a heads up on resolving this issue. Thanks in advance (Ian)
 
Did you try pinging the TSM server from the client server ("TCPSERVERADDRESS 10.135.xx.xxx")?

If you are unable to get a reply to your ping request, then there is a communication problem with the TSM server.

Normally this kind of error occurs due to a communication problem between the client server and the TSM server, i.e. either the TSM server was rebooted or the connection was dropped.

Try the above and let me know.
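
Note that ping only exercises ICMP, so it can succeed while a TCP connect to the server port still fails. A TCP connect attempt from the affected client is a closer match to what the scheduler does. Here is a minimal Python sketch of such a probe; the function name `tcp_probe` is an assumption for illustration, and the demo connects to a throwaway local listener so the block runs anywhere.

```python
import socket


def tcp_probe(host, port, timeout=5.0):
    """Attempt a TCP connect; return (True, None) or (False, the OSError raised)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, None
    except OSError as exc:
        return False, exc


# Demo against a temporary listener on the loopback interface.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]

ok, err = tcp_probe("127.0.0.1", demo_port)
print(ok, err)

listener.close()
```

On an affected client the call would be pointed at the TCPSERVERADDRESS and TCPPORT values from dsm.opt instead of the loopback listener. If the probe fails there while ping works, the returned exception shows the socket error the operating system reported.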
 
Hi backupnrestore ...

I can ping the TSM server both by name and ip addr from the one client that I haven't rebooted yet ...

As I said in my original post, there are other backups taking place at the same time, either already processing or starting from the same schedule, so I guess that rules out a network issue, as they all communicate with the same TSM server over the same LAN.

We have seen this problem over the past 6-8 months or so. We thought it had gone away, as we hadn't seen any for about 2 months, then over the last weekend 4 clients had the issue. I've rebooted 3 of them, which resolved it ...
 
Well then, I have a few questions:

1. Does the scheduled backup complete after a long duration?
2. Can you check when the last successful backup was and how long it took?
3. How many sessions can you see for the server in the TSM monitoring console?
4. What is the size of the data it is trying to back up?
5. Is there sufficient space on the client server?


If all of the above is fine, then go to the C:\dsmsys folder and delete it. When you recycle the TSM services and start a manual backup, the folder will be created again.

This has worked once for me.

You can try it :)

Good luck!
 
Hi backupnrestore - answers to your questions ..

1. Does the scheduled backup complete after a long duration? No longer than normal.
2. Can you check when the last successful backup was and how long it took? The Thu 2nd Apr incremental was successful in under 5 mins (normal for an incremental). It then missed the Fri 3rd Apr incremental and the Sat 4th Apr full. I rebooted the server on Sun 5th Apr and the Sunday full was successful, taking approx 20-25 mins. The Mon 6th Apr incremental took under 5 mins as normal.
3. How many sessions can you see for the server in the TSM monitoring console? For the client in question, it looks like 4 sessions started.
4. What is the size of the data it is trying to back up? Last incremental approx 8.25 MB; weekly full approx 29 GB.
5. Is there sufficient space on the client server? Yes.

Not sure what you mean regarding the C:\dsmsys folder, as I couldn't find one, so I guess our system isn't configured the same way as yours.
 