Greetings admin brothers and sisters,
I've had an issue with one of my clients for a couple of weeks that I'm unable
to correct. This is a Solaris client, version 5.2.2. The server is AIX,
version 5.1.6. The client schedule starts with no problems, and sometimes a
little bit of data is sent to tape, but invariably communication with the
server gets hosed and I start seeing the following messages in the client's
error log:
*****
12/03/04 12:33:22 ANS1005E TCP/IP read error on socket = 5, errno = 131,
reason : 'Connection reset by peer'.
12/03/04 12:33:22 ANS1809W Session is lost; initializing session reopen
procedure.
12/03/04 12:33:38 ANS1810E TSM session has been reestablished.
*****
Here's what I'll see in the server activity log:
*****
12/03/04 14:22:36 ANR0480W Session 1073 for node PRDWEB2 (SUN SOLARIS)
terminated - connection with client severed.
*****
Oddly enough, I can run the backup from the command line with "dsmc i" and it
completes without a problem. I found some good info at adsm.org on this
particular issue, and it looks like some folks were able to resolve similar
problems by reinstalling the client's software ... which I've tried to no
avail. I've talked to IBM, and they were zero help. I've talked with Sun to
verify that the interface on this Solaris box is set up correctly (which I knew
it was already). I've checked the settings on the switch port this client
plugs into. I've even pulled a new fiber cable from the client to the switch.
And the problem persists!
The problem started for no particular reason a couple of weeks ago. It's very
strange, too, because this Solaris server is clustered, and its sister server
(which is set up exactly the same way) has no problems. Everything in the
dsm.sys and dsm.opt files of these two clients is exactly the same.
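In case anyone wants to repeat that comparison on their own nodes, here is a minimal sketch of one way to check two option files while ignoring comments, blank lines, and spacing. It assumes that lines starting with '*' are comments and that option keywords are case-insensitive; the function names are only illustrative and are not part of any TSM tool.

```python
import difflib


def normalize(text):
    """Return option lines with comments, blank lines, and spacing noise removed."""
    lines = []
    for raw in text.splitlines():
        stripped = raw.strip()
        # Assumption: a leading '*' marks a comment line in dsm.sys / dsm.opt.
        if not stripped or stripped.startswith("*"):
            continue
        parts = stripped.split(None, 1)
        # Assumption: keywords are case-insensitive, values are left untouched.
        keyword = parts[0].lower()
        value = parts[1].strip() if len(parts) > 1 else ""
        lines.append(f"{keyword} {value}".rstrip())
    return lines


def diff_options(text_a, text_b, name_a="a", name_b="b"):
    """Return unified-diff lines between two option files; empty list if equivalent."""
    return list(
        difflib.unified_diff(
            normalize(text_a),
            normalize(text_b),
            fromfile=name_a,
            tofile=name_b,
            lineterm="",
        )
    )
```

To use it, read the contents of each client's dsm.sys (or dsm.opt) into a string and pass both to diff_options; an empty result means the two files differ only in comments, spacing, or keyword case.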
Has anyone come across this before?
Sincere thanks,
Chris Hund
Unix/Tivoli Admin
Benesight