ADSM-L

Re: tcp layer problem

2003-01-17 12:26:03
Subject: Re: tcp layer problem
From: Andrew Raibeck <storman AT US.IBM DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Fri, 17 Jan 2003 10:25:19 -0700
Hi Wanda,

Good suggestion re: firewalls   :-)

About the use of FTP: while a failure in FTP would lend support to the
suggestion that this is a network problem, it is a *very* common mistake
to assume that a successful test with FTP exonerates the network. I have
worked on many dozens of these kinds of problems, and it is my experience
that in almost every instance, FTP works just fine; yet ultimately the
problem *is* somewhere in the network.

This doesn't mean that the problem *can't* lie in TSM; but statistically
speaking, the odds are very much against it. This is why I usually
recommend using a sniffer and other network diagnostic tools to see what
is happening at the network layer (especially since the connection is
being broken by the network).

One other thing that is sometimes worth trying, especially if the dsm.opt
and dsm.sys files are heavily customized, is to use a very stripped down
options file. Just use the minimal items necessary to establish a
connection with the server: COMMMETHOD, TCPSERVERADDRESS, TCPPORT,
NODENAME, and PASSWORDACCESS GENERATE (if GENERATE is used, which is
usually the case). Don't specify *any* tuning parameters. Then see if the
problem still occurs.
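
As a rough illustration of the stripped-down options file described above, the Python sketch below renders a minimal dsm.sys server stanza containing only those items. The function name, the SERVERNAME stanza header, the TCPIP value for COMMMETHOD, and all sample values (testsrv, tsm.example.com, 1500, mynode) are assumptions for illustration and are not taken from the original post; substitute the values from your own environment.

```python
def minimal_dsm_sys(server_name, address, port, node_name, password_generate=True):
    """Return the text of a stripped-down dsm.sys server stanza.

    Only the connection essentials are emitted; no tuning parameters.
    All argument values are supplied by the caller (hypothetical here).
    """
    lines = [
        f"SERVERNAME        {server_name}",   # stanza header (assumed Unix dsm.sys layout)
        "   COMMMETHOD        TCPIP",
        f"   TCPSERVERADDRESS  {address}",
        f"   TCPPORT           {port}",
        f"   NODENAME          {node_name}",
    ]
    if password_generate:
        # Include only if the node normally uses PASSWORDACCESS GENERATE.
        lines.append("   PASSWORDACCESS    GENERATE")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Sample values are placeholders, not a real server.
    print(minimal_dsm_sys("testsrv", "tsm.example.com", 1500, "mynode"), end="")
```

Keeping the customized files aside and running one backup against a file like this makes it easier to tell whether a tuning option is contributing to the session drops.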

Regards,

Andy

Andy Raibeck
IBM Software Group
Tivoli Storage Manager Client Development
Internal Notes e-mail: Andrew Raibeck/Tucson/IBM@IBMUS
Internet e-mail: storman AT us.eyebm DOT com (change eye to i to reply)

The only dumb question is the one that goes unasked.
The command line is your friend.
"Good enough" is the enemy of excellence.




"Prather, Wanda" <Wanda.Prather AT JHUAPL DOT EDU>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
01/17/2003 09:37
Please respond to "ADSM: Dist Stor Manager"


        To:     ADSM-L AT VM.MARIST DOT EDU
        cc:
        Subject:        Re: tcp layer problem



If by chance this client is backing up through a firewall, check the
timeout parms on the firewall.  The TSM client tends to send data in
spurts, because a lot of time is spent noodling around in directories
looking for things to back up.  Many times we have seen a firewall cause
this behavior: it closes the session, the client restarts the session, the
firewall closes the session, the client restarts the session, etc.  The
backup will eventually finish, slowly.  You have to increase the timeout
values on the firewall so it will allow the session to idle for a while
without closing it.
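
One way to check for this kind of idle timeout, sketched here in Python under stated assumptions: open a TCP connection through the firewall, stay silent for a while, then see whether the connection still carries data. The function name and parameters are hypothetical, and the sketch assumes you control a simple echo-style service on the far side of the firewall; it is not meant to be pointed at the TSM server port itself.

```python
import socket
import time


def survives_idle(host, port, idle_seconds, payload=b"ping\n", timeout=10.0):
    """Connect, stay silent for idle_seconds, then send payload and wait for a reply.

    Returns True if any reply arrives. Returns False if the connection was
    reset, closed by the peer, or the reply timed out (a firewall that has
    silently dropped its state usually shows up as a timeout).
    A failure to connect in the first place raises OSError to the caller.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        time.sleep(idle_seconds)          # simulate the client "noodling around"
        try:
            sock.sendall(payload)
            return len(sock.recv(1024)) > 0
        except OSError:                   # covers resets and socket timeouts
            return False
```

Running it with increasing idle_seconds values (for example 60, 300, 900) gives a rough idea of where the firewall starts discarding idle sessions.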

If that's not it, my best tool for diagnosing network errors is a plain
FTP.  Take a sizeable file and FTP it to the TSM server several times.  If
that also has problems, it should convince your network people that the
problem is somewhere in your network...
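
The repeated-FTP test can be scripted so that each attempt is timed and failures are recorded. The Python sketch below uses the standard ftplib module; the function names, the default of five repeats, and the host, account, and file path a caller would pass in are all hypothetical placeholders. It assumes an FTP service is running on the target machine.

```python
import ftplib
import os
import time


def ftp_upload(host, user, password, local_path, timeout=60):
    """Upload local_path into the FTP user's default remote directory."""
    with ftplib.FTP(host, timeout=timeout) as ftp:
        ftp.login(user, password)
        with open(local_path, "rb") as fh:
            ftp.storbinary("STOR " + os.path.basename(local_path), fh)


def repeat_transfer(transfer, repeats=5):
    """Call transfer() repeats times; return a list of (ok, seconds, error) tuples."""
    results = []
    for _ in range(repeats):
        start = time.monotonic()
        try:
            transfer()
            results.append((True, time.monotonic() - start, None))
        except ftplib.all_errors as exc:  # FTP protocol errors, OSError, EOFError
            results.append((False, time.monotonic() - start, repr(exc)))
    return results
```

A caller would pass something like `lambda: ftp_upload(host, user, password, path)` as the transfer argument. Failed or wildly uneven attempts point toward the network; a clean run, on the other hand, does not by itself clear it.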

-----Original Message-----
From: Conko, Steven [mailto:sconko AT ADT DOT COM]
Sent: Wednesday, January 15, 2003 2:39 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: tcp layer problem


i've been going round and round with tsm support on this issue... we have
an AIX 4.3.3 client that has been upgraded several times, all the way to
client version 5.1.5, without success backing up to an AIX 4.3.3 tsm 4.2.2
server over a 10/100 ethernet network (100 Full Duplex, no autonegotiate).

there aren't any network errors and the switches are all configured
correctly, and the tsm server is fine. they are beginning to insist our
problem is system/network level...

error:

ANS1809W Session is lost; initializing session reopen procedure.

appears almost constantly during backups. usually it will EVENTUALLY
finish with a few errors but take forever to run. all our parameters appear
to be okay by tsm support standards. given the combination of this error and
the server message saying the session was terminated, they say the client is
severing the connection... that tsm is getting the message from a lower
layer.

we don't have any network errors appearing anywhere else on the system for
any other applications, there are no errors in errpt or /var/adm/messages
and diags come back fine.

what else can i do to diagnose this problem?
