ADSM-L

Re: ADSM Connect Agent for MS Exchange - strange tcp/ip problems

1999-05-03 13:11:26
From: Nathan King <nathan.king AT USAA DOT COM>
Date: Mon, 3 May 1999 12:11:26 -0500
Thanks Del,

Sorry for not being clear.
On the client side we do see a TCP/IP connection failure message from the
Connect Agent.

What happens is that the client goes into a prolonged recvw state. The
activity log then reports that the client has not responded within 900
seconds, and the server terminates the session. At this point the Connect
Agent gives the TCP/IP connection failure message in its log file.

Your suggestion to lower the buffers occurred to me, but I only tried what I
thought was the minimum, i.e. buffers 2,64. I'll give your suggestion a try
to see what happens.

I agree that a network problem seems likely to be the case here. Thanks for
your input.

Nathan

        -----Original Message-----
        From:   Del Hoobler [SMTP:hoobler AT US.IBM DOT COM]
        Sent:   Monday, May 03, 1999 12:07 PM
        To:     ADSM-L AT VM.MARIST DOT EDU
        Subject:        Re: ADSM Connect Agent for MS Exchange - strange tcp/ip problems

        Nathan,

        Just a few of my ideas before calling IBM service...

        You don't really say what is happening on the Exchange
        Agent "client" side.  Is it getting communication failures?
        Is it "hanging?"  What errors are you seeing on this side?

        ...this sounds too much like a network problem...
        The Exchange Agent is a little bit different than the BA client
        in that it uses multiple threads and multiple buffers to keep the
        network pipe full of data (one thread is filling the
        buffers, another thread is sending the data to the ADSM server).
        One thing to try is to use the "/BUFFERS:1" option
        on the Exchange Agent backup command.  For example:

           EXCDSMC /BACKUP:IS,FULL /BUFFERS:1

        to "slow down" the speed at which the data is being
        sent to the pipe.  Obviously, this is not what you
        want long term, but it may help during diagnosis.
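
The fill/send arrangement described above can be pictured with a small
sketch. This is only an illustration of the general bounded-buffer idea, not
the Exchange Agent's actual code; the names run_pipeline and num_buffers are
hypothetical, and appending to a list stands in for the network send.

```python
import queue
import threading

def run_pipeline(chunks, num_buffers):
    """Pass chunks from a filler thread to a sender thread through at most
    num_buffers queued buffers (hypothetical stand-in for /BUFFERS)."""
    buffers = queue.Queue(maxsize=num_buffers)  # bounded: filler blocks when full
    sent = []
    done = object()  # sentinel marking the end of the data

    def filler():
        for chunk in chunks:
            buffers.put(chunk)  # blocks while every buffer is in use
        buffers.put(done)

    def sender():
        while True:
            chunk = buffers.get()  # blocks until a buffer has been filled
            if chunk is done:
                break
            sent.append(chunk)  # stand-in for sending to the server

    threads = [threading.Thread(target=filler), threading.Thread(target=sender)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sent
```

With num_buffers set to 1 the filler can never get more than one buffer
ahead of the sender, which is the "slow down" effect mentioned above.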

        Thanks,

        Del

        ----------------------------------------------------

        Del Hoobler
        IBM ADSM Agent Development
        hoobler AT us.ibm DOT com

        > We are experiencing some very strange problems which only seem to
        > surface with the ADSM Connect Agent for MS Exchange.
        >
        > We have three ADSM Servers.
        >
        > Two of them are RS/6000 S7A's running AIX level 4.3.2 and ADSM
        > 3.1.2.20
        >
        > The other ADSM Server is an R40 running AIX level 4.2.1 and ADSM
        > 3.1.15
        >
        > We are using the ADSM Connect Agent for MS Exchange 1.1.0.0A
        > (Fixtest A)
        >
        > All of the servers have ATM cards using ATM LAN emulation
        > software.
        >
        > We are experiencing a lot of problems with TCP/IP failures with
        > full backups of the Exchange Servers using the ADSM Connect Agent.
        >
        > On the two S7A's, when we fire up about 3 or more full backup
        > sessions concurrently, the Connect Agent sessions go into prolonged
        > Recvw states until eventually the sessions time out and the backups
        > fail. This is typically what happens. No errors are seen on the
        > network or on the RS/6000.
        >
        > Occasionally we see a scenario where the AIX server becomes
        > 'detached' from the network: it cannot be pinged, and we have
        > observed ATM errors in the AIX error report on the AIX Servers.
        > ADSM itself does not go down; it just times out the sessions.
        > Eventually the AIX server recovers from the ATM errors and normal
        > operation continues.
        > In this scenario errors have been observed on the ATM switch,
        > however it would appear that it is the RS/6000 which is letting go
        > of the TCP/IP connection.
        >
        > We have checked our network for problems and have come up with
        > nothing. Furthermore, this problem has only been observed with the
        > Connect Agent itself; we have not been able to replicate this
        > problem with standard BA client sessions. We have had almost 50 BA
        > client sessions running concurrently without TCP/IP drops or errors.
        >
        > What I have also noticed is that I can get a few more Connect
        > Agent sessions to run provided they are started up a few minutes
        > apart, e.g. I start up a session for one Connect Agent backup, then
        > start another a minute after, another a minute later, etc.
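
The staggered start-up described above could be scripted along these lines.
This is a rough sketch under assumptions: the helper name start_staggered and
its parameters are hypothetical, and it presumes all commands can be launched
from one machine, which may not match a setup where each Exchange Server
starts its own backup.

```python
import subprocess
import time

def start_staggered(commands, delay_seconds,
                    sleep=time.sleep, launch=subprocess.Popen):
    """Launch each command in turn, pausing delay_seconds between launches.

    sleep and launch are injectable so the spacing logic can be tried
    without really waiting or starting processes.
    """
    procs = []
    for i, cmd in enumerate(commands):
        if i > 0:
            sleep(delay_seconds)  # wait before every launch except the first
        procs.append(launch(cmd))
    return procs
```

For example, start_staggered([["EXCDSMC", "/BACKUP:IS,FULL"]] * 3, 60) would
start three runs of the backup command quoted earlier in this thread one
minute apart; whether that spacing is enough is something only testing on
the affected servers can show.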
        >
        > On the RS/6000 R40 I see something completely different. I cannot
        > get even one Connect Agent backup to complete successfully. I have
        > also observed a 'phantom' session appear on the ADSM Server during
        > the backup where it states that a node has received 4 bytes of
        > data; the nodename is blank.
        > Again no ATM errors were found on the AIX server or on the ATM
        > switch.
        >
        > I'm particularly confused as to why we only see this problem with
        > the ADSM Connect Agent. We don't have any such problems with BA
        > Client sessions.
        >
        > Anyone any ideas?
        >
        > Nathan