ADSM-L
Subject: Re: TCP Errors
From: "Richard C. Dempsey" <dempsey AT KODAK DOT COM>
Date: Fri, 26 Mar 1999 10:25:13 -0500
Server is Solaris 2.6, ADSM 3.1.2.13, TCPWindowsize 512K
Client is Solaris 2.6, ADSM 3.1.0.6, TCPWindowsize is the default (32K)

I have since chatted with our network switch guru.  The client HME
ethernet interface card was forced to 100 Mb/s full duplex, but the
switch had autonegotiated its interface to 100/half.  We also observed
on the counters many, many collisions and multi-collisions (which
should not be happening on a switch).  We have forced the switch i/f
to match the host at 100/full.  In other cases, I have observed that
this dramatically improved network (and backup) throughput.
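
For anyone who has not run into this before, the mismatch can be sketched
in a few lines of Python. This is only an illustrative model, not anything
the HME driver or the switch actually runs: Port, resolve and
duplex_mismatch are hypothetical names, and the case where both ends
autonegotiate is simplified to assume both advertise 100/full. The one rule
it encodes is that an autonegotiating port facing a forced peer can still
detect the speed, but falls back to half duplex.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    """Hypothetical model of one end of an Ethernet link."""
    autoneg: bool
    speed: int = 100          # Mb/s, only meaningful when forced
    full_duplex: bool = True  # only meaningful when forced


def resolve(local: Port, peer: Port) -> tuple[int, bool]:
    """Return (speed, full_duplex) that `local` ends up using."""
    if not local.autoneg:
        # A forced port uses exactly what it was told to use.
        return local.speed, local.full_duplex
    if peer.autoneg:
        # Simplifying assumption: both ends advertise 100/full.
        return 100, True
    # Peer is forced, so it sends no negotiation information: the speed
    # can be sensed from the signal, but duplex defaults to half.
    return peer.speed, False


def duplex_mismatch(a: Port, b: Port) -> bool:
    """True when the two ends settle on different duplex modes."""
    return resolve(a, b)[1] != resolve(b, a)[1]


host = Port(autoneg=False, speed=100, full_duplex=True)
switch_before = Port(autoneg=True)
switch_after = Port(autoneg=False, speed=100, full_duplex=True)

print(duplex_mismatch(host, switch_before))
print(duplex_mismatch(host, switch_after))
```

In this model the first call reports a mismatch (the switch lands on
100/half against a forced 100/full host) and the second does not, which
matches what we saw before and after forcing the switch port.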

OTOH, I suppose it would not be a bad idea to have the client and
the server use the same values for TCPWindowsize, eh?
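
A rough way to see why the window size matters: a single TCP connection
can have at most one window of data in flight per round trip, so the
window divided by the round-trip time bounds the throughput. The sketch
below works that out for the two values above. The 5 ms round trip is a
made-up figure for illustration (measure your own), K is assumed to mean
1024 bytes, and window_limited_mbps is a hypothetical helper name.

```python
def window_limited_mbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on one TCP connection: one window per round trip."""
    return window_bytes * 8 / rtt_seconds / 1_000_000


LINK_MBPS = 100   # link speed from this thread
RTT = 0.005       # hypothetical 5 ms round trip, not a measured value

for label, kib in (("client default", 32), ("server", 512)):
    bound = window_limited_mbps(kib * 1024, RTT)
    # The link itself caps whatever the window would otherwise allow.
    best = min(bound, LINK_MBPS)
    print(f"{label}: {kib}K window -> at most {best:.1f} Mb/s")
```

With those assumptions the 32K window tops out at roughly half of a
100 Mb/s link, while the 512K window is no longer the limiting factor.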

Rich

At 09:22 AM 3/26/99 -0500, you wrote:
>What is your TCP window size set to on the server and the clients? What
>clients are you trying to back up? NT? AIX?
>
>
>
>
>"Richard C. Dempsey" <dempsey AT KODAK DOT COM> on 03/26/99 08:45:08 AM
>
>Please respond to "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
>
>To:   ADSM-L AT VM.MARIST DOT EDU
>cc:    (bcc: Jim Hunt/IS/Vencor)
>Subject:  Re: TCP Errors
>
>
>
>
>Increasing COMMTimeout from 300 seconds to 2000 seconds allowed the backup
>to complete last night.  However, about 5 sessions were lost and had to be
>re-established to pull it off.  I believe that there is a network problem
>here, and I will be taking it up with our network switching guru when he
>gets in.
>
>Thanks for your help,
>Rich
>
>>Date: Thu, 25 Mar 1999 11:26:40 -0500
>>To: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
>>From: "Richard C. Dempsey" <dempsey AT kodak DOT com>
>>Subject: Re: TCP Errors
>>
>>PS: Out of band conversation with Richard Sims has pointed to the server
>>error message "ANR0481W Session 30 for node ALTS1 (SUN SOLARIS) terminated
>>- client did not respond within 300 seconds." as indicating that
>>COMMTimeout is set way too low.  We will test COMMT 2000 tonight.
>>
>>>Recently, a BigIP redirector was interposed between our ADSM server
>>>(3.1.2.13, Solaris 2.6) and a client (3.1.0.6, Solaris 2.6).  Since
>>>then, the client has not had a successful scheduled incremental backup.
>>>The last line in the SchedLog is typically something like:
>>>
>>>03/24/99        20:58:50 Incremental backup of volume '/export/data3'
>>>
>>>where '/export/data3' will vary, but, of course, is always a mount
>>>point (filespace).  Both dsmc and dsmadmc seem to run fine interactively
>>>from this client, but I have not tried an interactive incremental.
>>>Router ACLs prevent me from running the GUI. (I'm just a CLI kind of
>>>guy :).  I interpret this as showing that communications on port 1500
>>>between the server and the client are unimpeded by BigIP.
>>>
>>>I have included the end of ErrorLog below.  In /usr/include/sys/errno.h,
>>>we learn that error 32 is 'Broken Pipe'.  Does anybody have any ideas
>>>what could be causing this?
>>>
>>>Thanks,
>>>Rich
>>>
>>>03/24/99        15:31:21 TcpRead(): recv(): errno =     131
>>>03/24/99        15:31:21 sessRecvVerb: Error    -50 from call to 'readRtn'.
>>>03/24/99        20:00:42 TcpFlush: Error        32 sending data on Tcp/Ip socket        4.
>>>03/24/99        20:00:42 sessSendVerb: Error sending Verb, rc:  -50
>>>03/24/99        20:00:42 TcpFlush: Error        32 sending data on Tcp/Ip socket        4.
>>>03/24/99        20:00:42 ANS1809E Session is lost; initializing session reopen procedure.
>>>03/24/99        20:00:42 ANS1809E Session is lost; initializing session reopen procedure.
>>>03/24/99        20:00:57 ANS1810E ADSM session has been reestablished.
>>>03/24/99        20:07:58 TcpFlush: Error        32 sending data on Tcp/Ip socket        4.
>>>03/24/99        20:07:58 sessSendVerb: Error sending Verb, rc:  -50
>>>03/24/99        20:07:58 TcpFlush: Error        32 sending data on Tcp/Ip socket        4.
>>>03/24/99        20:07:58 ANS1809E Session is lost; initializing session reopen procedure.
>>>03/24/99        20:07:58 ANS1809E Session is lost; initializing session reopen procedure.
>>>03/24/99        20:08:13 ANS1810E ADSM session has been reestablished.
>>>03/24/99        20:41:49 TcpFlush: Error        32 sending data on Tcp/Ip socket        4.
>>>03/24/99        20:41:49 sessSendVerb: Error sending Verb, rc:  -50
>>>03/24/99        20:41:49 TcpFlush: Error        32 sending data on Tcp/Ip socket        4.
>>>03/24/99        20:41:49 ANS1809E Session is lost; initializing session reopen procedure.
>>>03/24/99        20:41:49 ANS1809E Session is lost; initializing session reopen procedure.
>>>03/24/99        20:42:04 ANS1810E ADSM session has been reestablished.
>>>[end of ErrorLog]
>>>
>
>
>Richard C. Dempsey                 email: dempsey AT kodak DOT com
>Public Online Services             pager: 716-975-3539
>11th Floor, Bldg 83, RL            phone: 716-477-3457
>Eastman Kodak Company
>Rochester, NY 14650-2203
>
>
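
The ErrorLog quoted above reports two errno values, 131 and 32. A small
lookup like the following can turn them into names. It is only a sketch:
the table is hand-written because errno numbering differs between
platforms, so the numbers must come from the Solaris errno.h rather than
from wherever the script runs. 32 = EPIPE is confirmed in the original
post; 131 = ECONNRESET is taken from the Solaris numbering as an
assumption and should be checked against /usr/include/sys/errno.h on the
host. SOLARIS_ERRNO and describe are hypothetical names.

```python
import errno
import os

# Solaris errno numbers seen in the ErrorLog. 32 is confirmed by the
# original post; 131 = ECONNRESET is an assumption to verify on the host
# (other platforms assign 131 a different meaning).
SOLARIS_ERRNO = {32: "EPIPE", 131: "ECONNRESET"}


def describe(code: int) -> str:
    """Map a Solaris errno number to its symbolic name and local message."""
    name = SOLARIS_ERRNO.get(code)
    if name is None:
        return f"{code}: not in table"
    # Look the message up by symbolic name, not by number, so the text
    # is right even when the local platform numbers errors differently.
    return f"{code}: {name} ({os.strerror(getattr(errno, name))})"


for code in (131, 32):
    print(describe(code))
```

If that reading of 131 is right, both errors say the same thing from
different directions: the connection was torn down underneath the client,
once while reading and then repeatedly while sending.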

Richard C. Dempsey                 email: dempsey AT kodak DOT com
Public Online Services             pager: 716-975-3539
11th Floor, Bldg 83, RL            phone: 716-477-3457
Eastman Kodak Company
Rochester, NY 14650-2203