ADSM-L

Re: Idletimeout for no reason

1997-02-25 20:54:01
Subject: Re: Idletimeout for no reason
From: "Dwight E. Cook" <decook AT AMOCO DOT COM>
Date: Tue, 25 Feb 1997 19:54:01 -0600
Item Subject: Idletimeout for no reason
     There could be THOUSANDS of reasons...

     1) check your network routes... use "ping -R" under AIX to show the
     route the ping is taking...  TCP/IP under AIX (and in some other
     situations) will allow a node with multiple FDDI adapters connected to
     multiple subnets to act as a router if the "proper route" is
     unavailable or slow... and I'm not sure how/when it will reset/free
     this "unstandard" route (if ever...)

     2) ADSM's TCPNODELAY (tho' I doubt it's the case with a 20 min timeout)
     I was testing with my desktop machine and kept getting one of the
     timeout/session-lost errors, turned on TCPNODELAY and it cured it...
     which told me I just had some other parameters NOT RIGHT but hey I was
     no longer dropping...

     3) DO THE ACCOUNTING RECORDS SHOW 0 (for no good)?  HEY ANS4017E is
     session rejected! not idle timeout! NO FAIR !

     *** I'll bet you are blowing away the end of your buffer on your
     client ! ! I'm talking MTU here... SET THE MTU below your clients'...
     DID YOU CHANGE YOUR "TXNGROUPMAX" or "TXNBYTELIMIT" lately ?
     Change it back....!
     Some of the software (not adsm, talk'n bout 3rd party cheap junk)
     seems to have been written by people who couldn't read... or test
     their software properly (yes, I'm expecting flames on that statement)
     BUT say you set your fddi MTU to 4096, seems some people see that as
     setting a fddi buffer to 4096, THEY FORGOT HEADERS & TRAILERS ! 8
     bytes or something like that... (each) so you end up (with large
     txngroups or large txnbytes) sending 8+4096+8, you just tossed 8 bytes
     of data and your trailer into gumby land.  NOW your software, on
     whichever end, is waiting, waiting, waiting, for the end of that group
     of data.......... it ain't gonna get there, sigh...
     EXAMPLE: moved a box from ADSM server to MVS server, MVS was using
     4352 MTU on the OSA... the client was using 4202... saw the same
     thing... Choked back the OSA to 4096 and everyone was happy again.
     I'll bet if you put a sniffer on your network you'll see that the
     backups go fine while little files (& bundles less than MTU) are
     running and then you'll see a bundle of MTU+header+trailer go by and
     nothing more will happen...
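
     A rough sketch of that arithmetic in Python. The 8-byte header and
     trailer sizes are only the guess made above, and the function names
     are made up for illustration; this is not how ADSM or any FDDI driver
     actually accounts for framing.

```python
# Hypothetical illustration of the overflow described above.
# HEADER_BYTES and TRAILER_BYTES are the post's rough guess, not measured values.
HEADER_BYTES = 8
TRAILER_BYTES = 8


def bytes_lost(payload_bytes, limit_bytes,
               header=HEADER_BYTES, trailer=TRAILER_BYTES):
    """Bytes of a header+payload+trailer unit that fall past the limit."""
    total = header + payload_bytes + trailer
    return max(0, total - limit_bytes)


def safe_payload(limit_bytes, header=HEADER_BYTES, trailer=TRAILER_BYTES):
    """Largest payload that still leaves room for header and trailer."""
    return limit_bytes - header - trailer


def effective_mtu(*mtus):
    """The smallest MTU among the endpoints is the one both ends can handle."""
    return min(mtus)


if __name__ == "__main__":
    # 8 + 4096 + 8 against a 4096 limit: 8 bytes of data plus the trailer are lost.
    print(bytes_lost(4096, 4096))
    # A small bundle fits with room to spare.
    print(bytes_lost(1000, 4096))
    # Payload that would have fit under the same assumptions.
    print(safe_payload(4096))
    # The OSA at 4352 versus the client at 4202 from the example above.
    print(effective_mtu(4352, 4202))
```

     Under these assumptions a full 4096-byte payload overshoots by 16
     bytes, while anything up to 4080 bytes would still fit.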

     BACK TO IDLE TIMEOUTS though... Under NOVELL our schedules initiate a
     NIGHTLY.NCF that does multiple LOAD DSMC INC blah, one per disk... you
     can set ADSM to unload when finished or keep the window open... if you
     are keeping the window open you will see an IDLE TIMEOUT but all is
     well... kind'a like not hanging the phone up after a call... if you
     just lay it down, in X amount of time you will hear the beep beep beep
     "If you would like to make a call please hang up and dial again...."
     Your earlier conversation ended just fine, you just forgot to
     hang up...
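
     The difference between the two kinds of idle timeout can be sketched
     like this in Python. It is a toy model only: the function name and the
     return strings are invented for illustration, and the 20-minute value
     is the IDLETIMEOUT quoted in the original question, not a default.

```python
# Toy model of the two idle-timeout cases; not ADSM's actual logic.
IDLETIMEOUT_MINUTES = 20  # value taken from the original question


def session_state(minutes_idle, work_finished,
                  idle_timeout=IDLETIMEOUT_MINUTES):
    """Classify a session by idle time and whether its work had completed."""
    if minutes_idle < idle_timeout:
        return "active"
    if work_finished:
        # The window was left open after a clean finish: the "phone off the hook" case.
        return "idle timeout, harmless"
    # The session stalled mid-transfer, as in the backups and restore in question.
    return "idle timeout, investigate"


if __name__ == "__main__":
    print(session_state(5, work_finished=False))
    print(session_state(25, work_finished=True))
    print(session_state(25, work_finished=False))
```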

     Hey, I know some of my posts get a little silly but this is hour 13
     for me...


______________________________ Reply Separator _________________________________
Subject: Idletimeout for no reason
Author:  ADSM-L at unix,sh/DD.RFC-822=ADSM-L AT VM.MARIST DOT EDU
Date:    2/25/97 4:24 PM


I am wondering if anyone else has seen this behaviour.  On a small
number of clients, we are seeing message ANS4017E on the client caused
by the IDLETIMEOUT (20 minutes) being reached on the server (MVS level
10) during backup and, in two cases, on a restore operation.  One AIX
client is failing backup consistently every night.  This client is not
doing anything else at the time, and does not have an excessive number
of directories or excessively large directories.

One OS/2 client is failing the same way although after the third retry,
the backup succeeds.

A NetWare server trying to restore a directory tree containing about 200M
of files runs for several minutes, creates the directory structure, and
restores several files before apparently stopping in the middle of
restoring a file.  Q SESS shows wait time just building up until the
IDLETIMEOUT hits.  No tape mounts or messages outstanding.
Any suggestions on what to look for here would be greatly appreciated.

Thanks
Sam Sheppard
San Diego Data Processing Corp.
shs AT sddpc.sannet DOT gov