ADSM-L

Re: TSM Scheduler not contacting clients

2003-10-06 12:51:56
Subject: Re: TSM Scheduler not contacting clients
From: Dave Canan <ddcanan AT ATTGLOBAL DOT NET>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Mon, 6 Oct 2003 09:50:53 -0700
Mark, I'd like you to do the following to see if we can fix the problem you're having:

1. Apply the fix for APAR IY46228 (fileset bos.net.tcp.client 5.1.0.54).

2. I would also make the following changes to your network options:

        no -o thewall=1058576
        no -o rfc1323=1
        no -o tcp_sendspace=65536
        no -o tcp_recvspace=65536
        no -o tcp_mssdflt=1448
        no -o tcp_nodelayack=1
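
A minimal sketch of how the settings listed above could be compared with a saved copy of the "no -a" output before anything is changed. The function name pending_no_changes and the choice to print the commands rather than run them are assumptions made for illustration; the values themselves are copied from the list above.

```sh
#!/bin/sh
# Recommended values, copied from the list above (space separated).
recommended='thewall=1058576 rfc1323=1 tcp_sendspace=65536 tcp_recvspace=65536 tcp_mssdflt=1448 tcp_nodelayack=1'

# pending_no_changes (hypothetical helper): read `no -a` output on stdin
# and print one `no -o name=value` line for every recommended setting
# that is not already in effect. It only prints; it never runs `no`.
pending_no_changes() {
    awk -v rec="$recommended" '
        BEGIN {
            n = split(rec, pairs, " ")
            for (i = 1; i <= n; i++) {
                split(pairs[i], kv, "=")
                order[i] = kv[1]      # remember the original ordering
                want[kv[1]] = kv[2]
            }
        }
        # `no -a` lines look like "   tcp_sendspace = 16384"
        $2 == "=" && ($1 in want) { have[$1] = $3 }
        END {
            for (i = 1; i <= n; i++) {
                name = order[i]
                if (have[name] != want[name])
                    printf "no -o %s=%s\n", name, want[name]
            }
        }
    '
}

# Possible usage on the server:
#   no -a | pending_no_changes
```

Printing the commands instead of executing them leaves the decision about when to apply each change with the administrator.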

Please let me know if this works.

At 06:15 PM 10/6/2003 +0800, you wrote:

devices.pci.bla output:

  devices.pci.23100020.diag  5.1.0.35  C  F  IBM PCI 10/100 Mb Ethernet Adapter (23100020) Diagnostics


no -a output from the server.  Most of our clients are Solaris and NT.

hkgiicorep1:/root# goto hkgintsmsp1 no -a
         extendednetstats = 0
                  thewall = 524264
               sockthresh = 85
                   sb_max = 1048576
                somaxconn = 1024
      clean_partial_conns = 0
        net_malloc_police = 0
                  rto_low = 1
                 rto_high = 64
                rto_limit = 7
               rto_length = 13
          inet_stack_size = 16
              arptab_bsiz = 7
                arptab_nb = 25
               tcp_ndebug = 100
                   ifsize = 8
                 arpqsize = 12
                 ndpqsize = 50
             route_expire = 1
       send_file_duration = 300
                 fasttimo = 200
          routerevalidate = 0
         dgd_packets_lost = 3
           dgd_retry_time = 5
            dgd_ping_time = 5
              passive_dgd = 0
                  sodebug = 0
                nbc_limit = 393192
            nbc_max_cache = 131072
            nbc_min_cache = 1
                 nbc_pseg = 0
           nbc_pseg_limit = 524264
                 strmsgsz = 0
                 strctlsz = 1024
                 nstrpush = 8
                strthresh = 85
                psetimers = 20
              psebufcalls = 20
               strturncnt = 15
             pseintrstack = 12288
                lowthresh = 90
                medthresh = 95
                 psecache = 1
          subnetsarelocal = 1
                   maxttl = 255
                ipfragttl = 60
          ipsendredirects = 1
             ipforwarding = 0
                  udp_ttl = 30
                  tcp_ttl = 60
               arpt_killc = 20
            tcp_sendspace = 16384
            tcp_recvspace = 16384
            udp_sendspace = 9216
            udp_recvspace = 42080
       tcp_bad_port_limit = 0
       udp_bad_port_limit = 0
           rfc1122addrchk = 0
           nonlocsrcroute = 0
            tcp_keepintvl = 150
             tcp_keepidle = 14400
                bcastping = 0
                 udpcksum = 1
              tcp_mssdflt = 512
          icmpaddressmask = 0
             tcp_keepinit = 150
ie5_old_multicast_mapping = 0
                  rfc1323 = 0
         pmtu_default_age = 10
 pmtu_rediscover_interval = 30
        udp_pmtu_discover = 0
        tcp_pmtu_discover = 0
                ipqmaxlen = 100
       directed_broadcast = 0
        ipignoreredirects = 0
           ipsrcroutesend = 1
           ipsrcrouterecv = 0
        ipsrcrouteforward = 1
       ip6srcrouteforward = 1
               ip6_defttl = 64
                ndpt_keep = 120
           ndpt_reachable = 30
             ndpt_retrans = 1
               ndpt_probe = 5
                ndpt_down = 3
            ndp_umaxtries = 3
            ndp_mmaxtries = 3
                ip6_prune = 2
            ip6forwarding = 0
              multi_homed = 1
                 main_if6 = 0
               main_site6 = 0
              site6_index = 0
                 maxnip6q = 20
          llsleep_timeout = 3
             tcp_timewait = 1
        tcp_ephemeral_low = 32768
       tcp_ephemeral_high = 65535
        udp_ephemeral_low = 32768
       udp_ephemeral_high = 65535
                 delayack = 0
            delayackports = {}
                     sack = 0
                 use_isno = 1
              tcp_newreno = 1
          tcp_nagle_limit = 65535
                  rfc2414 = 0
          tcp_init_window = 0
                  tcp_ecn = 0
     tcp_limited_transmit = 1
        icmp6_errmsg_rate = 10
             tcp_maxburst = 0
           tcp_nodelayack = 0
             tcp_finwait2 = 1200
hkgiicorep1:/root#


--
Mark Ferraretto
Unix Systems Administrator
Deutsche Bank Hong Kong
w: +852 2203 6362        m: +852 9558 8032        f: +852 2203 6971
mark.ferraretto AT db DOT com




ddcanan AT ATTGLOBAL DOT NET
Sent by: ADSM-L AT VM.MARIST DOT EDU
04/10/2003 02:32
Please respond to ADSM-L

        To:      ADSM-L AT VM.MARIST DOT EDU
        cc:
        Subject: Re: TSM Scheduler not contacting clients

Could you post the output from a "no -a" command from both your
client and server? I am particularly interested in the value of the
tcp_pmtu_discover parameter. Also, I'd like to know the level of your
PCI driver; the fileset is named devices.pci.23100020.rte.
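
As a rough sketch of the request above, a single parameter can be pulled out of saved "no -a" output like this. The helper name no_value is an assumption for illustration; the fileset level itself would come from a separate "lslpp -l" query, shown only as a comment.

```sh
#!/bin/sh
# no_value NAME (hypothetical helper): read `no -a` output on stdin and
# print the value of parameter NAME. The exit status is non-zero when
# NAME does not appear in the input.
no_value() {
    awk -v name="$1" '
        $1 == name && $2 == "=" { print $3; found = 1; exit }
        END { exit (found ? 0 : 1) }
    '
}

# Possible usage on each machine:
#   no -a | no_value tcp_pmtu_discover
#   lslpp -l devices.pci.23100020.rte    # shows the installed fileset level
```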


At 06:45 PM 10/3/2003 +0800, you wrote:
>My netstat shows no errors.  I'm not sure if it's this particular problem.
>
>1:root@hkgintsmsp1:/home/root # netstat -v ent0
>-------------------------------------------------------------
>ETHERNET STATISTICS (ent0) :
>Device Type: IBM 10/100/1000 Base-T Ethernet PCI Adapter (14100401)
>Hardware Address: 00:06:29:6b:5d:30
>Elapsed Time: 19 days 21 hours 1 minutes 59 seconds
>
>Transmit Statistics:                          Receive Statistics:
>--------------------                          -------------------
>Packets: 968722735                            Packets: 2738698376
>Bytes: 122708764323                           Bytes: 3883111224538
>Interrupts: 5777133                           Interrupts: 973619783
>Transmit Errors: 0                            Receive Errors: 0
>Packets Dropped: 0                            Packets Dropped: 0
>                                               Bad Packets: 0
>Max Packets on S/W Transmit Queue: 126
>S/W Transmit Queue Overflow: 0
>Current S/W+H/W Transmit Queue Length: 0
>
>Broadcast Packets: 12232                      Broadcast Packets: 6240805
>Multicast Packets: 2                          Multicast Packets: 2
>No Carrier Sense: 0                           CRC Errors: 0
>DMA Underrun: 0                               DMA Overrun: 0
>Lost CTS Errors: 0                            Alignment Errors: 0
>Max Collision Errors: 0                       No Resource Errors: 194
>Late Collision Errors: 0                      Receive Collision Errors: 0
>Deferred: 0                                   Packet Too Short Errors: 0
>SQE Test: 0                                   Packet Too Long Errors: 0
>Timeout Errors: 0                             Packets Discarded by Adapter: 0
>Single Collision Count: 0                     Receiver Start Count: 0
>Multiple Collision Count: 0
>Current HW Transmit Queue Length: 0
>
>General Statistics:
>-------------------
>No mbuf Errors: 0
>Adapter Reset Count: 0
>Adapter Data Rate: 2000
>Driver Flags: Up Broadcast Running
>         Simplex AlternateAddress 64BitSupport
>         PrivateSegment DataRateSet
>
>Adapter Specific Statistics:
>----------------------------
>Additional Driver Flags: Autonegotiate
>Entries to transmit timeout routine: 0
>Firmware Level: 13.0.5
>Transmit and Receive Flow Control Status: Disabled
>Link Status: Up
>Media Speed Selected: Autonegotiation
>Media Speed Running: 1000 Mbps Full Duplex
>Packets with Transmit collisions:
>  1 collisions: 0           6 collisions: 0          11 collisions: 0
>  2 collisions: 0           7 collisions: 0          12 collisions: 0
>  3 collisions: 0           8 collisions: 0          13 collisions: 0
>  4 collisions: 0           9 collisions: 0          14 collisions: 0
>  5 collisions: 0          10 collisions: 0          15 collisions: 0
>
>--
>Mark Ferraretto
>Unix Systems Administrator
>Deutsche Bank Hong Kong
>w: +852 2203 6362        m: +852 9558 8032        f: +852 2203 6971
>mark.ferraretto AT db DOT com
>
>deehre01 AT LOUISVILLE DOT EDU
>Sent by: ADSM-L AT VM.MARIST DOT EDU
>10/02/2003 12:45 PM AST
>Please respond to ADSM-L
>
>       To:   ADSM-L AT VM.MARIST DOT EDU
>       cc:
>       bcc:
>       Subject:    Re: TSM Scheduler not contacting clients
>
>There was a major bug in AIX 5.1 64-bit mode TCP that caused major
>slowdowns in backups.  It was APAR IY36925, fixed in bos.net.tcp.client
>5.1.0.37.  The receive buffer problem occurred with that fix already in
>place.
>
>David Ehresman
>
> >>> r.post AT SARA DOT NL 10/2/2003 12:27:02 PM >>>
>On Thu, 2 Oct 2003 09:56:38 -0400
>David E Ehresman <deehre01 AT LOUISVILLE DOT EDU> wrote:
>
>Now this makes some bells ring over here. IIRC there was some bug in the
>AIX 5.1 TCP code, which in particular would show on TSM servers. I don't
>know what it was any more, nor what the fix is (it was just small talk
>with some engineer once). You might want to check the APAR DB....
>
>
> > I had that problem and it was an AIX TCP tuning issue.  Do a
> >    netstat -v ent1 | grep "Receive Pool Buffer"
> > where ent1 is the adapter that your TSM traffic runs on.  If your
> > "No Receive Pool Buffer Errors:" line is greater than zero, you have
> > the same problem I had.  We raised our receive pool buffer size to
> > 2048 and that fixed our problem.
> >
> > The Ethernet adapter has to be down to change the setting.  The
> > command is:
> >   chdev -l ent"x" -a rxbuf_pool_size=2048
> > where x is the adapter number.
> >
> > David Ehresman
> > University of Louisville
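
The check described above could be scripted roughly as follows. This is only a sketch: the function names rx_pool_errors and needs_bigger_pool are made up for illustration, and the exact layout of the "No Receive Pool Buffer Errors:" line is assumed from the description rather than taken from real adapter output.

```sh
#!/bin/sh
# rx_pool_errors (hypothetical helper): read `netstat -v entN` output on
# stdin and print the first "No Receive Pool Buffer Errors" counter.
# Prints nothing when the adapter does not report that counter.
rx_pool_errors() {
    sed -n 's/.*No Receive Pool Buffer Errors:[[:space:]]*\([0-9][0-9]*\).*/\1/p' |
        head -n 1
}

# needs_bigger_pool (hypothetical helper): exit 0 when the counter is
# present and greater than zero, non-zero otherwise.
needs_bigger_pool() {
    count=$(rx_pool_errors)
    [ -n "$count" ] && [ "$count" -gt 0 ]
}

# Possible usage, with ent1 as the adapter carrying the TSM traffic:
#   netstat -v ent1 | needs_bigger_pool && echo "consider rxbuf_pool_size=2048"
```

The sketch only reports; the chdev change itself still has to be made by hand with the adapter down, as described above.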
> >
> >
> > >>> mark.ferraretto AT DB DOT COM 10/1/2003 8:21:01 PM >>>
> > All,
> >
> > I am running TSM 5.1.6.1 on AIX 5.1ML3.  In the last few weeks it's
> > started exhibiting a rather strange problem where the scheduler seems
> > not to contact the clients to start the backup schedule.
> >
> > Halting TSM and restarting it seems to fix it for a few days and then
> > it will start again.
> >
> > Checking the actlog, I stop seeing 'Schedule prompter contacting...'
> > messages.  And, of course, the following day I see a whole stack of
> > 'Missed' backups.  Sometimes the prompter may contact one or two (out
> > of about 100) systems before stopping altogether.  We have 100 Nodes -
> > 50 Unix (Solaris/Linux/AIX) and 50 NT/2000 systems.  All are running
> > file-based backups.  We also have a single Notes server (NT) that's
> > backed up using TDP.  All are in prompted mode.
> >
> > I haven't been able to find any error messages in the TSM log.  Just
> > an absence of prompter messages like I said before.
> >
> > Can anyone help?
> >
> > Mark
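
One way to spot the symptom described above is to count the prompter messages in a saved copy of the activity log. This is a sketch under assumptions: the function name count_prompter_contacts is made up, and how the log text is obtained (for example a "query actlog" run through the administrative client and redirected to a file) is left to the reader.

```sh
#!/bin/sh
# count_prompter_contacts (hypothetical helper): read activity-log text
# on stdin and print how many lines contain the
# 'Schedule prompter contacting' message quoted in the report above.
count_prompter_contacts() {
    grep -c 'Schedule prompter contacting'
}

# Possible usage, assuming actlog.txt holds one day of activity log:
#   count_prompter_contacts < actlog.txt
# A count far below the number of prompted nodes (about 100 here)
# would match the behaviour reported above.
```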
> >
> >
> > --
> > Mark Ferraretto
> > Unix Systems Administrator
> > Deutsche Bank Hong Kong
> > w: +852 2203 6362        m: +852 9558 8032        f: +852 2203 6971
> > mark.ferraretto AT db DOT com
> >
> >
> > --
> >
> > This e-mail may contain confidential and/or privileged information.
> > If you are not the intended recipient (or have received this e-mail
> > in error) please notify the sender immediately and destroy this
> > e-mail. Any unauthorized copying, disclosure or distribution of the
> > material in this e-mail is strictly forbidden.
>
>
>--
>Met vriendelijke groeten,
>
>Remco Post
>
>SARA - Reken- en Netwerkdiensten
>http://www.sara.nl
>High Performance Computing  Tel. +31 20 592 8008    Fax. +31 20 668 3167
>
>"I really didn't foresee the Internet. But then, neither did the computer
>industry. Not that that tells us very much of course - the computer
>industry didn't even foresee that the century was going to end."
>-- Douglas Adams

Dave Canan
TSM Performance
IBM Advanced Technical Support
ddcanan AT us.ibm DOT com




