ADSM-L

Peculiar issue with ADSM Client -> TSM Client upgrade on WinNT 4.0

2001-09-26 19:35:44
Subject: Peculiar issue with ADSM Client -> TSM Client upgrade on WinNT 4.0
From: "Kent J. Monthei" <Kent_J_Monthei AT SBPHRD DOT COM>
Date: Wed, 26 Sep 2001 19:31:56 -0400
We use the B/A Client to back up a list of ~300 files.  The list of files is
maintained in a separate text file named 'textfile.in'.  Each line of the
text file is of the form 'sel <filename> -optfile=<optfilename>', i.e. one
'select' subcommand per line.  The schedule prompter starts a script that
simply does a 'dsmc < textfile.in'.  This is the first of several scheduled
nightly backups of that node.  The next schedule starts 30 minutes later.
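
A minimal sketch of how an input file in the form described above could be
generated, for anyone who wants to reproduce the setup.  The function names,
file names and option-file name are hypothetical; only the line format
'sel <filename> -optfile=<optfilename>' comes from the description above, and
the sketch does not attempt any quoting of file names that contain spaces.

```python
def build_select_lines(filenames, optfile):
    """Return one 'sel' subcommand line per file, in the textfile.in form."""
    return [f"sel {name} -optfile={optfile}" for name in filenames]


def write_input_file(path, filenames, optfile):
    """Write the subcommand lines to the file that gets redirected into dsmc."""
    with open(path, "w") as handle:
        for line in build_select_lines(filenames, optfile):
            handle.write(line + "\n")


if __name__ == "__main__":
    # Hypothetical file names and option file, for illustration only.
    example_files = ["D:\\data\\a.dat", "D:\\data\\b.dat"]
    for line in build_select_lines(example_files, "dsm.opt"):
        print(line)
```

The resulting file would then be fed to the client exactly as before, with
'dsmc < textfile.in'.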

We just upgraded to the TSM 4.1.2.12 Client (WinNT 4.0).  Our servers are
Sun Solaris 2.6, running TSM 4.1.2 Server.

With the ADSM v3.1.0.7 B/A Client, this worked fine, starting a single TSM
Server session, processing all the subcommands in 'textfile.in' and
reporting all the results to the *SM Server - 1 session, 300 files,
consistently averaging less than 20 minutes start to finish.  This was very
reliable & rarely failed.

With the new TSM v4.1.2.12 B/A Client, we're observing the following
problematic behavior:
     - the schedule prompter immediately starts 3 TSM Server sessions
     - the 2nd session starts processing subcommands from 'textfile.in'
     - the 3rd session immediately ends
     - after completing each subcommand, the 2nd session spawns a short,
       separate TSM Server session that immediately ends
       (300 files = 300 subcommands = 300 additional Server sessions,
       each about 30 seconds)
     - after 30 minutes (our idle timeout), the 1st session terminates with
       ANR0482W
     - the 2nd session continues processing subcommands
     - when the 2nd session finishes processing all subcommands from
       'textfile.in', it terminates normally
     - if and only if the 1st session terminated with ANR0482W, the server
       issues error message ANR2576W and 'q event' reports 'failed';
       otherwise the 1st session ends normally when the 2nd session does
       and 'q event' reports 'completed'
     - if and only if the server issues error message ANR2576W, all
       subsequent TSM Server schedules for this node report 'missed'
       and the TSM Server Activity Log contains an ANR2716E
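
The arithmetic behind the timeout can be sketched as follows.  This is a
rough estimate only: it assumes the 300 extra sessions run one after another
at about 30 seconds each, and it ignores the time spent on the backups
themselves.  The function names are hypothetical.

```python
def extra_session_seconds(subcommands, seconds_per_session):
    """Total time added by one short extra session per subcommand."""
    return subcommands * seconds_per_session


def subcommands_before_timeout(idle_timeout_seconds, seconds_per_session):
    """How many extra sessions fit inside the idle timeout window."""
    return idle_timeout_seconds // seconds_per_session


if __name__ == "__main__":
    total = extra_session_seconds(300, 30)
    print(total / 60)                               # 150.0 minutes of overhead
    print(subcommands_before_timeout(30 * 60, 30))  # 60 subcommands
```

Under those assumptions the extra sessions alone add roughly 150 minutes,
and the 30-minute idle timeout is reached after about 60 of the 300
subcommands, which would be consistent with the 1st session ending in
ANR0482W well before the 2nd session finishes.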

The same scenario is occurring randomly/intermittently on multiple clients
in different subnets on each of 4 different TSM servers, in 2 US and 2 UK
locations.  All were recently upgraded to the TSM B/A Client and also received
a WinNT TCP/IP hotfix.  We have ruled out network- and locality-related
issues because of the mix of clients, servers, schedules/times and
locations.

The key questions are:
    - why are 300 TSM Server sessions spawned, 1 per subcommand/file?
      (this is the primary reason the timeout now gets triggered)
    - can that be avoided by, for example, setting 'resourceutilization' to
      1?  (I dug far enough to confirm it defaults to 2)
    - when the ANR0482W occurs, why does the ANR2576W occur, and why do all
      other schedules fail with ANR2716E?
      (this hints at a deeper problem, either in the TSM B/A client or in
      WinNT TCP/IP)
    - the ANR2716E consistently reports 1 of 4 specific TCP port numbers
      (1035, 1501, 1757 or 2013) - are these port numbers significant?


Please reply with any thoughts or recommendations for resolving this.  Thanks!
Kent Monthei
GlaxoSmithKline R&D