ADSM-L

Re: [ADSM-L] Extra client sessions

2016-09-01 07:19:31
From: "Rhodes, Richard L." <rrhodes AT FIRSTENERGYCORP DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Thu, 1 Sep 2016 11:18:48 +0000
We see this behavior.  It happens about once a week, usually with Windows 
servers, but not exclusively.  I've seen servers with 20, 30, or even 40 
sessions.  It happens with enough regularity that I put in a script that kills 
all sessions for any server that has more than 10 sessions.  The only cause 
we've ever identified for some of these situations (including the worst 
occurrences) is firewall setup problems.  
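
For illustration only, the selection logic of such a cleanup script could be 
sketched roughly as below. This is not the script mentioned above. It assumes 
the session list has already been obtained from the TSM server as 
(session number, node name) pairs; how that list is fetched is not shown, and 
the function names and the default limit of 10 are hypothetical.

```python
from collections import defaultdict


def sessions_to_cancel(rows, limit=10):
    """Return session numbers for every node holding more than `limit` sessions.

    `rows` is an iterable of (session_number, node_name) pairs. Fetching
    these pairs from the server is outside this sketch (assumption).
    """
    by_node = defaultdict(list)
    for session_number, node_name in rows:
        by_node[node_name].append(session_number)

    doomed = []
    for node_name in sorted(by_node):
        numbers = by_node[node_name]
        # A node with exactly `limit` sessions is left alone.
        if len(numbers) > limit:
            doomed.extend(sorted(numbers))
    return doomed


def cancel_commands(rows, limit=10):
    """Build one CANCEL SESSION administrative command per selected session."""
    return [f"cancel session {n}" for n in sessions_to_cancel(rows, limit)]
```

The generated commands would still have to be submitted through an 
administrative client. Cancelling every session of a node, rather than only 
the oldest ones, mirrors the "kill all sessions" approach described above.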



-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
Thomas Denier
Sent: Wednesday, August 31, 2016 3:41 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: *EXTERNAL* Extra client sessions

We are occasionally seeing some odd behavior in our TSM environment.

We write incoming client files to sequential disk storage pools. Almost all of 
our client nodes use the default maxnummp value of 1.

When the odd behavior occurs, a number of clients will go through the following 
sequence of events:
1. The TSM server will send a request to start a backup.
2. The client will almost immediately open a TCP connection to be used as a 
   producer session (a session used to obtain information from the TSM 
   database).
3. Somewhere between tens of seconds and a few minutes later the client will 
   open a TCP connection to be used as a consumer session (a session used to 
   send copies of new and changed files).
4. Sometime later the client will open a third TCP connection and start using 
   it as a consumer session.
5. The TSM server will report large numbers of transaction failures because it 
   considers the original consumer session to be tying up the one mount point 
   allowed for the node and hence has no way of storing files arriving on the 
   new consumer session.

In most cases, all of the affected clients will hit step four within an 
interval of a couple of minutes.
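
One way to check that clustering across many occurrences would be to group the 
times at which each client opened its replacement session. The following is a 
minimal sketch, assuming the (node name, timestamp) pairs for step four have 
already been extracted from the server's logs by some other means; the 
function name and the two-minute default window are hypothetical.

```python
from datetime import datetime, timedelta


def cluster_events(events, window=timedelta(minutes=2)):
    """Group (node, timestamp) events into bursts.

    An event joins the current burst if it occurs no more than `window`
    after the first event of that burst; otherwise it starts a new burst.
    """
    clusters = []
    for node, ts in sorted(events, key=lambda e: e[1]):
        if clusters and ts - clusters[-1][0][1] <= window:
            clusters[-1].append((node, ts))
        else:
            clusters.append([(node, ts)])
    return clusters
```

Bursts containing many nodes point at a shared event (for example a network 
or firewall change) rather than at independent client-side failures, and the 
burst start times can be handed to other groups for comparison with their own 
logs.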

My current theory is that step four occurs when the client system detects a 
condition that is viewed as a fatal error in the original consumer session, 
triggering the opening of a replacement consumer session. In most cases the TSM 
server never detects a problem with the original consumer session, and 
eventually terminates the session after five hours of inactivity (we have 
database backups that can legitimately go through long periods with no data 
transfer). More rarely the TSM server eventually reports that the original 
consumer session was severed.

We occasionally see cases where the replacement consumer session is in turn 
replaced by another new session, and even cases where the latter session is 
replaced by yet another session.

Our client population is a bit over half Windows, but almost all instances of 
the odd behavior involve only Windows client systems.

The affected systems are frequently split between two data centers, each with 
its own TSM server.

We have usually not found any correlation between the odd TSM behavior and 
issues with other applications. The most recent case was an exception. There 
were some e-mail delivery failures at about the same time as step four of the 
odd TSM behavior. The failures occurred when e-mail servers were unable to 
perform LDAP queries.

When we have asked our Network Operations group to check on previous 
occurrences of the odd behavior they have consistently reported that they found 
no evidence of a network problem.

Each of our TSM servers runs under zSeries Linux on a z10 BC. Each server has a 
VIPA address with two associated network interfaces on different subnets.

I would welcome any suggestions for finding the underlying cause of the odd 
behavior.

Thomas Denier,
Thomas Jefferson University
