TDP for SQL server backups hanging

Spree

PREDATAR Control23

MS SQL Server: 2016
Windows: 2012 R2
TDP for SQL / FCM: 7.1.6.5

I have an environment with 2 x servers on SQL Server 2014 and another 2 x servers on SQL Server 2016.
All are on Windows 2012 R2 + TDP for SQL 7.1.6.5.

The problem is that one of the clients (running SQL 2016) often hangs during backup. Backups run daily, and it hangs 1-2 times every week. When it hangs, the only way to recover is to stop the TSM server so that the job dies on its own. The session cannot be cancelled at the TSM server level, and even after killing the job on the client, the session remains on the TSM server until I shut down the TSM server service.

If I don't kill the job, it can hang for days without dying. At first I wondered whether it could be a tape problem, but that doesn't seem to be the case: it happens on different tapes, and no hardware problem is reported. Has anybody seen something similar, or does anyone have a suggestion? I tried googling a bit but had no luck.
 

Hi,

Do you get any clue from the error logs and/or the activity log? If you run netstat -an on the server and the client, do you see a TCP/IP queue building up? Can you analyze the TCP session while it hangs? Just look for possible error messages there.
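
To make the netstat check easier to repeat, here is a minimal sketch that filters saved `netstat -an` output for connections on the TSM port. It assumes the default port 1500 (the thread does not say which port is actually in use), and the function name `tsm_connections` and the sample addresses are made up for illustration. Windows output has no queue columns, so the queue sizes are only filled in for Linux-style rows.

```python
def tsm_connections(netstat_output, port=1500):
    """Return TCP rows from `netstat -an` text that involve `port`.

    port=1500 is an assumption (the usual TSM default), not taken from the thread.
    """
    suffix = ":" + str(port)
    matches = []
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) < 4 or not fields[0].lower().startswith("tcp"):
            continue
        # Linux-style rows carry Recv-Q and Send-Q before the addresses;
        # Windows-style rows go straight to the addresses.
        if len(fields) >= 6 and fields[1].isdigit() and fields[2].isdigit():
            recv_q, send_q = int(fields[1]), int(fields[2])
            local, remote, state = fields[3], fields[4], fields[5]
        else:
            recv_q = send_q = None
            local, remote, state = fields[1], fields[2], fields[3]
        if local.endswith(suffix) or remote.endswith(suffix):
            matches.append({
                "local": local,
                "remote": remote,
                "state": state,
                "recv_q": recv_q,
                "send_q": send_q,
            })
    return matches


# Hypothetical sample: one Windows-style row, one unrelated row, one Linux-style row.
SAMPLE = """\
  TCP    10.0.0.5:52344    10.0.0.9:1500    ESTABLISHED
  TCP    10.0.0.5:445      10.0.0.7:50000   ESTABLISHED
tcp        0  87040 10.0.0.9:1500   10.0.0.5:52344   ESTABLISHED
"""

if __name__ == "__main__":
    for row in tsm_connections(SAMPLE):
        print(row)
```

A Send-Q or Recv-Q that stays non-zero across several runs while the backup is hung would be worth mentioning when you report back.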
 

It just hung there without returning anything.
 

Hi,

Not a lot of info for us to work on.

Server version? The latest patch may help if you are far behind. It is currently at 8.1.9.200.

I have seen issues where TcpZeroWindow was the symptom, but the error was deep inside the SP server. Another symptom was tdpsqlc printing out 'Waiting for server ...........'
 

The TSM server is 7.1.7.400, but I think the problem is less likely to be related to the TSM server?

I have 4 nodes in total with TDP for SQL clients (TDP for SQL and TSM client version 7.1.6.5; SQL Server is 2 x 2014 and 2 x 2016), and another 5 nodes with TSM for VE (Hyper-V) 8.1.8.0. Only one of the SQL 2016 clients is having problems.

The problematic one is on MS SQL Server 13.0.5292.0.

I have been googling a bit but can't seem to find any clue. I could probably try upgrading the problematic client to the latest 8.1 version (8.1.9 AFAIK), but I would want a reason first if possible, because if it still has problems after the upgrade, I would probably be in trouble.
 

Just one piece of info: when you cancel a session on the server, it's a graceful process. It will usually let the client finish the current transaction. That would further indicate that the issue is likely on the client side of things.

A hang like this normally requires tracing to see what the last few calls were before the hang. A bad performance problem or waiting for a resource can also give the appearance of a hang. If you only waited a few minutes, then it could be any of those things. If you waited hours, it's more than likely a hang.
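
As a rough way to apply that rule of thumb, here is a minimal sketch that takes byte counts noted down over time for the session (for example, read off the server's `query session` output by hand) and says how long the count has been flat. The function name `classify_session` and the one-hour threshold are assumptions chosen for illustration, not anything defined by TSM.

```python
def classify_session(samples, stall_after_seconds=3600):
    """Classify a session from (unix_time, bytes_transferred) samples in time order.

    stall_after_seconds is an assumed threshold: a byte count that has not
    moved for at least this long is treated as a likely hang.
    """
    if len(samples) < 2:
        return "unknown"
    last_time, last_bytes = samples[-1]
    # Walk backwards to find how far back the byte count has been unchanged.
    stalled_since = last_time
    for sample_time, sample_bytes in reversed(samples):
        if sample_bytes != last_bytes:
            break
        stalled_since = sample_time
    idle = last_time - stalled_since
    if idle >= stall_after_seconds:
        return "likely hang"
    if idle > 0:
        return "possibly slow or waiting"
    return "progressing"


if __name__ == "__main__":
    # Hypothetical readings taken ten minutes apart with no change in bytes.
    print(classify_session([(0, 5_000_000), (600, 5_000_000)]))
```

This only tells you whether data is moving. It says nothing about why it stopped, which is where the tracing comes in.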
 

There was one occasion when I was on leave and the session hung for more than 3 days. The tape was still "in use" and the session was still there, just without any data transfer at all.
 

Hi,

Open a PMR with IBM. They can probably find the root cause more quickly. It would be nice if you could post the outcome.
 