ADSM-L

Re: [ADSM-L] MSSQL Server Backup

2007-12-17 11:15:44
From: William Boyer <bjdboyer AT COMCAST DOT NET>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Mon, 17 Dec 2007 11:00:43 -0500
A client of mine had some backup "hangs" when he started loading servers
on a new BladeCenter. He opened a PMR, and this is the
response from Level 2. I don't know if this is your problem.


I've reviewed it and the files that you sent in, and this issue appears to be
related to known issues with the Microsoft Windows
Server 2003 Scalable Networking Pack (SNP), which is installed and enabled by
default by Windows Server 2003 Service Pack 2.
    For more information please refer to the following webpages:
    http://support.microsoft.com/kb/912222
    http://support.microsoft.com/kb/936594/en-us

The issue also appears to be tied to gigabit Ethernet NICs with Broadcom
*CHIPSETS*. Note that several different manufacturers are
using Broadcom Gig-E chipsets in their NICs. I've confirmed that you *are* 
using Broadcom BCM5708S NetXtreme II GigE NICs.

The approach that has addressed this issue for most of our customers has been 
to disable the SNP functionality in Windows and on
their NIC's driver settings dialog, after ensuring that the NIC's firmware and 
device drivers are the most current version available
from their vendor.

NOTE: For the record, I am *not* a Broadcom/HP/Microsoft support engineer and 
am passing you general information that has been
reported back by other customers seeing extremely similar symptoms/issues in 
their environment. The OEM or vendor for your
customer's environment should be contacted to confirm any changes external to 
TSM should you have any questions regarding them or
how to make them.

The following steps have worked for several of our customers to address this
issue:
1. Confirm that the affected nodes all have Gig-E ethernet cards. (Since 
they're all the same type of Blade, they should have the
same BCM5708S NetXtreme II GigE NICs.)
2. Determine if the Gig-E NICs have Broadcom chipsets. You might need to
contact the vendor to confirm this. (I've already
confirmed this from your doc.)
3. Confirm that the Gig-E NICs are at the most current firmware & driver
levels.
4. Disable all the "advanced" features on the NIC that might be related to SNP. 
Note: Different models of NIC adapter will describe
the features differently. Any feature that contains the word "offload" should 
be disabled, as well as "Receive Side Scaling" or RSS.
Your settings might look like the attached "NIC_Settings.jpg" file.
5. Disable SNP in the OS via the Registry. NOTE: This process will require a 
REBOOT. Please see the attached "SNP_Registry.jpg"
    A. Modify the registry to disable Receive Side Scaling (RSS)
        1. Click Start, click Run, type regedit, and then click OK.
        2. Locate and then click the following registry subkey:
            HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
        3. If the EnableRSS registry entry does not exist, create it.
            To do this, follow these steps
            a.  On the Edit menu, point to New, and then click DWORD Value.
            b.  In the New Value #1 box, type EnableRSS, and then press ENTER.
        4. In the details pane, right-click EnableRSS, and then click Modify.
        5. In the Value data box, type 0 (zero), and then click OK.
    B. Modify the registry to disable TCPA support
        1. Locate and then click the following registry subkey:
            HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
        2. If the EnableTCPA registry entry does not exist, create it.
            To do this, follow these steps
            a.  On the Edit menu, point to New, and then click DWORD Value.
            b.  In the New Value #1 box, type EnableTCPA, and then press ENTER.
        3. In the details pane, right-click EnableTCPA, and then click Modify.
        4. In the Value data box, type 0 (zero), and then click OK.
    C. Modify the registry to disable TCP Chimney support  (TCP Offload)
        1. Locate and then click the following registry subkey:
            HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters
        2. If the EnableTCPChimney registry entry does not exist, create it.
            To do this, follow these steps
            a.  On the Edit menu, point to New, and then click DWORD Value.
            b.  In the New Value #1 box, type EnableTCPChimney, and then press ENTER.
        3. In the details pane, right-click EnableTCPChimney, and then click Modify.
        4. In the Value data box, type 0 (zero), and then click OK.
        5. Exit Registry Editor.
6. Reboot.
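
For anyone who has to repeat step 5 on many blades, the three registry edits can be sketched as a small Python script that writes a .reg file to import with regedit. This is only an illustration, not something from the Level 2 response: the function name and the output file name disable_snp.reg are made up, and the value names and subkey are the ones listed in step 5. Review the generated file before importing it, and a reboot is still required afterwards.

```python
# Sketch: build a .reg file that sets the three SNP-related values from
# step 5 to 0. The helper name and output file name are hypothetical.
TCPIP_PARAMS_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"
)
SNP_VALUES = ("EnableRSS", "EnableTCPA", "EnableTCPChimney")


def build_snp_disable_reg(values=SNP_VALUES, key=TCPIP_PARAMS_KEY):
    """Return .reg file text that sets each named DWORD value to 0."""
    lines = ["Windows Registry Editor Version 5.00", "", f"[{key}]"]
    for name in values:
        # dword:00000000 is the .reg syntax for a DWORD value of 0
        lines.append(f'"{name}"=dword:00000000')
    # Use CRLF line endings, as is conventional for Windows text files
    return "\r\n".join(lines) + "\r\n"


if __name__ == "__main__":
    # Assumption: UTF-16 is used here because that is how regedit itself
    # exports version 5.00 files; newline="" keeps the CRLFs unchanged.
    with open("disable_snp.reg", "w", encoding="utf-16", newline="") as fh:
        fh.write(build_snp_disable_reg())
```

Generating a file instead of writing to the registry directly keeps the change reviewable and lets the same file be imported on each affected node.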


-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
Fred Johanson
Sent: Monday, December 17, 2007 10:46 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: MSSQL Server Backup

Mahesh,

This came up, in a slightly different fashion, last week.  As I noted then, the 
last thing in a trace on the TDP is a TCPFLUSH.  We
haven't tried a trace with the .BAK, but I wouldn't be surprised if it was 
similar.  A new NIC gave some relief, but Support has
been scratching their collective heads for months.

________________________________

From: ADSM: Dist Stor Manager on behalf of Mahesh Tailor
Sent: Mon 12/17/2007 9:33 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: [ADSM-L] MSSQL Server Backup



Hello.

TSM: 5.3.4.1 on AIX 5.3 TL5
TSM BA Client: 5.3.4.0 (also tried a new patch level)
TDP for MSSQL: 5.2.1.6

Having a very odd problem.

I have several TDP for MSSQL clients having a similar issue.  A backup starts
at the scheduled hour.  All files that are <MAXSIZE
back up fine.  If a file that is >MAXSIZE is found, on the client we see a
"Waiting for Tape Mount" message in the log file.  On the
server we see a tape mounted; however, the client never proceeds and the backup
hangs.  We've let a backup sit until the session
times out and beyond, to no avail.  To complicate this further, we took TDP out
of the equation and created a MSSQL dump backup.
When we try to back up the dump using the regular TSM BA client, we get the same
result.

We suspected the network; however, while the client is hung, if we open an ftp
session to the TSM server and do a dd transfer of ~3GB, there are no
problems.

This is happening only on some nodes.  Others are fine.

Has anyone else seen this?  And, if so what was the resolution?

TIA

Mahesh
