ADSM-L

[ADSM-L] Exchange 2010 weird backup behavior

2014-05-05 14:00:32
From: "Prather, Wanda" <Wanda.Prather AT ICFI DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Mon, 5 May 2014 17:58:13 +0000
I suspect this is more an Exchange 2010 issue than a TSM issue, but I'm hoping 
somebody has seen something similar before and can tell me where to look.

About 15% of the time, one of our Exchange servers gears down and the weekend 
full backup runs waaaaayyyy slower than it should, and I can't figure out why.  
It just happens.  No evidence.

TSM server 6.3.4.300
TSM for Exchange 6.4.
Exchange 2010 in a DAG configuration
10G Ethernet for TSM and Exchange.

2 local Exchange servers on same network segment as TSM server, those get 
backed up.
3rd Exchange server at DR site (across WAN), no backups.

About 24 mail DB's + PF's, approx. 8 TB total, evenly divided between the 2 
local Exch servers.
Each of the local Exch servers has half active, half passive DB's.
DR Exch server has a third copy of the data, all those are passive and never 
backed up, no client installed.
Fulls run on Saturday, direct to LTO5 tape.
Both servers start backups around 10am, back up only their ACTIVE DB's.
On a "normal" Saturday, they run tickety boo and are finished in about 9 hours.

However, about twice a month, one (either one) of the local Exchange servers 
runs slow as dirt and can take 2-3-4 times as long as usual, up to 36 hours.
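
To put rough numbers on "normal" versus "slow", here is a back-of-envelope sketch. It assumes each local server sends about 4 TB (half of the approx. 8 TB total) in a full; that split is a guess from the figures above, not a measured value, and the function name is made up for illustration.

```python
def throughput_mb_per_s(terabytes, hours):
    """Average rate in MB/s (decimal units) for moving `terabytes` in `hours`."""
    return terabytes * 1e12 / (hours * 3600) / 1e6

# ASSUMPTION: about 4 TB per server per full backup (half of ~8 TB total).
ASSUMED_TB_PER_SERVER = 4.0

if __name__ == "__main__":
    # 9 hours is the normal run; 18, 27 and 36 are the 2x, 3x and 4x cases.
    for hours in (9, 18, 27, 36):
        rate = throughput_mb_per_s(ASSUMED_TB_PER_SERVER, hours)
        print(f"{hours:>2} h -> {rate:6.1f} MB/s average")
```

Under that assumption a normal 9-hour run averages a bit over 120 MB/s, and a 36-hour run averages about a quarter of that.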

-They are on the same network segment as the TSM server
-The TSM server NIC is underutilized during this time
-Nothing else is happening on the server on Sat mornings except BACKUP STGPOOL.
-They run at the same time of day, so anything affecting one (regarding the TSM 
server or network) would have the same effect on the other, anyway
-The problem is as likely to occur on one as the other.
-Only once has it happened to both on the same day.
-The next weekend, everything may be fine, it may occur again on the same 
server, or the problem may switch to the other server.
-The server isn't getting disconnected from the network, or doing a lot of send 
retries during this time.
-Nothing in the Exch server event logs
-The only oddity on the client end is that the log containing these messages 
will be missing entries for some databases, even though they did get backed 
up.  I don't know if this is related to the problem.
"04/21/2014 21:16:51 The following database is being backed up: 
'MYDOGHASFLEAS'. The data is being transferred to the Tivoli Storage Manager 
server."
-If I look at the NIC on the slowed-down server, it still says it is running 
10G.
-If I kill the backup and restart it, it's just as slow.
-Can't reproduce on demand.
-This is the command that is running, very vanilla:

tdpexcc backup * full /tsmoptfile=dsm.opt /logfile=excsch.log /BACKUPMETHOD=VSS /EXCLUDEDB=TestDB /BACKUPDESTINATION=TSM /SKIPINTEGRITYCHECK /EXCLUDEDAGPASsive /MINimumbackupinterval=60 >> excfull.log

-Incrementals run during the week, but they are so short anyway that we've not 
investigated whether those run slow some days.
-The backups do eventually finish correctly, but having that tape drive tied up 
for 36 hours is a problem, and it prevents us from getting the Exchange backups 
copied and offsite when we should.
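
One way to check the missing-entries oddity and see where the time goes would be to pull the "being backed up" messages out of the log. This is only a sketch: it assumes each message sits on one line in the file in the same shape as the sample quoted above, and the list of expected database names is something you would supply yourself. The function names are made up for illustration.

```python
import re
from datetime import datetime

# ASSUMPTION: line shape taken from the single sample message quoted above.
START_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2}) "
    r"The following database is being backed up: '([^']+)'"
)

def backup_starts(lines):
    """Return [(timestamp, database_name)] for each 'being backed up' line."""
    starts = []
    for line in lines:
        m = START_RE.match(line)
        if m:
            ts = datetime.strptime(m.group(1), "%m/%d/%Y %H:%M:%S")
            starts.append((ts, m.group(2)))
    return starts

def missing_databases(starts, expected):
    """Names in `expected` that never got a 'being backed up' line."""
    seen = {name for _, name in starts}
    return sorted(set(expected) - seen)

def gaps_between_starts(starts):
    """Minutes from each logged start to the next logged start.

    Only a rough proxy for per-database duration: if a database's entry is
    missing from the log, its time is folded into the previous database's gap.
    """
    ordered = sorted(starts)
    return [
        (a[1], (b[0] - a[0]).total_seconds() / 60)
        for a, b in zip(ordered, ordered[1:])
    ]
```

Running backup_starts over the lines of a slow Saturday's log and a normal Saturday's log, then comparing the gaps, would at least show whether the slowdown hits every database or only some of them.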

I have no idea where else to look at this point.

I'm ready to resort to goat entrails and that's gross, so if anybody can 
suggest where else to look I would appreciate it!

Wanda

**Please note new office phone:
Wanda Prather  |  Senior Technical Specialist  |  Wanda.Prather AT icfi DOT com
www.icfi.com  |  410-868-4872 (m)
ICF International  |  7125 Thomas Edison Dr., Suite 100, Columbia, Md  |  443-718-4900 (o)
