I've got a client who's failing to restore an Exchange store, and he's
not getting any suggestive errors. This kind of surprises me; I'm used
to the problem being pretty nicely pointed to, one way or another.
Windows Server 2003 R2 SP1
Exchange agent Version 5, Release 4, Level 0.2
My TSM server is Version 5, Release 3, Level 4.0
The full restore transfers neatly, but when he starts applying the
incrementals, the following happens:
He sees (e.g.)
06/14/2007 16:01:22 ANS1017E Session rejected: TCP/IP connection failure
while I see (these are the enclosing lines which mention the target
server)
06/14/07 15:47:22 ANE4991I (Session: 28805, Node:
EXCHANGE-IS.HOUSING.UFL.EDU) TDP MSExchg ACN3506 Data Protection for Exchange:
Starting full restore of storage group Housing Storage Group 1 to server
EXCHANGETEST.(SESSION: 28805)
[...]
06/14/07 15:59:42 ANR0481W Session 28805 for node
EXCHANGE-IS.HOUSING.UFL.EDU (TDP MSExchg) terminated - client did not respond
within 60 seconds. (SESSION: 28805)
[...]
06/14/07 16:01:23 ANR0406I Session 28839 started for node
EXCHANGE-IS.HOUSING.UFL.EDU (TDP MSExchg) (Tcp/Ip
x47-215.housing.ufl.edu(8801)). (SESSION: 28839)
06/14/07 16:01:23 ANE4991I (Session: 28839, Node:
EXCHANGE-IS.HOUSING.UFL.EDU) TDP MSExchg ACN3506 Data Protection for Exchange:
Starting incr restore of storage group Housing Storage Group 1 to server
EXCHANGETEST.(SESSION: 28839)
06/14/07 16:01:23 ANE4993E (Session: 28839, Node:
EXCHANGE-IS.HOUSING.UFL.EDU) TDP MSExchg ACN3508 Data Protection for Exchange:
incr restore of storage group Housing Storage Group 1 to server EXCHANGETEST
failed, rc = 425.(SESSION: 28839)
And then there are another 50-100 iterations of the "Starting incr
restore / failed, rc = 425" pair.
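Tallying those repeats by hand gets old quickly. Here is a minimal
sketch in Python, assuming the activity log has been saved as plain
text with the same line wrapping as the excerpts above. The names
SAMPLE, split_records and summarize are made up for illustration, and
the patterns are keyed only to the ACN3506 / ACN3508 wording quoted in
this post, so treat it as a starting point rather than a general parser.

```python
import re
from collections import Counter

# Two records copied from the activity log excerpt above.
SAMPLE = """\
06/14/07 16:01:23 ANE4991I (Session: 28839, Node:
EXCHANGE-IS.HOUSING.UFL.EDU) TDP MSExchg ACN3506 Data Protection for Exchange:
Starting incr restore of storage group Housing Storage Group 1 to server
EXCHANGETEST.(SESSION: 28839)
06/14/07 16:01:23 ANE4993E (Session: 28839, Node:
EXCHANGE-IS.HOUSING.UFL.EDU) TDP MSExchg ACN3508 Data Protection for Exchange:
incr restore of storage group Housing Storage Group 1 to server EXCHANGETEST
failed, rc = 425.(SESSION: 28839)
"""

# A record begins on any line that opens with an MM/DD/YY HH:MM:SS stamp.
RECORD_START = re.compile(r"^\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2} ", re.MULTILINE)
STARTED = re.compile(r"ACN3506 Data Protection for Exchange: Starting (\w+) restore")
FAILED = re.compile(
    r"ACN3508 Data Protection for Exchange: (\w+) restore .* failed, rc = (\d+)"
)


def split_records(text):
    """Rejoin wrapped lines so each log record becomes one string."""
    starts = [m.start() for m in RECORD_START.finditer(text)]
    bounds = starts + [len(text)]
    return [" ".join(text[a:b].split()) for a, b in zip(bounds, bounds[1:])]


def summarize(text):
    """Count restore starts by type, and failures by (type, rc)."""
    started, failed = Counter(), Counter()
    for record in split_records(text):
        match = STARTED.search(record)
        if match:
            started[match.group(1)] += 1
            continue
        match = FAILED.search(record)
        if match:
            failed[(match.group(1), match.group(2))] += 1
    return started, failed


if __name__ == "__main__":
    started, failed = summarize(SAMPLE)
    print("started:", dict(started))
    print("failed: ", dict(failed))
```

Run against the whole log, this would show whether every incremental
attempt fails with the same rc or whether a different code is hiding
somewhere among the repeats.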
The rc=425 doesn't appear to correlate with either an API return code
or an agent code. The ACN3508 message doesn't enlighten much; it sounds
like the rc may come from the Exchange server itself?
In any case, I don't show any rejected connections for the box, so it
looks to him like I rejected the session, and it looks to me like he
went away.
I'm thinking that the connect failure is in fact a delayed message:
the Exchange client got busy at some point (say at 15:58:42, 60
seconds before the server's termination message), and a few minutes
later the agent gets around to sending packets down that pipe and
finds a failure.
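To make that timeline concrete, here is a small Python sketch of the
arithmetic, using only the timestamps quoted above. It assumes the
client and server clocks agree, which I have not verified, and the
variable names are just for illustration.

```python
from datetime import datetime, timedelta

SERVER_FMT = "%m/%d/%y %H:%M:%S"   # activity log style, two-digit year
CLIENT_FMT = "%m/%d/%Y %H:%M:%S"   # client log style, four-digit year

restore_started = datetime.strptime("06/14/07 15:47:22", SERVER_FMT)  # ACN3506
terminated = datetime.strptime("06/14/07 15:59:42", SERVER_FMT)       # ANR0481W
client_error = datetime.strptime("06/14/2007 16:01:22", CLIENT_FMT)   # ANS1017E

# The ANR0481W text says the client did not respond within 60 seconds.
comm_timeout = timedelta(seconds=60)

# Latest moment the server can have heard from the client.
last_heard = terminated - comm_timeout

# How long after the server hung up the client noticed (assumes synced clocks).
notice_lag = client_error - terminated

# How far into the full restore the session had been running.
elapsed = terminated - restore_started

print("server last heard from client:", last_heard.strftime("%H:%M:%S"))
print("client noticed after:", int(notice_lag.total_seconds()), "seconds")
print("session had run for:", int(elapsed.total_seconds()), "seconds")
```

If the clocks really are in step, the client only reported the failure
well after the server had already dropped the session, which fits the
"delayed message" reading.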
I understand the agents to be notoriously bad at picking up the
connections, and googling around found similar errors to be well
addressed by increasing COMMTIMEOUT on the server. I increased my
COMMTIMEOUT to a few hours, to be sure I wasn't blowing him off
unnecessarily, and he started failing with a
06/15/07 11:51:45 ANR0480W Session 31503 for node
EXCHANGE-IS.HOUSING.UFL.EDU (TDP MSExchg) terminated - connection with client
severed. (SESSION: 31503)
at "about" the same point in the process. He reconnects, and then
06/15/07 11:51:45 ANR0406I Session 31845 started for node
EXCHANGE-IS.HOUSING.UFL.EDU (TDP MSExchg) (Tcp/Ip
x47-215.housing.ufl.edu(2432)). (SESSION: 31845)
06/15/07 11:51:45 ANE4991I (Session: 31845, Node:
EXCHANGE-IS.HOUSING.UFL.EDU) TDP MSExchg ACN3506 Data Protection for Exchange:
Starting incr restore of storage group Housing Storage Group 1 to server
EXCHANGETEST.(SESSION: 31845)
06/15/07 11:51:45 ANE4993E (Session: 31845, Node:
EXCHANGE-IS.HOUSING.UFL.EDU) TDP MSExchg ACN3508 Data Protection for Exchange:
incr restore of storage group Housing Storage Group 1 to server EXCHANGETEST
failed, rc = 425.(SESSION: 31845)
I don't think there's an option analogous to COMMTIMEOUT for the
client side. Am I missing something? I'm gearing up to open a ticket,
but I figured I'd send up a flare Just In Case.
- Allen S. Rout