ADSM-L

Re: [ADSM-L] Exchage Agent Restore failure...

2007-06-15 16:05:37
Subject: Re: [ADSM-L] Exchage Agent Restore failure...
From: Del Hoobler <hoobler AT US.IBM DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Fri, 15 Jun 2007 16:04:35 -0400
Hi Allen,

RC=425 is an Exchange API error. This normally means that
the Exchange Server found something that it did not like.
Are you doing the FULL restore separate from the INCR restores?
If so, are you turning off the "Run Recovery" option?
If you use the GUI and do them all together (FULL + INCR),
then you should turn ON the "Run Recovery" option,
but if you are doing them in separate operations,
then you should turn OFF the "Run Recovery" option
for the FULL and only turn it ON for the last INCR operation.
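
To make that rule concrete, here is a minimal sketch of it. The function
name and the step labels are purely illustrative (they are not part of
DP/Exchange), and it assumes that when a FULL is restored entirely on its
own, recovery should run on that single operation.

```python
def run_recovery_plan(steps, combined):
    """Return (operation, run_recovery) pairs for a restore sequence.

    steps:    ordered backup labels, e.g. ["FULL", "INCR", "INCR"]
    combined: True if everything is restored together in one operation,
              False if each backup is restored as a separate operation.
    """
    if not steps:
        return []
    if combined:
        # One operation covering FULL + INCRs: Run Recovery is ON.
        return [("+".join(steps), True)]
    # Separate operations: Run Recovery is OFF until the last one.
    last = len(steps) - 1
    return [(step, index == last) for index, step in enumerate(steps)]
```

For a FULL followed by two INCRs restored separately, only the final INCR
ends up with Run Recovery turned ON.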

If that is not it...
... sometimes this means that the storage group being
restored to isn't prepared correctly, for example,
the databases don't exist or the database names are incorrect.
Is the restore being done into a Recovery Storage Group?
If so, did the databases being restored get added prior
to initiating the restore?

There is no version "5.4.0.2" of the DP/Exchange client.
Maybe that is the TSM API level?
I am guessing the DP/Exchange is either 5.2.1 or 5.3.3.
If it is 5.3.3... make sure it is version 5.3.3.01.
What messages appear in the TDPEXC.LOG file?
What errors appear on the DP/Exchange GUI at restore time?
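
If the log is long, something like the following rough sketch can pull out
the interesting lines. It assumes the log lines carry TSM-style message IDs
(three letters, four digits, then a severity letter such as I, W, E or S),
as in the messages quoted below; the function name is just for illustration.

```python
import re

# TSM-style message ID: three-letter prefix, four digits, severity letter.
MSG_ID = re.compile(r"\b([A-Z]{3}\d{4})([IWES])\b")

def problem_lines(lines):
    """Yield (message_id, line) for lines whose first message ID has
    severity W (warning), E (error) or S (severe)."""
    for line in lines:
        match = MSG_ID.search(line)
        if match and match.group(2) in "WES":
            yield match.group(1) + match.group(2), line.rstrip("\n")
```

You would feed it the lines of the log file, for example by opening
TDPEXC.LOG in the DP/Exchange install directory and passing the file object.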

Opening a ticket is a good idea... so that IBM Service
can gather a DP/Exchange and TSM API trace to find out
what is going on.

I hope this helps.

Thanks,

Del

----------------------------------------------------

"ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU> wrote on 06/15/2007
03:40:29 PM:

> I've got a client who's failing to restore an Exchange store, and he's
> not really getting any suggestive errors; this kind of surprises me,
> I'm used to the problem being pretty nicely pointed to, one way or
> another.
>
> Windows Server 2003 R2 SP1
> Exchange agent Version 5, Release 4, Level 0.2
>
> My TSM server is Version 5, Release 3, Level 4.0
>
>
> The full transfers neatly, and then when he starts working on applying
> incrementals, the following happens:
>
> He sees (e.g.)
>
> 06/14/2007 16:01:22 ANS1017E Session rejected: TCP/IP connection failure
>
> while I see (these are the enclosing lines which mention the target
> server)
>
> 06/14/07   15:47:22     ANE4991I (Session: 28805, Node: EXCHANGE-IS.
> HOUSING.UFL.EDU)  TDP MSExchg ACN3506 Data Protection for Exchange:
> Starting full restore of storage group Housing Storage Group 1 to serv
> er EXCHANGETEST.(SESSION: 28805)
>
> [...]
>
> 06/14/07   15:59:42     ANR0481W Session 28805 for node EXCHANGE-IS.
> HOUSING.UFL.EDU (TDP MSExchg) terminated - client did not respond
> within 60 seconds. (SESSION: 28805)
>
> [...]
>
>
> 06/14/07   16:01:23     ANR0406I Session 28839 started for node
> EXCHANGE-IS.HOUSING.UFL.EDU (TDP MSExchg) (Tcp/Ip x47-215.housing.
> ufl.edu(8801)). (SESSION: 28839)
> 06/14/07   16:01:23     ANE4991I (Session: 28839, Node: EXCHANGE-IS.
> HOUSING.UFL.EDU)  TDP MSExchg ACN3506 Data Protection for Exchange:
> Starting incr restore of storage group Housing Storage Group 1 to
> server EXCHANGETEST.(SESSION: 28839)
> 06/14/07   16:01:23     ANE4993E (Session: 28839, Node: EXCHANGE-IS.
> HOUSING.UFL.EDU)  TDP MSExchg ACN3508 Data Protection for Exchange:
> incr restore of storage group Housing Storage Group 1 to server
> EXCHANGETEST failed, rc = 425.(SESSION: 28839)
>
> And then there are another 50-100 iterations of the "Starting incr
> restore / failed with rc = 425"
>
>
> The rc=425 doesn't appear to correlate with either an API return code
> or an agent code.  The 3508 doesn't enlighten much, sounds like it may
> be a RC from the exchange server?
>
> In any case, I don't have any rejected connections to the box, so it
> looks to him like I rejected, and looks to me like he went away.
>
> I'm thinking that the connect failure is in fact a delayed message
> from whenever the exchange client got busy (say at 15:58:42, 60
> seconds before the termination server message) and a few minutes later
> the agent gets around to sending packets down that pipe, and finds a
> failure.
>
> I understand the agents to be notoriously bad at picking up the
> connections, and googling around found similar errors to be well
> addressed by increasing COMMTIMEOUT on the server.  I increased my
> COMMTIMEOUT to a few hours, to be sure I wasn't blowing him off
> unnecessarily, and he started failing with a
>
> 06/15/07   11:51:45     ANR0480W Session 31503 for node EXCHANGE-IS.
> HOUSING.UFL.EDU (TDP MSExchg) terminated - connection with client
> severed. (SESSION: 31503)
>
> at "about" the same point in the process. He reconnects, and then
>
> 06/15/07   11:51:45     ANR0406I Session 31845 started for node
> EXCHANGE-IS.HOUSING.UFL.EDU (TDP MSExchg) (Tcp/Ip x47-215.housing.
> ufl.edu(2432)). (SESSION: 31845)
> 06/15/07   11:51:45     ANE4991I (Session: 31845, Node: EXCHANGE-IS.
> HOUSING.UFL.EDU)  TDP MSExchg ACN3506 Data Protection for Exchange:
> Starting incr restore of storage group Housing Storage Group 1 to
> server EXCHANGETEST.(SESSION: 31845)
> 06/15/07   11:51:45     ANE4993E (Session: 31845, Node: EXCHANGE-IS.
> HOUSING.UFL.EDU)  TDP MSExchg ACN3508 Data Protection for Exchange:
> incr restore of storage group Housing Storage Group 1 to server
> EXCHANGETEST failed, rc = 425.(SESSION: 31845)
>
>
> I don't think there's an option analogous to COMMTIMEOUT for the
> client side, am I missing something?  I'm gearing up to open a ticket,
> but I figured I'd send up a flare Just In Case.
>
>
> - Allen S. Rout
