ADSM-L

Re: Recovery of Exchange - was: backing up Exchange

1999-07-27 19:28:02
From: Chris Zaremba <zaremba AT US.IBM DOT COM>
Date: Tue, 27 Jul 1999 19:28:02 -0400
Trevor,

I'm sorry to hear that you are having such problems backing up your Exchange
servers, but I'm afraid you are incorrect to lay the blame on the Exchange agent.
Remember that the Exchange agent depends on the API provided by the Exchange
server for backup and recovery, and as such is bound by the limitations of that
API.  With that in mind, I offer the following observations:

>We have had an issue with our network infrastructure for some time and ADSM
>sessions regularly get disconnected. This is where the Exchange problem comes
>in. Unlike every other ADSM client type that we use, the Exchange agent does
>not automatically reconnect after a transient network failure. It just dies.
>And it requires manual intervention to restart the backup. Our Exchange
>databases are now so large (> 40 GB) that the chances of getting a full backup
>between communication failures is almost non-existent.

It is true that the Exchange agent does not offer an automatic retry capability,
but at the time the agent was shipped it was not possible to offer one because
of a bug in the MS Exchange API (see MS problem #  61654) that caused the
Exchange server to hang if we attempted to end a backup prematurely (so as to
restart for a retry).  This problem has since been fixed and we do plan to
provide an automatic retry capability in the future.  However, if your network
is indeed not stable enough to complete a full backup between communication
failures, then I don't think that a retry will help your situation.  If a comm
error occurs and we lose the session to the ADSM server, then the transaction
that was in progress is lost and we have to restart the transaction after
re-establishing a session with the ADSM server.  The entire Info Store is sent
as a single transaction, so a retry would have to restart at the beginning.  If
your network problems were only intermittent, then the automatic retry would be
helpful, but in your case it sounds like you really need to fix your network
problem in order to complete a full backup of an object as large as the Exchange
IS.
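
To put rough numbers on why a retry alone may not help, here is a minimal
sketch.  It assumes session drops arrive at random (exponentially distributed
gaps) and that every retry restarts the single Info Store transaction from
zero, as described above.  The function name, the 2 MB/s throughput and the
3-hour mean time between drops are illustrative assumptions, not measurements
from this thread.

```python
import math


def full_backup_odds(size_gb, rate_mb_per_s, mean_hours_between_drops):
    """Return (hours per attempt, chance an attempt finishes, expected attempts)."""
    # Time one uninterrupted attempt needs to move the whole object.
    hours_needed = (size_gb * 1024) / rate_mb_per_s / 3600
    # Assumption: drops are random, so the chance of seeing no drop for
    # hours_needed is exp(-hours_needed / mean_hours_between_drops).
    p_success = math.exp(-hours_needed / mean_hours_between_drops)
    # Every failed attempt starts over, so attempts are independent and the
    # expected number of attempts is 1 / p_success.
    return hours_needed, p_success, 1.0 / p_success


# Hypothetical figures: 40 GB store, 2 MB/s, one drop every 3 hours on average.
hours, p, attempts = full_backup_odds(40, 2.0, 3.0)
print(f"{hours:.1f} h per attempt, {p:.1%} chance per attempt, "
      f"about {attempts:.1f} attempts expected")
```

Under this model the expected number of attempts grows exponentially with the
ratio of backup time to time between drops, which is why fixing the network (or
shortening the transfer) matters more than adding retries.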


>We have also uncovered another problem where, once an incremental backup fails,
>we cannot run another incremental backup until a full backup is performed.

This is a limitation of the Exchange API.  Exchange requires that a full backup
be done whenever an incremental backup fails.
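
A scheduling script can take this rule into account when it decides what to run
next.  The sketch below is only an illustration: the function name and the
(kind, succeeded) record are hypothetical, and treating a failed full backup as
"run a full again" is an added assumption rather than something stated above.

```python
def next_backup_type(last_result, full_due):
    """Choose the next backup type from the outcome of the previous run.

    last_result: (kind, succeeded) for the previous run, or None if no
                 backup has been taken yet (hypothetical record format).
    full_due:    True when the regular schedule calls for a full anyway.
    """
    if last_result is None or full_due:
        return "full"
    kind, succeeded = last_result
    if not succeeded:
        # A failed incremental must be followed by a full backup.
        # Assumption: a failed full is simply retried as a full.
        return "full"
    return "incremental"


print(next_backup_type(("incremental", False), full_due=False))
print(next_backup_type(("incremental", True), full_due=False))
```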

>And one last problem, because our full Exchange backups are so large, we direct
>most of them direct to tape (by setting a 15 GB maximum file size on the disk
>storage pool). We have uncovered a problem where if all tape drives are in use,
>including one by an Exchange backup, the backups will be pre-empted by
>housekeeping tasks such as space reclamation.

This too is something that is not controlled by the Exchange agent.  Priorities
for server processes are determined by the ADSM server.  Have you considered
scheduling admin commands to reset the reclamation parameters to prevent
reclamation from kicking in when Exchange backups are being done?
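
As a rough sketch of that suggestion, the helper below builds a pair of
administrative schedules: one raises the tape pool's reclamation threshold to
100 (which stops reclamation from starting) just before the Exchange backup
window, and one restores the normal threshold afterwards.  The pool name,
schedule names, times and the threshold of 60 are all hypothetical, and the
command syntax is written from memory of the ADSM administrative commands, so
check it against the Administrator's Reference for your server level before
using it.

```python
def reclamation_window_commands(pool, window_start, window_end, normal_threshold):
    """Return admin commands that suspend reclamation during a backup window."""

    def admin_schedule(name, command, start):
        # Assumed syntax for a daily administrative schedule.
        return (f"define schedule {name} type=administrative "
                f'cmd="{command}" active=yes starttime={start} '
                f"period=1 perunits=days")

    # RECLAIM=100 means a volume is never empty enough to qualify.
    suspend = f"update stgpool {pool} reclaim=100"
    resume = f"update stgpool {pool} reclaim={normal_threshold}"
    return [
        admin_schedule("reclaim_off", suspend, window_start),
        admin_schedule("reclaim_on", resume, window_end),
    ]


# Hypothetical pool name and window; paste the output into an admin session.
for line in reclamation_window_commands("tapepool", "20:00", "06:00", 60):
    print(line)
```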

>This has been, and continues to be, extremely frustrating (and business critical)
>to us. And IBM to date have been far less responsive than they need to be.
>Despite the fact that we have been so long with backups, IBM have yet to
>provide any acceptable workaround.

Again, I'm sorry that you have been so frustrated by this problem and I hope
that this info will help you to understand that the real solution is to correct
your network problems.


Chris Zaremba
ADSM Agent Development
zaremba AT us.ibm DOT com