Re: ADSM NT Disaster Restore Testing

Richard Sims wrote:
>
> >As you can see, you have to go through a lengthy volume-restore before
> >you can proceed with your client restore.  We didn't find that too
> >acceptable.  This problem affects both v2 and v3.
>
> What an IBMer apparently told you about restoring files from a destroyed
> primary tape pool disagrees with the Admin manual.  To quote from the v2
> manual, chapter 14, under "Recovery from a Disaster":
>
>     If a user tries to get a file that was stored on a destroyed volume,
>     the retrieval request goes to the copy storage pool. In this way,
>     clients can be recovered with no data movement at all.
>
> (The standard proviso applies, that the ADSM database be in sync with
>  the copy storage pool state.)
>
>         Richard Sims, BU

Richard...I think you misread my post.  I was talking about the case
where the destroyed volume holds some files that have no copies in any
copy storage pool ('query content <vol> copied=no' shows these files).
You will find that the client restore quickly fails (with bitfile
errors on the server).  Short of using the GUI to selectively remove
these files from the restore, there is no way to complete the restore
before you do the 'restore volume'.  The GUI route is a tedious process,
possibly requiring multiple iterations, and it works only if you know
the undocumented commands to figure out which client file a bitfile
refers to.

This can occur in a disaster recovery situation where new data for a
client is sent to the server between a 'backup stgpool' and a 'backup
db' in your daily processing.  When you restore that ADSM db up to the
end of 'RECOVERY.SCRIPT.DISASTER.RECOVERY.MODE' the clients whose data
arrived after that last, fateful 'backup stg' will be difficult to
restore.  After you run 'RECOVERY.SCRIPT.NORMAL.MODE' those files will
be forgotten about and you can recover those clients using 'blanket'
restores (restore where you simply ask for everything).

This can happen if a client administrator does a backup without
respecting your daily drm processing cycle (which should be: 1) client
backups, 2) backup stgpool, 3) backup db, 4) drm tape movement &
prepare).  Sometimes there are automated processes which do selective
backups based on thresholds (e.g. scripted Oracle archive-log management
using HSM with migrate requires backup).
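
To make the timing concrete, here is a minimal sketch in plain Python
(not ADSM code).  The function name and the three labels are made up
for illustration, and it assumes 'backup stgpool' runs before 'backup
db' as in the cycle above.  It sorts a client file into one of three
cases by when it reached the server; the middle case is the one that
causes the trouble I am describing.

```python
from datetime import datetime

def classify_client_file(arrived, stgpool_backup, db_backup):
    """Classify one client file by when it reached the server.

    All three arguments are datetimes.  Assumes stgpool_backup is not
    later than db_backup, as in the daily cycle described above.
    """
    if arrived <= stgpool_backup:
        # Copied to the copy storage pool and recorded in the db backup.
        return "restorable from copy pool"
    if arrived <= db_backup:
        # Recorded in the restored db, but no copy exists in a copy pool.
        return "known to db, no copy"
    # Arrived after the db backup: the restored db has no record of it.
    return "unknown to restored db"
```

For example, with 'backup stgpool' at 02:00 and 'backup db' at 03:00, a
file that arrived at 02:30 falls into "known to db, no copy".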

There are any number of valid reasons why the owner of a client would
like to back up outside your windows.  If you (ADSM administrator) could
say to them that the only effect was that, in a disaster recovery
situation, they would simply not be able to recover files that did not
make it before your db backup, then they would be relatively happy.

However, what you have to tell them is that their clients may be
difficult to restore until later (much later perhaps) into your DRM
recovery scenario, and will have to wait until you've recovered your
entire primary storage pool; at that point they recover everything
except those files.

What I would like is this: when a client restore asks for files on a
destroyed volume, some of which have no backup copies, it should restore
the files that do have copies from the copy stgpool volumes.  For each
file that has none, it should write an error message like "File XXX
could not be restored because volume YYY is destroyed and no copies of
that file exist in any copy stgpool" to the dsmc output and the server
activity log.  The current design is to fail and issue bitfile errors
into the activity log.
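
The behaviour I am asking for can be sketched roughly as follows.  This
is plain Python and purely illustrative: the function, its arguments and
the data layout are hypothetical and say nothing about how the server
is actually implemented.

```python
def restore_with_skips(requested, destroyed_volumes, copy_pool_files):
    """Restore what can be restored and report the rest.

    requested:         list of (file_name, primary_volume) pairs
    destroyed_volumes: set of primary volume names marked destroyed
    copy_pool_files:   set of file names that have a copy stgpool copy
    Returns (restored, errors); restored holds (file_name, source) pairs.
    """
    restored, errors = [], []
    for name, volume in requested:
        if volume not in destroyed_volumes:
            # Primary volume is intact, read the file from it.
            restored.append((name, "primary"))
        elif name in copy_pool_files:
            # Primary is destroyed but a copy exists, read the copy.
            restored.append((name, "copy"))
        else:
            # No copy anywhere: report this file and carry on.
            errors.append(
                f"File {name} could not be restored because volume "
                f"{volume} is destroyed and no copies of that file "
                f"exist in any copy stgpool"
            )
    return restored, errors
```

The point of the design is that one uncopied file costs one error
message instead of failing the whole restore.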

Sigh...

I believe we do have a design request to change this, correct Rejean
(Larivee)?

Sorry for going on so long on that...

Bruce

--
Bruce Elrick, PhD
ADSM & SP Certified
belrick AT home DOT com