ADSM-L

ADSM Version 2 disaster recovery

1995-06-23 11:55:50
Subject: ADSM Version 2 disaster recovery
From: Greg Tevis <gtevis AT VNET.IBM DOT COM>
Date: Fri, 23 Jun 1995 08:55:50 PDT
There was a request just made to see more details on the
new disaster recovery features of ADSM version 2.  I have
appended a part of a document on V2 functions which describes
these new features in some detail.  The document was written
in script and I just left the tags in for those who can use
script...I apologize to those who use other formatters (this
was the fastest way to share the info  :-)  ).  This document was
written as a V2 internal guide for ADSMers by Tim Mortimer from
IBM UK.  Tim did a good job with this writeup.  There will also
be a complete description of these new features in the V2 Admin
Guide (along with a number of recommendations).  Hope this helps.

Greg Tevis
ADSM Technical Support

 =======================================================================

:h2.ADSM Version 2 Server Functions
.*
:p.The new functions provided by ADSM Version 2 such as client
Hierarchical Storage Management affect the availability requirements
of the ADSM server. An existing ADSM Version 1 server is a repository
for backup and archive data. This type of data can be regarded as
being :q.inactive:eq., that is, copies of data that also exist on the
managed client workstations. In this type of environment the availability
of the ADSM server is important but not necessarily critical. If the
ADSM server is temporarily off-line for maintenance purposes the users
still have access to their local workstation data. At worst the users
might be delayed in recovering a backup copy of a file.
:p.With the introduction of new functions such as client HSM and
possibly other
applications using ADSM as a data store the requirements change.
An ADSM server now becomes a repository for :q.active:eq. data.
For example, files that have been migrated because of their age or
inactivity will no longer be on the client workstation filesystem. The
migrated copy will reside only on the ADSM server. In this environment
the availability of the ADSM server is absolutely critical. If the server
is unavailable for any reason, the end user might not be able to access
data that they need.
:p.Typically there are a number of reasons why an ADSM server, or the
data managed by it, might be unavailable&gml.
:ul c.
:li.The server is offline while the ADSM database is backed up
:li.The server is offline while the storage pools are backed up
:li.A hardware failure affecting either the ADSM database or
storage pools
:li.A media failure such as a corrupted tape cartridge
:eul.
ADSM Version 1 provided a number of functions to assist in the
management of the server and its resources&gml.
:ul.
:li.Database and Recovery Log mirroring
:p.The ADSM database and recovery log volumes can be mirrored to
protect against a device hardware failure. However this does not
protect against a logical corruption of the database.
:li.Server export and import functions
:p.The contents, either fully or partially, of the ADSM server can
be exported and taken offsite for disaster recovery purposes. However
this could be a long process on a large server and might affect users
accessing the server.
:li.Database dump and salvage utilities
:p.These utilities enable a damaged ADSM database to be dumped, restored
and rebuilt. However they are very disruptive to the server. The ADSM
server is offline while they are running, which can be a lengthy
process. These utilities can also only recover the undamaged portions
of the ADSM database. Typically this will result in some client
workstation data being lost. These utilities should be regarded as a
:q.last resort:eq. option if the ADSM database becomes corrupted and
where no current backups of the ADSM database exist.
:eul.
Apart from these ADSM utilities the most common method of ensuring
the server availability is to shut the server down and use some
other external backup product.
:p.The main focus of the ADSM Version 2 servers is to enhance the
internal availability functions. The objective is to ensure that the
server database can be backed up and recovered in a consistent manner.
The storage pools can be backed up and protected against hardware or
media failure. Both these tasks can be automated and cause little or no
disruption to the ADSM server. The end result is improved ADSM server
availability and increased offsite disaster recovery capability.
.*
:h3.Server Database Backup and Recovery
:p.ADSM Version 2 provides new functions to perform two types of
backup of the ADSM server database&gml.
:ul.
:li.Full backup
:p.A full backup performs a backup of the entire ADSM server database.
A full backup begins what is known as a :q.database backup series:eq..
A database backup series consists of a full backup and up to 32
incremental backups based on that full backup. After the maximum 32
incremental backups have been performed, a new backup series must be
started, commencing with a new full backup. The number of incremental
backups in a series is configurable. The only limitation is a maximum
number of 32.
:li.Incremental backup
:p.An incremental backup of the ADSM server database is a backup of the
database pages changed since the previous backup&semi. that is, since the
full backup if this is the first incremental backup in a series,
or since the previous incremental backup if part way through a backup series.
Incremental backups are optional. It is possible to just perform
repeated full database backups.
:eul.
A full database backup must be performed in the following
circumstances&gml.
:ul c.
:li.The server database has not previously been backed up
:li.The maximum number of incremental backups (32) has been reached
in a backup series
:li.The database has been extended or reduced in size
:li.The database has been recovered or loaded using the ADSM database
salvage utilities
:eul.
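:p.The four conditions above can be expressed as a simple check. The
following is only an illustrative sketch in Python, not ADSM code&semi. the
DbState record and its field names are hypothetical and merely stand in
for whatever bookkeeping the server keeps internally.

```python
from dataclasses import dataclass

MAX_INCREMENTALS = 32  # limit on incrementals per series, as stated in the text


@dataclass
class DbState:
    """Hypothetical bookkeeping record; the field names are illustrative only."""
    has_full_backup: bool         # False if the database was never backed up
    incrementals_in_series: int   # incrementals taken since the last full backup
    resized_since_backup: bool    # database extended or reduced in size
    salvaged_since_backup: bool   # database recovered or loaded by salvage utilities


def full_backup_required(state: DbState) -> bool:
    """Return True when any of the four listed conditions holds."""
    return (
        not state.has_full_backup
        or state.incrementals_in_series >= MAX_INCREMENTALS
        or state.resized_since_backup
        or state.salvaged_since_backup
    )
```

If none of the conditions holds, either a full or an incremental backup
is acceptable and the choice is left to the administrator.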
:p.There are a number of considerations when planning a strategy for
backing up the ADSM database. Full backups take longer than incrementals.
However recovery from a single full backup could be considerably faster.
A database recovery using a backup series consisting of a full backup
and a number of incrementals would require the full backup to be
restored first. Then the incremental backups would need to be
restored one at a time until the required recovery point is reached.
These considerations need to be balanced when planning the approach
to database backup.
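:p.To make the trade-off concrete, the following Python sketch models a
backup series and the order in which its backups would have to be
restored. It is an illustration under stated assumptions, not ADSM
code&semi. the BackupSeries class, its method names and the backup labels
are all hypothetical.

```python
class BackupSeries:
    """Toy model of a database backup series: one full plus N incrementals."""

    HARD_LIMIT = 32  # maximum incrementals per series, as stated in the text

    def __init__(self, full_label, max_incrementals=HARD_LIMIT):
        if not 0 <= max_incrementals <= self.HARD_LIMIT:
            raise ValueError("max_incrementals must be between 0 and 32")
        self.full_label = full_label
        self.max_incrementals = max_incrementals
        self.incrementals = []

    def add_incremental(self, label):
        """Append an incremental, or refuse once the series is full."""
        if len(self.incrementals) >= self.max_incrementals:
            raise RuntimeError("series full: start a new series with a full backup")
        self.incrementals.append(label)

    def restore_order(self, upto):
        """Backups to restore, in order, to reach incremental number `upto`.

        upto=0 means the full backup only. The full backup always comes
        first, then each incremental one at a time.
        """
        if not 0 <= upto <= len(self.incrementals):
            raise ValueError("no such backup in this series")
        return [self.full_label] + self.incrementals[:upto]
```

The length of the list returned by restore_order grows with every
incremental taken, which is the recovery-time cost that has to be
weighed against the shorter backup time of incrementals.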
:h4.ADSM Recovery Log
:p.With Version 2 the ADSM recovery log's function changes. In Version 1
it was used for two main purposes&gml. to provide a checkpoint mechanism for
database pages being updated, and to assist in transaction checkpointing
between ADSM clients and servers during backup or archive operations.
With Version 2 the recovery log now has two modes&gml. NORMAL and
ROLLFORWARD. When the log is set to NORMAL mode it performs the same
function as in Version 1. When set to ROLLFORWARD mode it is used to
perform forward recovery operations on the ADSM
database during a database
recovery. In this mode all updates to the database since the previous
full or incremental backup are recorded in the log. These entries can
then be used to :q.roll forward:eq. the database to the point of
failure.
:p.The recovery log mode will affect the size of the log. In NORMAL mode
its size will be similar to that of a comparable Version 1 server. In
ROLLFORWARD mode the size will increase. The increase in size will
depend on frequency of database backups and volume of database
transactions. When a
database backup is performed the database updates stored in the log
since the previous backup are deleted. This is because they are no
longer required for forward recovery as a new backup has been taken.
:h4.Initiating Database Backups
:p.An ADSM administrator can perform database backups in a number of
ways&gml.
:ul c.
:li.Scheduled backups
:li.Automatic backups
:li.Manual backups
:eul.
With Version 2 a new command, BACKUP DB, is used to perform either
a full or incremental backup of the server database. The choice of
full or incremental backup is a parameter of this command. By default
a full backup is performed. The BACKUP DB function can also be
initiated using the new Version 2 administrator GUI. The BACKUP DB
command can be scheduled using the new ADSM administrative command
scheduling function. An example of how this might be implemented is
that a full database backup is scheduled once a week, probably over a
weekend. Incremental database backups could then be scheduled daily
during the week at a time that doesn't interfere with the scheduled
client workstation backups.
:p.A second method of invoking an automatic database backup is to
define a :q.trigger:eq. that starts a database backup when a certain
condition occurs. For example, if the recovery log is in ROLLFORWARD
mode it will gradually fill up with database updates. If this is
allowed to continue without being reset by a database backup, then the
recovery log will fill up. This will result in the server stopping.
A new command, DEFINE DBBACKUPTRIGGER can be used to define the maximum
recovery log utilisation before a database backup is taken automatically.
When that utilisation is reached, the trigger automatically invokes a
database backup. This is a useful safety valve to ensure that the
recovery log does not fill up before the next scheduled backup.
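:p.The interaction between a ROLLFORWARD recovery log and a database
backup trigger can be sketched as follows. This is a simplified Python
model, not ADSM code&semi. the RecoveryLog class and its names are
hypothetical, and it assumes that a triggered backup completes instantly,
whereas a real backup takes time during which the log keeps filling.

```python
class RecoveryLog:
    """Toy model of a ROLLFORWARD recovery log with a backup trigger."""

    def __init__(self, capacity_mb, trigger_pct):
        self.capacity_mb = capacity_mb
        self.trigger_pct = trigger_pct  # utilisation that starts a backup
        self.used_mb = 0.0
        self.backups_taken = 0

    @property
    def utilisation_pct(self):
        return 100.0 * self.used_mb / self.capacity_mb

    def record_updates(self, size_mb):
        """Log database updates; run a backup once the trigger is reached."""
        if self.used_mb + size_mb > self.capacity_mb:
            # Without a timely backup the log fills and the server stops.
            raise RuntimeError("recovery log full: server would stop")
        self.used_mb += size_mb
        if self.utilisation_pct >= self.trigger_pct:
            self.backup_database()

    def backup_database(self):
        """A database backup makes the logged updates redundant, so drop them."""
        self.used_mb = 0.0
        self.backups_taken += 1
```

Setting the trigger well below 100 percent leaves headroom for the
updates that arrive while the triggered backup is still running.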
:h4.Database Recovery
:p.With ADSM Version 2 there are three types of recovery for the
database&gml.
:ul c.
:li.Point-in-time recovery
:li.Roll forward recovery
:li.Single database volume recovery
:eul.
A point in time recovery is used to restore the ADSM database to the
point in time when a specific backup was taken. It will restore the
database contents to their state at a particular backup within a backup
series. The DSMSERV RESTORE DB command is used to perform a point in time
recovery. This is a standalone ADSM command that requires the server to
be halted.
The database is restored in a consistent state so there is
no requirement to audit the database following the restore.
Any updates made to the database after the backup used for the restore
will be lost. For
that reason, following a point in time recovery the storage pool volumes,
both disk and tape, must be audited with the AUDIT VOLUME command. This
will resolve any inconsistencies between the restored database and
the storage pools.
Point in time recovery can be used when the recovery log
is in either NORMAL or ROLLFORWARD mode. However none of the recovery log
records created in ROLLFORWARD mode are used for this type of recovery.
Point in time recovery restores the entire logical database. It cannot
be used to restore an individual database volume. It is particularly
useful where there is a need to recover the entire ADSM database to
a point in time, either because of a logical problem or corruption, or
to meet a disaster recovery requirement.
:p.Roll forward recovery is used to recover the server database to
its most current state. To achieve this it uses the latest full
backup, any incremental backups subsequent to this full backup and
any recovery log records created since the last database backup.
A roll forward recovery requires that the recovery log is in ROLLFORWARD
mode. As with point in time, the DSMSERV RESTORE DB command is used to
perform a roll forward recovery. This recovery method can only be used
to recover the ADSM database to its most recent state. To achieve this
the ADSM recovery log MUST be available. If it is not available then
only a point in time recovery to the latest available backup is possible.
For this reason, mirroring of the recovery log is highly recommended if
roll forward recovery is desired. As roll forward recovery restores the
database to its latest state, no auditing of storage pool volumes is
required.
:p.The final method of restoring the ADSM database is to restore a
single database volume. Typically the ADSM database consists of multiple
volumes. Point in time and roll forward recovery enable an entire
database to be restored as the result of an error. However there
is the possibility, due to hardware or media failure, that only a single
database volume needs to be recovered. In that case the single volume
can be recovered without restoring the entire database. For
this type of recovery, again the recovery log must be in ROLLFORWARD
mode. A single volume can be recovered to its latest state, in the same
way as the entire database using the roll forward method discussed above.
This ability to recover a single volume is only available using the
roll forward method of recovery.
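:p.The rules governing which recovery types are possible can be
summarised in a short decision function. The Python sketch below is
illustrative only&semi. the LogMode enumeration, the function names and the
string labels are hypothetical and simply restate the conditions
described above.

```python
from enum import Enum


class LogMode(Enum):
    NORMAL = "NORMAL"
    ROLLFORWARD = "ROLLFORWARD"


def recovery_options(log_mode, log_available, have_backup):
    """Return the recovery types that the rules in the text allow."""
    if not have_backup:
        return []  # every recovery type starts from a database backup
    options = ["point-in-time"]  # possible in either log mode
    if log_mode is LogMode.ROLLFORWARD and log_available:
        # Both need the recovery log records written since the last backup.
        options += ["roll-forward", "single-volume"]
    return options


def volume_audit_needed(recovery_type):
    """Only a point in time recovery loses updates, so only it needs an audit."""
    return recovery_type == "point-in-time"
```

Note that losing the recovery log reduces the options to point in time
recovery only, which is why mirroring the log is recommended.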
:p.These new database backup and recovery capabilities enhance the
availability of the ADSM Version 2 servers. They will reduce the need
for planned system outages for the server and provide additional
capabilities in the area of offsite disaster recovery.
.*
:h3.Storage Pool Backup and Recovery
:p.The second major enhancement for ADSM Version 2 servers is in the
area of storage pool backup and recovery. A major customer requirement
for ADSM is the ability to duplicate client backup and archive data
in storage pools. This is required to
provide protection against media failures and for disaster recovery
purposes. ADSM Version 2 provides the capability to duplicate storage
pool data incrementally to a secondary local storage pool and to an
offsite storage pool if required. This capability provides protection
against local hardware or media failure and managed offsite disaster
recovery capabilities.
:p.The ADSM Version 2 storage pool availability feature has been
implemented with the following objectives&gml.
:ul c.
:li.Provide the capability to perform incremental backups of
ADSM storage pools to one or more :q.Copy Storage Pools:eq. without
interrupting normal ADSM operations
:li.Minimise backup time by performing incremental backups to copy
storage pools
:li.Provide the ability to recover either a single or multiple volumes
in the event of hardware or media failure
:li.Provide automatic switching to a copy storage volume if the primary
is unavailable
:li.Support the requirement to move copy storage volumes offsite for
disaster recovery purposes
:li.Integrate with the ADSM Version 2 Database backup feature
:eul.
The basis for this function is a new type of ADSM server storage pool,
the :q.copy storage pool:eq..
.*
:h4.Copy Storage Pools
:p.With ADSM Version 1 there was only one type of storage pool, a
primary storage pool. Primary storage pools are the repositories
where workstation client backup and archive data is stored.
A primary storage pool can contain either disk, optical or tape volumes.
Typically multiple primary storage pools are :q.chained:eq. together
to form a storage pool hierarchy. As storage pools fill up with client
data, aged data is migrated down to a lower level storage pool in
the storage hierarchy.
:p.ADSM Version 2 introduces a new type of storage pool, the copy
storage pool. Copy storage pools can only contain sequential media,
that is, tape volumes, and are used for storing duplicate copies of
files that reside in primary storage pools. Copy storage pools, and
the volumes they contain, are
defined by ADSM administrators in the same way as primary pools,
using the DEFINE STGPOOL and DEFINE VOLUME commands or the
administrator client.
An entire primary storage pool hierarchy can be backed up to one or
more copy storage pools.
.*
:h4.Backup Of Primary Storage Pools
:p.A primary storage pool can be backed up to a copy storage pool
using the new BACKUP STGPOOL command. This new command specifies
the name of the primary pool being backed up and the destination
copy pool&gml.
:xmp.

   BACKUP STGPOOL 'primarypool' 'copypool' maxprocess=4

:exmp.
This example backs up the primary pool to the copy pool and uses
a maximum of 4 backup processes. The BACKUP STGPOOL command
can be scheduled using the new administrator command scheduling function.
The first backup would be a full backup of the contents of the
primary pool. Subsequent backups would be incrementals of files that
have been added to the primary pool since the previous backup.
A primary pool can be backed up multiple times to different copy pools.
For example a primary pool might be backed up to a local copy pool to
provide local protection against hardware and media failures. The same
primary pool could also be backed up to an off-site copy pool to
provide disaster recovery capabilities.
An entire primary hierarchy of storage
pools containing disk and tape pools can be backed up to a single copy
storage pool. As client files migrate down the primary pool hierarchy
ADSM updates its database to reflect their location without the need
to recopy them from the new primary pool to the copy pool.
An ADSM server supports a
maximum of 250 storage pools, either as primary or copy pools.
A backup copy pool cannot be specified as a storage pool location in
a management class. Data can only be moved into a copy pool by using
the BACKUP STGPOOL command.
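:p.The incremental behaviour of storage pool backup, and the fact that
migration does not force a recopy, can be illustrated with a small model.
This Python sketch is not ADSM code&semi. the Server class, its methods and
the pool and file names are hypothetical, and the database is reduced to
two dictionaries.

```python
class Server:
    """Toy model of a server database tracking primary and copy pool contents."""

    def __init__(self):
        self.primary = {}  # file id -> name of the primary pool holding it
        self.copies = {}   # copy pool name -> set of file ids already copied

    def store(self, file_id, pool):
        self.primary[file_id] = pool

    def migrate(self, file_id, new_pool):
        # Migration only updates the recorded location; existing copies stay valid.
        self.primary[file_id] = new_pool

    def backup_stgpool(self, primary_pool, copy_pool):
        """Copy files in primary_pool that copy_pool does not yet hold."""
        copied = self.copies.setdefault(copy_pool, set())
        new_files = sorted(
            f for f, pool in self.primary.items()
            if pool == primary_pool and f not in copied
        )
        copied.update(new_files)
        return new_files
```

In this model the first backup of a pool returns every file in it, and
later backups return only files added since. A file that has migrated
to another primary pool is not copied again to a copy pool that already
holds it.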

.*
:h4.Restoring Data From Copy Pools
:p.There are a number of circumstances when it will be desirable to
restore data from a copy storage pool&gml.
:ul c.
:li.One or more client files in the primary storage pool are unavailable
due to a hardware or media failure
:li.A complete volume in the primary storage pool is unavailable
due to a hardware or media failure
:li.An entire storage pool is unavailable due to some major hardware
failure
:li.A disaster has occurred and the ADSM server needs to be recovered
at an offsite location
:eul.
:p.ADSM supports recovery in all these scenarios. If local copy storage
pools are used for backups then the duplicate client data contained
in them is automatically available to the ADSM server. When a
workstation client attempts to restore data it will be obtained from
the primary storage pool. If for some reason the data cannot be read
from the primary pool volume then the ADSM server will automatically
read the duplicate data from the copy storage pool volumes. This
is transparent to the end user at the workstation.
:p.This problem could be more serious, such as an entire primary
storage pool volume being damaged. In this scenario an ADSM administrator
can restore the complete primary volume using the new RESTORE VOLUME
command. This command will re-create the data that was on the damaged
volume. The data will be restored from the copy storage pool and placed
on other volumes in the primary storage pool. The original failing
volume will be marked as :q.DESTROYED:eq. and will not be reused.
If a large number of volumes or an entire primary storage pool needs to
be recovered then the new RESTORE STGPOOL command can be used. This
performs the same function as the RESTORE VOLUME command. It will restore
all volumes in a primary pool that have been marked as DESTROYED, again
re-creating the data from the copy storage pool.
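:p.The relationship between the two restore commands can be sketched as
follows. This Python model is illustrative only and makes no claim about
how ADSM implements them&semi. the PrimaryPool class, its methods and the
volume names are hypothetical, and file contents are reduced to sets of
file identifiers.

```python
class PrimaryPool:
    """Toy model of a primary pool whose damaged volumes are rebuilt from copies."""

    def __init__(self, volumes):
        self.volumes = {name: set(files) for name, files in volumes.items()}
        self.destroyed = set()  # volumes marked DESTROYED, never reused

    def mark_destroyed(self, name):
        self.destroyed.add(name)

    def restore_volume(self, damaged, copy_files, target):
        """Re-create the files of `damaged` on `target` from the copy pool.

        Returns the file ids that had no copy and so could not be restored.
        """
        if target == damaged or target in self.destroyed:
            raise ValueError("target must be a usable volume")
        self.destroyed.add(damaged)
        wanted = self.volumes.pop(damaged, set())
        restored = wanted & copy_files
        self.volumes.setdefault(target, set()).update(restored)
        return wanted - restored

    def restore_stgpool(self, copy_files, target):
        """Restore every volume marked DESTROYED that still awaits recovery."""
        missing = set()
        for name in sorted(self.destroyed):
            if name in self.volumes:
                missing |= self.restore_volume(name, copy_files, target)
        return missing
```

Any identifiers returned as missing correspond to files that were never
backed up to the copy pool, which is why regular storage pool backups
matter.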
:p.The above recovery scenarios assume that the copy storage pool volumes
are onsite and can be mounted on request. If a copy storage pool is
being used for offsite disaster recovery then it is managed differently.
The volumes created within it can be
marked as :q.OFFSITE VOLUMES:eq.. The ADSM server will not attempt
to mount these volumes to recover data in the manner described above.
These offsite volumes can be physically moved offsite until required
for a recovery.
:h4.Storage Pool and Database Backup Summary
:p.The new storage pool backup capabilities provide comprehensive
onsite and offsite backup capabilities. Used in conjunction with
the new ADSM database backup functions they provide a very high level
of availability for ADSM Version 2. Both functions can be automated
with the new administrator scheduling function. It is strongly
recommended that database and storage pool backups are coordinated
so that they are run together. Doing this will minimise any
inconsistencies between the database and the storage pools in the
event of having to restore them.
.*