Dean Roy <DEAN AT ALPHA.UWINDSOR DOT CA> wrote:
>I am currently looking at ADSM/6000 as a possible storage management
>solution for our shop.
>
>I am looking for information on how ADSM works with file systems that
>are mounted via NFS. My situation is this:
>
> I have users who use one machine and think the files are local but
>in reality they are NFS mounted from a central NFS server. Users are
>not allowed to logon to the central NFS server. My concern is how ADSM
>handles this situation. Which machine would backups be run from - the
>NFS server or client? Does ADSM allow a user on the client machine to
>restore a file to the NFS server?
We are new ADSM users and encounter this as a problem, although a solvable -
or at least "workable-around" - one. We have got the impression that ADSM
was first and foremost designed for a collection of *individual* PCs and
workstations and that the support of clusters is still somewhat
half-hearted. In discussions with IBM we felt that clusters as they exist in
universities (we, too, are a university computer centre) are slowly getting
into focus in ADSM development but I leave it up to them to display their
plans.
For these discussions, I have prepared a paper that identifies the typical
problems one might encounter in environments like yours and ours. The
purpose of this paper explains its critical undertone. I append my paper to
this letter.
Hope this helps. Best regards,
Helmut Richter
============================================================================
Dr. Helmut Richter
Leibniz-Rechenzentrum X.400: S=Richter;OU=lrz;P=lrz-muenchen;A=d400;C=de
Barer Str. 21 RFC822: Helmut.Richter AT lrz-muenchen DOT de
D-80333 Muenchen Tel.: ++49-89-2105-8785
Germany Fax: ++49-89-2809460
============================================================================
Usability of ADSM In a Multi-User Cluster Environment
=====================================================
In various requests, the need for a better support of workstation
clusters serving many users (in contrast to many independent
workstations each serving a single user) was pointed out. A large
number of single requests may be helpful for prioritizing necessary
updates to the product; however, the overall picture of what is needed
may get lost. In this paper, I try to summarize the most important
requirements irrespective of how or when they should be implemented.
Contents:
1. Requirements for multi-user sites
2. Requirements for clusters
3. Additional requirements for AFS or DFS clusters
---
1. Requirements For Multi-User Sites
------------------------------------
In this section, the problems with using ADSM on workstations with
many users are summarized. Additional problems arising only in
clusters interconnected via NFS, AFS, or DFS are postponed to later
sections.
To understand the requirements, it is important to distinguish between
different classes of ADSM users, each having different requirements:
a) Individual ADSM users
An "individual" ADSM user is one who constitutes a "node" of his own
with a password known to this user only. Such a node typically refers
to the data on the user's PC or workstation. Individual users have the
requirement to back up or archive data and to retrieve data which was
backed up or archived by themselves or by any other user granting them
explicit access rights under ADSM.
b) Workstation administrators (root users)
As far as system files are concerned, a workstation administrator has
the same requirements as an individual user of ADSM. In addition to
that, he also has requirements for operations on data that belong to
users other than himself: he must be able to make regular backups of
the entire user space, reload files that have been destroyed (e.g. by
disk hardware failure) and reload single files on demand of users. The
backups produced by the administrator should be reloadable by the user
who owns the data but not by any other user.
c) Users who do not administer the workstations they are using
These are users who have no individual passwords for ADSM; thus,
either their Unix or their Kerberos validation must be used to
identify them to ADSM ("generate" password mechanism).
Typically, these are the average users of the workstation who are not
concerned with its administration. These users have only a restricted
set of requirements: they must be able to archive data and retrieve
them and to reload data that have been backed up by the workstation
administrator. The ability to trigger backups themselves should be an
option at the discretion of the administrator; in case this option is
allowed, user-initiated backups should merge with regular backups
(i.e. they should behave as if the administrator had made a backup, in
particular inhibit another selective backup until the next
modification of data).
d) ADSM administrators
Whoever administers ADSM (the entire server or subsets thereof) needs
the ability to control access to ADSM, to define policies, and to
monitor and control the amount of resources for each user. In
addition to the facilities currently supported by ADSM, there is a
requirement to provide space management (data migration).
There are no problems involved with individual ADSM users (class (a)
above); in fact, ADSM seems to be designed first and foremost for
them. However, in multi-user support, i.e. in an environment where
most users belong to class (c) above, ADSM has the following
shortcomings:
(1) Backups by the workstation administrator are on a directory
hierarchy level which is not suitable for the end user. Yet, in the
graphical user interface for restore, this level, i.e. a point above
the user's home directory, is offered for the search, thus forcing the
user to browse through the directories of all peer users in order to
select his own. This is not only extremely user-unfriendly as an interface
but very time-consuming, too. For both reasons, use of the command
line interface, clumsy as it is, is still easier. A sensible
implementation would offer both the user's home directory and the
user's current working directory as starting points for the search
through the database.
(2) { Item deleted. }
(3) The multi-user case means that many users share one ADSM node. But
then, no statistics whatsoever about the resources taken up by each of
the users belonging to that node are offered to administrators, let
alone any aid for pro-active management of resources, e.g. with quota.
(4) The protection model is described in the manuals in a slipshod
manner. For example, the fact that a root user is defined as a user
who knows the node's password (and not at all a user who is "root" on
the system) must be found out in tedious experiments, and failure to
do so may result in severe security flaws.
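The per-user statistics asked for in item (3) could be as simple as the
following Python sketch. It assumes that an inventory of the shared
node's stored files is available as (owner, path, size) records; this
record layout, the function names, and the sample data are hypothetical
and do not correspond to any ADSM interface.

```python
from collections import defaultdict

def usage_by_owner(records):
    """Sum file count and bytes per owner.

    `records` is an iterable of (owner, path, size_in_bytes) tuples.
    This layout is an assumption made for illustration only.
    """
    totals = defaultdict(lambda: {"files": 0, "bytes": 0})
    for owner, _path, size in records:
        totals[owner]["files"] += 1
        totals[owner]["bytes"] += size
    return dict(totals)

def over_quota(totals, quota_bytes):
    """Return the owners whose stored bytes exceed the quota, sorted by name."""
    return sorted(o for o, t in totals.items() if t["bytes"] > quota_bytes)

# Hypothetical sample inventory of one shared node.
records = [
    ("alice", "/home/alice/thesis.tex", 120_000),
    ("bob", "/home/bob/data.bin", 5_000_000),
    ("alice", "/home/alice/fig1.ps", 900_000),
]
totals = usage_by_owner(records)
print(totals)
print(over_quota(totals, 1_024_000))
```

A report of this kind is the minimum an administrator would need before
any quota could be enforced pro-actively.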
2. Requirements For Clusters
----------------------------
Today's client/server configurations typically consist of a number of
client workstations sharing data served to them by a file server via
NFS, AFS or DFS. The file server in turn often consists of more than
one physical machine, especially in a DCE environment. In what follows,
we call such a configuration a "cluster" but it should be kept in mind
that clusters are not separated from each other: the sets of
workstations where a given collection of data is accessible will
overlap. This happens already with NFS clusters but much more so (and
world-wide) in a DCE environment.
ADSM, however, is not designed to cope with a situation in which many
workstations share the same data:
(5) The notion of "node" in ADSM typically refers to one workstation.
In a clustered environment, however, a "node" should refer to a
collection of files spanning more than one workstation but excluding
the local filesystems of each of the affected workstations. Such nodes
are not supported under ADSM.
An obvious work-around is to use the same node name on all
workstations belonging to one cluster; otherwise, files backed up or
archived on one workstation in the cluster cannot be restored or
retrieved on another workstation in the same cluster (in larger
clusters, one would typically even have a server to which the ordinary
user has no access). As node names are the unit of security in ADSM,
using the same node name and password on many workstations is unsafe
from a security standpoint. Also, overlapping regions where files are
accessible are not properly handled by this work-around.
The combination of scheduled backups that partially use the default
node (i.e. the node common across all workstations in the cluster)
and the specific node for the private files of the workstation is
tedious to set up properly because of skimpy documentation.
(6) In a clustered environment, files usually do not physically reside
on the same host where the ADSM client is invoked. This renders the
transfer of files between ADSM server and client extremely
inefficient: If a user on Workstation "NC" archives a file to the ADSM
server "AS", the file is transferred from the NFS server "NS" to "NC"
(so that the ADSM client is able to access the data) and from there to
"AS". In many network topologies, the line between "NC" and "NS" is
the only connection between "NC" and the rest of the world, so that in
this case the transfer goes like this:
NS --> NC --> NS --> AS
Upon retrieval of the file, the same detour is taken again.
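The cost of this detour can be made explicit with a small Python
sketch. It only counts link crossings for the host names used above;
the function names are illustrative, and every hop is assumed to carry
the whole file once.

```python
def link_traversals(route):
    """Count how often each undirected link is crossed along a route."""
    counts = {}
    for src, dst in zip(route, route[1:]):
        link = tuple(sorted((src, dst)))
        counts[link] = counts.get(link, 0) + 1
    return counts

def bytes_on_wire(route, file_size):
    """Total bytes sent, assuming each hop carries the whole file once."""
    return (len(route) - 1) * file_size

detour = ["NS", "NC", "NS", "AS"]   # path taken today
direct = ["NS", "AS"]               # path if the file server talked to ADSM

print(link_traversals(detour))
print(bytes_on_wire(detour, 100), bytes_on_wire(direct, 100))
```

Under these assumptions the link between "NC" and "NS" is crossed twice
and the total traffic is three times that of the direct route.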
Note that items (5) and (6) are closely related:
If ADSM supported distributed file systems, the file
to be archived would be identified as belonging to the cluster and not
to the individual workstation. As a result, the ADSM node name of the
cluster would be selected (solving item (5)) and transfer would be
restricted to take place between "NS" and "AS" (solving item (6)).
In other words, ADSM should be able to take the possibility of
distributed file systems into account. This is independent of whether
NFS, AFS, or DFS is the underlying distributed file system. However,
the data about the file system that must be kept within ADSM may be
different for the three cases.
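As an illustration of what "taking distributed file systems into
account" might mean for node selection, the following Python sketch
maps a file path to a node name by looking up the mount point under
which the file lives. The mount table, the node names, and the function
are all hypothetical; ADSM offers no such mechanism, which is the point
of item (5).

```python
import posixpath

# Hypothetical table: mount point of a shared file system -> cluster node.
CLUSTER_MOUNTS = {
    "/home": "CLUSTER_A",
    "/afs/example.edu": "CELL_EXAMPLE",
}

def node_for(path, local_node, mounts=CLUSTER_MOUNTS):
    """Return the cluster node for files on a shared file system,
    and the workstation's own node for everything else.

    The longest matching mount point wins, so nested mounts resolve
    to the innermost file system.
    """
    path = posixpath.normpath(path)
    best = None
    for mount in mounts:
        prefix = mount.rstrip("/") + "/"
        if path == mount or path.startswith(prefix):
            if best is None or len(mount) > len(best):
                best = mount
    return mounts[best] if best is not None else local_node

print(node_for("/home/alice/thesis.tex", "WS17"))
print(node_for("/tmp/scratch", "WS17"))
```

With such a lookup, a file under a shared mount would be filed under the
cluster's node regardless of the workstation on which the client was
invoked, while local files would stay with the workstation's own node.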
3. Additional Requirements For AFS or DFS Clusters
--------------------------------------------------
If, in a cluster, the distributed file system is AFS or DFS, the above
problem areas may require other solutions. Also, the use of these file
systems may raise additional requirements. None of the requirements
listed below is currently fulfilled by ADSM.
(7) As pointed out in (4) above, ownership and access rights under
ADSM are not defined precisely enough. As far as Unix file ownership
is used to this end, this becomes very questionable in a DCE
environment because there Unix file ownership is irrelevant.
Instead of creating new access control rules for ADSM to reflect this
scenario, the only reasonable solution to these problems is to respect
the access control information of AFS or DFS under ADSM as well. The
backup of an AFS or DFS file must be readable for a user if and only
if its original was - everything else creates security holes because
users will be unable to understand how the incompatible access controls
of DCE and ADSM interact. This must entail an option for the user to
change the access control list of a file copy that is currently in the
realm of ADSM.
(8) The preceding item is hardly conceivable without using Kerberos
(and the same Kerberos as used by AFS or DFS) as the authentication to
ADSM. Sharing passwords among many users, or keeping a separate
password for each user, is not a viable security strategy. At present, if
an intruder manages to get another user's UID but not a valid Kerberos
ticket, he cannot read the compromised user's files but he can read
their ADSM backups. The additional security of Kerberos is thus
undermined.
(9) As pointed out in (5) above, the entity an ADSM node refers to
should be a collection of possibly distributed files and not a single
workstation. In the AFS and DFS case, these collections have names,
the AFS or DCE cells. Inasmuch as ADSM node names serve to distinguish
different files with equal names, this function should be taken over
by cell names.
(10) For AFS or DFS files, the transfer of data directly to and from
the physical location of the file (see (6) above) is facilitated by
the fact that these file systems provide an interface to determine
this location. Note that the physical location of an AFS volume or DFS
data set may change at any time without giving notice to any
application, including ADSM.
(11) Recovery from disk failure requires more information than just
the files and their access control lists. At present, this can only be
achieved by backing up all data twice: once on a file basis to allow
for individual restoration and a second time on a volume basis to
allow for disaster recovery. The second such backup is only available
as an unsupported additional feature from IBM.
The correct solution is probably to separate volume data from file
data in the backups so that volumes may be backed up and restored
without their data. For disaster recovery, one would first restore the
affected volumes and then fill them up with missing files.
(12) Space management requires modifications to DCE's layout of file
data: meta-data must be kept separately from file data in order to
allow relocation of the latter without affecting the former. This
should typically include the option to have data stored at more than
one physical location. Both these features have been implemented by
the Pittsburgh Supercomputing Center as "Multi-Resident AFS". A similar
feature in DFS is most urgently needed to allow space management.
Devising clean interfaces for implementing (11) and (12) is probably
the job of DFS development more than ADSM development. All the same,
IBM should start integrating these two efforts lest their ADSM product
become unusable as customers migrate to DCE.
-- end of text --