ADSM-L

Re: Tape drive sharing in a 3494

1998-02-27 16:19:41
From: Tom Bell <tom.bell AT WAII DOT COM>
Date: Fri, 27 Feb 1998 15:19:41 -0600
> The sharing you want to do is ONLY supported with the single-host
> configuration, i.e., two ADSM instances on one box as your two servers.  The
> SCSI standard for dual-porting the drives to two different hosts is not yet
> supported in the drive microcode.  That is why it is very important that one
> drive NOT be available to two systems at the same time.

We have drives dual-ported to different systems, but our procedure for
switching the active host specifies that the on-line port is taken
off-line BEFORE the off-line port is placed on-line.  This is used
primarily for giving drives to test systems or to production systems
that only need access to 3590s periodically.  IF you give two hosts
simultaneous access to a drive, your recovery may involve the dreaded
word "reboot".

> On the same host, however, it's a different story.   ADSM does tolerate two
> applications, e.g., two ADSM servers, sharing the 3494 drives, but "tolerate" is
> a better word than "support", IMHO.  The main problem is controlling the
> contention for the set of drives.   When server A finishes with the first tape
> in a series, and dismounts it, server B is free to grab the drive.   When
> server A proceeds to mount the second tape, it has to now wait for server B to
> finish.   When it does, A grabs it and now B has to wait.   Sort of like a
> dove-tailed joint for you wood-workers.

We have ADSM and non-ADSM applications sharing the 3590s in our 3494.
The trick to making them "tolerate" each other was for our non-ADSM
applications to use techniques similar to what ADSM uses for acquiring
and releasing drives.  Primarily this involves keeping the drive open
during mounts and dismounts to keep another application from thinking
the drive is available.  One of the "windows" we found in our logic was
that we were using a simple unload (mt -f /dev/rmtx rewoffl) to release a
drive.  While the volume was being moved from the drive to its home cell,
ADSM could open the drive successfully and would then fail a mount
request, because logically the drive was still "involved" in the dismount
process.  This caused intermittent transaction failures on ADSM.  Now we
run a program that opens the drive, issues the dismount, queries the
volume status until it indicates the volume is back in its home cell,
then closes the drive.
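
For anyone wanting to try the same approach, here is a rough Python
sketch of the open / dismount / poll / close sequence described above.
It is not our actual program.  The dismount and volume_is_home arguments
are hypothetical callables that stand in for whatever library-manager
interface your site uses, and the poll interval and timeout values are
arbitrary.  It also assumes the tape driver lets only one process hold
the device open at a time, which is what makes the drive look busy to
everyone else.

```python
import os
import time


def release_drive(device_path, dismount, volume_is_home,
                  poll_interval=5.0, timeout=600.0):
    """Dismount a volume while holding the drive open until it is home.

    dismount and volume_is_home are hypothetical callables supplied by
    the caller; they are placeholders, not a real library API.
    """
    # Keep the device open for the whole dismount so that no other
    # application sees the drive as free while the tape is in transit.
    fd = os.open(device_path, os.O_RDONLY)
    try:
        dismount()
        deadline = time.monotonic() + timeout
        # Poll until the library reports the volume in its home cell.
        while not volume_is_home():
            if time.monotonic() >= deadline:
                raise TimeoutError("volume did not return to its home cell")
            time.sleep(poll_interval)
    finally:
        # The drive is released only after the poll loop ends
        # (or after an error, so the descriptor is never leaked).
        os.close(fd)
```

The point of the try/finally is that the close is the very last step, so
the window between "tape unloaded" and "tape back in its cell" stays
covered by our open.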

> You'll have to make any adjustments based on your schedules, etc, but here is
> what I would recommend:
>
> 1) set the mountwait limit rather high-  This is how long server A will wait
> for server B to finish before A will give up.  The default 60 minutes may be
> insufficient, but keep in mind that the user on the client-end will be waiting
> for this duration.

We have our mountwait limit set to 30 minutes.  Our non-ADSM
applications typically use the drives for a few minutes at a time, then
dismount, but we have an administrator procedure for cancelling the
non-ADSM applications for emergency restores.

> 2) set the mountretention low-  the time server A will be waiting for B to
> finish includes B's idlewait time.  The feature that causes idle mounts to be
> dismounted to make room for new mounts does NOT span servers.

Our mountretention is currently set to 10 minutes.  We've had it set as
low as 1 minute.  Since ADSM is our priority user of the 3494, we wanted
to make sure that ADSM wasn't unnecessarily dismounting and mounting
volumes.  We've also gone from direct-to-tape backups to using a 16 GB
disk pool, so we've dramatically reduced the number of mounts ADSM makes
during a day, with the added benefit of reducing our overall production
backup time because of eliminating mount waits for the clients.
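
To put numbers on the interaction between mountwait and mountretention
in point 2, here is a small worked example.  It assumes the two-server
case from the quoted text, with both servers using the values mentioned
in this post, and it assumes the worst case where the other server's
idle volume stays mounted for its full retention period.  The function
name is made up for illustration.

```python
def max_remaining_work(mountwait_min, other_mountretention_min):
    """Longest the other server can keep using the drive (in minutes)
    before our mount request times out, assuming it then holds the
    idle volume for its full mountretention before dismounting."""
    return max(0, mountwait_min - other_mountretention_min)


print(max_remaining_work(30, 10))  # 20
print(max_remaining_work(30, 1))   # 29
```

So with a 30 minute mountwait and a 10 minute mountretention, the other
server has at most 20 minutes of real work left before the waiting
mount gives up.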

> 3) don't share all 8 drives, but set the mountlimit to something like 4 to 6
> (and not the new default "DRIVES")-  This is because A does not have to wait
> for the same drive that B has grabbed, any open drive will do.  By not sharing
> all 8 drives, you'll lower the contention by keeping  a few mount points free
> to service A when it needs the second tape.

We only have two drives, so we have to share both.  If we had more, I
would try to dedicate one to ADSM, one to the non-ADSM applications
(which can fight among themselves for access), and share the rest.

> I've been making these recommendations for some time, but have not heard much
> from customers who are doing it and how they find sharing to work.   I would
> like to see posts here as to what results they have :-)

We've been sharing drives for over two years.  Initially there were
some problems, but it has worked well enough for long enough that I
often forget that this is "out of the ordinary".  I wish I could feel
that matter-of-fact about hardware reliability on the 3494.

--
Tom Bell                                    tom.bell AT wg.waii DOT com
Western Geophysical                         office:     (713) 963-2203
10001 Richmond, Room 2679                   pager:      (713) 415-0419
Houston, TX  77042