ADSM-L

Re: ADSM, AIX, 3494, and 3590s

1997-04-11 18:06:08
Subject: Re: ADSM, AIX, 3494, and 3590s
From: "Pittson, Timothy ,HiServ/US" <tpittson AT HIMAIL.HCC DOT COM>
Date: Fri, 11 Apr 1997 18:06:08 -0400
Kent,
        Regarding #3, we were having some problems with the 3590 drives giving
us what we think were bogus error messages (FID1 E5 and some other error
codes).  Sometimes these would make a drive unavailable; other times the
drive would continue to work.  We upgraded the microcode on the tape
drives about 2 weeks ago and everything's been fine since.  The
microcode level is DOI7_29F.

Tim Pittson
tpittson AT himail.hcc DOT com

>----------
>From:  Kent L. Johnson[SMTP:johnsk6 AT RPI DOT EDU]
>Sent:  Friday, April 11, 1997 4:27 PM
>To:    ADSM-L AT VM.MARIST DOT EDU
>Subject:       ADSM, AIX, 3494, and 3590s
>
>I'm currently running ADSM v2.1.0.12 on an R40 running AIX 4.1.5,
>Ethernet-connected to a 3494 with 3 SCSI-connected 3590 tape drives.
>
>Here is what happened today.
>
>1) I have a 'backup db' command scheduled for 10:30 every morning.  Today,
>around 1:20 I noticed that the backup command was still waiting for a tape
>drive.  At this time, the db log was about 80% full too, since the db is
>locked while the backup db command is waiting for a tape.
>
>2) I cancelled the backup command and searched for problems in ADSM.  I
>bounced the lmcpd daemon, since sometimes this clears up problems, and ADSM
>is able to mount tapes again.  I also started an 'audit library' and tried to
>execute the 'backup db' command again.
>
>3) The backup process was still waiting to mount a tape.  Next I went to
>the 3494 and found that one of our 3590 drives was dead.  The 3590 LCD
>display was blank.  So, I made the drive unavailable at the 3494, in hopes
>that ADSM would detect this, and would be able to continue successfully.
>
>4) The 'backup db' process was still not able to mount a tape.  So, I tried a
>'delete drive' in ADSM on the offending tape drive.  I still didn't see any
>progress with the 'backup db' process.  So, I tried cancelling the 'audit
>library' and the 'backup db' processes.  The cancels for both processes
>then stayed pending.  I waited a while and then called software support.  I also
>entered a 'show mp' command, and noticed that there were two mount points
>shown, even though a 'query mount' showed only one tape mounted.
>
>5) While I was on hold with software support, I tried deleting the device
>file (/dev/rmt3) for the offending tape drive.  This seemed to allow the
>'cancel process' commands to complete.  After this, ADSM was able to
>successfully complete the 'backup db' command.
>
>So, here are some of my questions.
>
>1) Have other people seen 3590 hardware problems cause ADSM to lose the
>ability to interact with the 3494 and other functioning tape drives?  I've
>seen these types of problems a number of times in the past.
>
>2) Does anybody have any better way to deal with these types of problems?
>
>3) How reliable are other 3590s?  Most of our problems have been with one of
>our three 3590 drives.
>
>4) Have these types of problems been reported to ADSM development, so that
>they are looking into them?  The ADSM support person with whom I spoke today
>gave a vague indication that ADSM development is working on these types of
>problems.  Are there any PMRs which development is actively working on?
>
>These problems are serious: an undetected 'backup db' process that cannot
>complete will lock the database, the log will then fill, and ADSM will
>freeze up.  That can mean loss of backup data and possibly database
>corruption, followed by a 'restore db' or even an 'audit db'.  It gets ugly.
>
>I would love to hear from ADSM development that (1) these problems are known
>and solvable and that (2) we can expect to see these problems resolved in a
>not-too-distant ADSM release.
>
>- Kent
>
>
>--
>Kent Johnson                        Internet: johnsk6 AT rpi DOT edu
>Unix Systems Programmer (VCC 323)      Phone: (518) 276-8175
>Rensselaer Polytechnic Institute         Fax: (518) 276-2809
>
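
The failure chain described above (a 'backup db' that sits waiting for a
mount while the log fills) is the sort of condition a small watchdog could
flag early.  The following is only a sketch of the decision logic, not
anything shipped with ADSM: the function names and the thresholds (30
minutes, 70% log utilization) are hypothetical, and how the start time and
log utilization are obtained from the server is deliberately left out.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds; tune for your own site.
MAX_WAIT = timedelta(minutes=30)
LOG_LIMIT_PCT = 70.0


def backup_is_stalled(started, now, max_wait=MAX_WAIT):
    """Return True if the backup has been outstanding longer than max_wait."""
    return now - started > max_wait


def should_alert(started, now, log_pct_full,
                 max_wait=MAX_WAIT, log_limit=LOG_LIMIT_PCT):
    """Alert if the backup looks stalled or the log is at or past the limit."""
    return backup_is_stalled(started, now, max_wait) or log_pct_full >= log_limit


if __name__ == "__main__":
    # Times from the post: backup scheduled for 10:30, still waiting around
    # 1:20 in the afternoon with the log about 80% full.
    started = datetime(1997, 4, 11, 10, 30)
    noticed = datetime(1997, 4, 11, 13, 20)
    print(should_alert(started, noticed, 80.0))
```

Either condition alone triggers the alert, so a hung mount would be caught
even if the log is still nearly empty, and a rapidly filling log would be
caught even if the backup has only just started.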