ADSM-L

Re: AW: Database corruption??

1996-12-13 16:15:53
Subject: Re: AW: Database corruption??
From: Chester Martel <cmartel AT PEACH.FRUIT DOT COM>
Date: Fri, 13 Dec 1996 15:15:53 -0600
Sal Salak Juraj wrote:
>
> Paul,
>
> ~ QUESTION: Does anybody else run their ADSM server on an AIX box and have
> ~ their database on a non-IBM drive? If so have you seen any problems like
> ~ the above before, in particular with SEAGATE drives?
>
> There is no reason to use only IBM drives, as long as the drives are 100%
> functioning.
> The advice to use IBM drives may be more than corporate identity: the AIX
> team may have tested AIX with some set of hardware, and they surely have
> tested IBM drives, while Barracudas were maybe not tested. While most
> disks will work with AIX on IBM computers, there may be some disks, or
> some revisions of some disks, which will cause problems under certain
> circumstances.
>
> While I have no ADSM on AIX, we have some experience with different hard
> disks running on different computers under different OSes (HP-UX,
> SunOS, NT, OS/2).
> I run ADSM on OS/2 with a mix of Barracudas and Quantum Atlas drives.
>
> The important point is that the COMBINATION of hard disk, its firmware,
> cabling, SCSI controller, its firmware, and software driver has to be
> absolutely reliable.
> We have experienced problems a few times with certain combinations of the
> above. While the systems concerned mostly worked OK, there were some
> occasional errors when working for longer periods under heavy load. Most
> of these problems appeared in systems using very fast disks such as
> Barracudas, but the disks alone were not always to blame.
>
> What can you do?
>
>  - All flavours of Unix offer ways to watch for soft errors on your
> disks. If you have many soft errors on your disks, or any at all, you may
> want to tune your hardware. Check the cabling and use shorter SCSI cables
> (even if you are currently under the length limit). If the errors are
> related to a single drive only, replace that drive, and so on. Check that
> your Barracudas are cooled properly. An overheated Barracuda will produce
> occasional errors.
>
>  - We believe most of the occasional problems we ever had with fast disks
> arose from improper functioning of multiple command queueing. This is
> the ability of the SCSI controller to send several commands (hence command
> queue) to the hard disk, which can then execute them in an
> optimised order. We were apparently able to get rid of the problems by
> installing newer firmware on either the disk or the disk controller, or by
> updating the device driver (although I do not really understand the
> influence of the latter, it helped sometimes).
>
>  - If you already have the newest firmware and software, try to down-tune
> your disks.
>  Turn command queueing off on your disks. If that does not help,
>  decrease the maximum transfer speed between your controller and your
>  disk. If that does not help,
>  turn the cache off on your disks.
>
> While this will slow your system down, it will help you to
> determine what your problem is.
>
>  - Think about purchasing a RAID subsystem for your database.
>
> ~ * The database now occupies 2xs the amount of space it did previously!!
> Sorry, I have no idea what could cause this.
>
> Juraj SALAK, Keba Banking, Linz, Austria
>
>  ----------
> ~ From: P.A Walmsley
> ~ To: Multiple recipients of list ADSM-L
> ~ Subject: Database corruption??
> ~ Date: Thursday, 12 December 1996 11:57
> ~
> ~ I am running AIX4.1.4, ADSM server code 2.1.0.8 and various UNIX
> ~ clients.
> ~
> ~ I have been getting a number of page address mismatch errors on the
> ~ database from the ADSM server
> ~ e.g.
> ~
> ~    logical page 9208 (physical page 9464); actual 19191165991
> ~
> ~ resulting in the server terminating with an internal error BUF011.
> ~
> ~ I have a call currently open at the moment with IBM about it. Their
> ~ advice is to move the database to an IBM disc drive. It is currently on
> ~ a SEAGATE ST15230N. There are no indications in the error log of any
> ~ errors from the 3rd party disc. I have been talking with SEAGATE over
> ~ this and they confirmed that the firmware revision I have on the drive
> ~ is the latest for this particular drive.
> ~
> ~ QUESTION: Does anybody else run their ADSM server on an AIX box and have
> ~ their database on a non-IBM drive? If so have you seen any problems like
> ~ the above before, in particular with SEAGATE drives?
> ~
> ~
> ~ For your information I have done a dumpdb followed by a load/audit to
> ~ try and fix the problem. It gave some interesting 'features'
> ~
> ~ * To load my database which is almost 500MB in size I had to define a
> ~ space TWICE the size in order to get it to work.
> ~ * The database now occupies 2xs the amount of space it did previously!!
> ~ * The loaddb/audit eventually worked, did not use detailed=yes on audit,
> ~ but the number of  read errors reported now is actually worse!!
> ~
> ~ Thanks in advance,
> ~
> ~ Paul Walmsley
> ~
> ~
> ~
> ~
> ~
> ~
> ~ logical read

I run my server and client databases on an Amdahl/Clariion/Data General
RAID 5 device and I am not having any problems after all the drives
were replaced with Seagate drives.  There was a microcode deficiency
on the original drives manufactured by Quantum.
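
The quoted advice about watching for soft errors per disk could be
sketched roughly as below. This is only an illustration, not a tested
tool: it assumes the text of an error report summary (on AIX, something
like the listing produced by errpt) has already been saved, and that each
row has the columns shown in SAMPLE, with "T" marking a temporary error
and "H" the hardware class. The identifiers, timestamps, and disk names
in SAMPLE are made up, and the real column layout should be checked on
the system in question.

```python
from collections import Counter


def count_soft_errors(report_text):
    """Count temporary (T) hardware (H) entries per resource name.

    Assumes whitespace-separated columns in this order:
    IDENTIFIER TIMESTAMP T C RESOURCE_NAME DESCRIPTION
    (an assumed layout, verify it against the real report).
    """
    counts = Counter()
    for line in report_text.splitlines():
        fields = line.split(None, 5)
        # Skip blank or short lines and the header row.
        if len(fields) < 5 or fields[0] == "IDENTIFIER":
            continue
        err_type, err_class, resource = fields[2], fields[3], fields[4]
        if err_type == "T" and err_class == "H":
            counts[resource] += 1
    return counts


# Hypothetical sample data, invented for illustration only.
SAMPLE = """\
IDENTIFIER TIMESTAMP  T C RESOURCE_NAME  DESCRIPTION
AAAA0001   1212115796 T H hdisk1         DISK OPERATION ERROR
AAAA0001   1212120396 T H hdisk1         DISK OPERATION ERROR
BBBB0002   1212121096 P H hdisk2         DISK OPERATION ERROR
CCCC0003   1212121596 T S sysx           SOFTWARE EVENT
"""

if __name__ == "__main__":
    # List disks with the most temporary hardware errors first.
    for disk, n in count_soft_errors(SAMPLE).most_common():
        print(disk, n)
```

A disk that keeps showing up in such a count would be the first
candidate for the cabling, cooling, and firmware checks described in the
quoted message.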