Paul,
~ QUESTION: Does anybody else run their ADSM server on an AIX box and have
~ their database on a non-IBM drive? If so have you seen any problems like
~ the above before, in particular with SEAGATE drives?
There is no reason to use IBM drives only, as long as the drives are 100%
functioning.
The advice to use IBM drives may be more than corporate identity: the AIX team
may have tested AIX with a certain set of hardware, and they have surely tested
IBM drives, while Barracudas may not have been tested. While most disks will
work with AIX on IBM computers, there may be some disks, or some revisions of
some disks, which will cause problems under certain circumstances.
While I have no ADSM on AIX, we have some experience with different hard disks
running on different computers under different OSes (HP-UX, SunOS, NT, OS/2).
I run ADSM on OS/2 with a mix of Barracudas and Quantum Atlases.
The important point is that the COMBINATION of hard disk, its firmware, cabling,
SCSI controller, its firmware, and software driver has to be absolutely
reliable.
Here we have had problems a few times with certain combinations of the above.
While the systems concerned mostly worked OK, there were occasional errors
when they ran for longer periods under heavy load. Most of these problems
appeared in systems using very fast disks, such as Barracudas, but the disks
alone were not always to blame.
What can you do?
- AIX has facilities for watching soft errors on your disks. If you
have many soft errors on your disks, or any at all, you may want to tune
your hardware. Check the cabling, use shorter SCSI cables (even if you are
currently under the length limit), and if the errors are related to a single
drive only, replace that drive. Check that your Barracudas are cooled properly.
An overheated Barracuda will produce occasional errors.
- We believe most of the occasional problems we ever had with fast disks arose
from improperly functioning multiple command queueing. This is the ability of
the SCSI controller to send several commands (hence a command queue) to the hard
disk, so that the hard disk can then execute them in an optimised order. We were
apparently able to get rid of the problems by installing newer firmware on
either the disk or the disk controller, or by updating the device driver
(although I do not really understand the influence of the latter, it helped
sometimes).
- If you already have the newest firmware and software, try to down-tune your
disks. Switch off command queueing on your disks. If that does not help,
decrease the maximum transfer speed between your controller and your disk. If
that does not help either, switch off the cache on your disks.
While this will slow your system down noticeably, it will help you to
determine what your problem is.
- Think about purchasing a RAID subsystem for your database.
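
To illustrate the command queueing point above, here is a minimal sketch in
Python. It is only an illustration under my own assumptions: the block numbers
are made up, the "sweep up, then back down" policy is just one simple way to
reorder a queue, and real drive firmware also takes rotational position and
other factors into account. It compares how far the head travels when the
queued requests are served in arrival order versus a reordered sequence.

```python
def head_travel(start, blocks):
    """Total distance the head moves when visiting blocks in the given order."""
    travel, pos = 0, start
    for b in blocks:
        travel += abs(b - pos)
        pos = b
    return travel


def reorder_queue(start, blocks):
    """Assumed policy: sweep upward from the current position, then back down."""
    ahead = sorted(b for b in blocks if b >= start)
    behind = sorted((b for b in blocks if b < start), reverse=True)
    return ahead + behind


# Hypothetical queue of requested block addresses; head currently at block 500.
queue = [900, 100, 850, 150, 800]
fifo = head_travel(500, queue)
optimised = head_travel(500, reorder_queue(500, queue))
print(fifo, optimised)  # prints "3300 1200"
```

With queueing switched off the disk sees one command at a time and has nothing
to reorder, which is why that setting costs performance but removes one
possible source of trouble.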
~ * The database now occupies 2xs the amount of space it did previously!!
Sorry, I have no idea what could have caused this.
Juraj SALAK, Keba Banking, Linz, Austria
----------
~ From: P.A Walmsley
~ To: Multiple recipients of list ADSM-L
~ Subject: Database corruption??
~ Date: Thursday, 12 December 1996 11:57
~
~ I am running AIX4.1.4, ADSM server code 2.1.0.8 and various UNIX
~ clients.
~
~ I have been getting a number of page address mismatch errors on the
~ database from the ADSM server
~ e.g.
~
~ logical page 9208 (physical page 9464); actual 19191165991
~
~ resulting in the server terminating with an internal error BUF011.
~
~ I have a call currently open at the moment with IBM about it. Their
~ advice is to move the database to an IBM disc drive. It is currently on
~ a SEAGATE ST15230N. There are no indications in the error log of any
~ errors from the 3rd party disc. I have been talking with SEAGATE over
~ this and they confirmed that the firmware revision I have on the drive
~ is the latest for this particular drive.
~
~ QUESTION: Does anybody else run their ADSM server on an AIX box and have
~ their database on a non-IBM drive? If so have you seen any problems like
~ the above before, in particular with SEAGATE drives?
~
~
~ For your information I have done a dumpdb followed by a load/audit to
~ try and fix the problem. It gave some interesting 'features'
~
~ * To load my database which is almost 500MB in size I had to define a
~ space TWICE the size in order to get it to work.
~ * The database now occupies 2xs the amount of space it did previously!!
~ * The loaddb/audit eventually worked, did not use detailed=yes on audit,
~ but the number of read errors reported now is actually worse!!
~
~ Thanks in advance,
~
~ Paul Walmsley
~
~
~
~
~
~
~ logical read
~