ADSM-L

Re: DB volume layout (was Re: Database sizing question)

2000-08-25 14:21:19
From: Tab Trepagnier <Tab.Trepagnier AT LAITRAM DOT COM>
Date: Fri, 25 Aug 2000 13:16:36 -0500
Geoff,

The one logical file per physical device is news to me.

Our DB is 21.5 GB plus space for future expansion.  It consists entirely of 1000 MB
volumes.  Our log consists entirely of 200 MB volumes.  We have two SCSI
controllers hosting the DB and log drives.  One controller hosts four disks that
carry the primary volumes for the DB and log; the other controller hosts four
disks that carry the secondary volumes.


We had similar performance issues.  Reducing the size of the DB volumes (they
had been 1.5 - 1.8 GB each) was one of the things I tried in order to reduce the
hang time for big jobs like that.  I also reduced log volumes from 600 MB each to
200 MB each, and disk pool volumes from 1000-1500 MB each to 500 MB.  Our AIX
ADSM server has 512 MB RAM.  Because (disclaimer: in my understanding) AIX
directly maps memory pages to file pages, large files cause a lot of thrashing
as you describe.  Open a huge file on your PC and you will see the same
phenomenon.  That is why I reduced the individual volume sizes.

Further into my performance tuning I saw symptoms that AIX was paging out the
memory pages of the ADSM DB buffer pool.  This is a guaranteed performance
killer.  I've shared VM tuning suggestions with the forum over the last few
weeks, drawn from those experiences.

In a nutshell, it is tempting to set a fairly large DB buffer pool expecting the
DB pages to be reused in RAM rather than read from disk.  But default AIX VM
tuning will have the system paging to/from disk with a surprisingly small DB
buffer pool.  On our 512 MB system, the VM paging went berserk at any buffer
pool setting higher than about 48 MB!

The trick is to set the AIX VM "maxperm" setting to
(1 - ((DB buffer pool + 80 MB) / total RAM)) x 100.  For example, in our case I
set a DB buffer pool of 96 MB, so I set maxperm to
(1 - ((96 + 80) / 512)) x 100 = 65.625%.  That allows ADSM, its DB buffer pool,
and various active AIX processes to keep enough RAM outside the preferential
paging space to avoid being paged in and out.
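In code form, the arithmetic is just this (a quick sketch of my rule of thumb above; the 80 MB headroom figure and the helper name are mine, not anything ADSM or AIX ships):

```python
def maxperm_pct(db_buf_pool_mb: float, total_ram_mb: float,
                headroom_mb: float = 80) -> float:
    """Suggested AIX vmtune maxperm percentage: reserve the DB buffer
    pool plus some headroom for active processes outside the file cache."""
    return (1 - (db_buf_pool_mb + headroom_mb) / total_ram_mb) * 100

# Our case: 96 MB buffer pool on a 512 MB machine.
print(maxperm_pct(96, 512))  # 65.625
```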

The result is that the hangs when sessions and processes start have gone away
entirely.  Monitoring memory activity with vmtune shows that a new session
occasionally causes ONE page to swap to/from disk.  Before my VM tuning, paging
would exceed 200/second, and it might be a minute before the system responded
to anything new.

Another change was raising "minfree" from the default of 120 to 256.  That
provides more "cushion" as AIX maps memory pages to file pages.  Should it
slightly overcommit, it won't bottom out and provoke extreme paging.  At 4 KB
per page, 120 pages is just 480 KB.
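As a sanity check on those cushion sizes (assuming AIX's 4 KB page size):

```python
PAGE_SIZE_KB = 4  # AIX memory page size

def pages_to_kb(pages: int) -> int:
    """Convert a vmtune page count (e.g. minfree) to kilobytes."""
    return pages * PAGE_SIZE_KB

print(pages_to_kb(120))  # 480  -- the default minfree cushion
print(pages_to_kb(256))  # 1024 -- the raised value, about 1 MB
```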

Finally, setting "maxpgahead" to 64 allows the file system to read ahead when
doing sequential reads of the DB.  This is especially relevant during Expire
Inventory and DB backups.  Having a fairly large DB buffer pool helps even more.

When I started all this, expire inventory took 18+ hours and the system hung for
at least a minute on every new session or process.  Now, with a DB 20% larger,
expire inventory takes 6 hours, and new sessions and processes launch
instantaneously.

Good luck.

Tab

Geoff Allen <geoff AT WSU DOT EDU> on 08/25/2000 12:18:25 PM

Please respond to "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>

To:   ADSM-L AT VM.MARIST DOT EDU
cc:    (bcc: Tab Trepagnier/Corporation/Laitram/US)
Subject:  DB volume layout (was Re: Database sizing question)




"Cook, Dwight E" <cookde AT BP DOT COM> writes:

> due to the way adsm/tsm internally manages "stuff" it is best to have one
> logical file/device per physical device.

I've currently been trying to find bottlenecks in our database. When
we do a restore from a rather large filesystem (about 25,000 home
directories at the top level), our ADSM server thrashes the db disks
for a long time (20-40 minutes). During this time, the client appears
hung (won't redraw in X, etc.). Eventually, the server finishes
building its list, and the restore completes quickly (appears limited
only by the ability of the tape library to mount and unmount tapes).

There is no problem with restores on smaller filesystems. These
proceed quite quickly.

This message jumped out at me, because we have multiple db volumes on
each disk. We've just been adding new volumes in smaller chunks as
we've needed more db space. Have we been shooting ourselves in the
foot? Would we be likely to see much of a performance improvement by
moving to one db volume per physical disk?

> if you want to get really anal you can start splitting things (db, log, &
> data files) across adaptors, making db & log files the first device out on
> an ssa loop, then put the db & log mirror files out on a different adapter
> (on maybe a different buss within the processor)
> ok, I'll shut up now ;-)

How much benefit can be derived by throwing SCSI controllers and/or
disks at the problem? We're looking "down the road" at a
replacement for the current ADSM server, and this is one of the
questions -- how much SCSI do we need?

The traditional answer of throwing lots of disks at the problem seems
rather silly these days, when a small disk is 9GB and you have a 5GB
database.

Geoff

--
Geoff Allen, geoff AT wsu DOT edu, <http://www.wsu.edu/~geoff/>

It's happy music for happy people. If you like bluegrass you gotta be in
a good mood.  -- Dave Winer