ADSM-L

Database fragmentation formula (was Re: Online DB Reorg)

2003-10-20 15:15:24
Subject: Database fragmentation formula (was Re: Online DB Reorg)
From: Zlatko Krastev <acit AT ATTGLOBAL DOT NET>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Mon, 20 Oct 2003 22:12:21 +0300
***** ATTENTION *****
Those who do not like the mathematics, skip to the end (search for word
"select").
*********************

>From my mathematical background this query is not showing very good
results. The formula looks like

          ( MAX_REDUCTION_MB * 256 )
100 -  ---------------------------------
        (USABLE_PAGES - USED_PAGES) * 100

Thus closer we are to fully used database, less accurate the formula would
be. Moreover if we fill the DB at 100% the result will be division by 0
while the database might be both fragmented and not fragmented. One of our
goals is to fully utilize our resources. In that exact moment the query
would be useless.
Also the formula is of no use if there is no legend how to interpret the
numbers. For example our test server DB is giving PERCENT_FRAG=26.92 while
being nearly unfragmented.

So I would dare to recommend another formula (in pages):

           used - needed
fragm_p = --------------- x 100
               used

i.e. what space is wasted from all used (in %).
The needed space (in pages) can be found if we multiply PCT_UTILIZED by
USABLE_PAGES and divide by 100 to remove percentages:

needed = PCT_UTILIZED x USABLE_PAGES / 100

while the used space (in pages) is readable from USED_PAGES column.
Therefore my final formula would be (the lines may be split by mailers):

           USED_PAGES - PCT_UTILIZED x USABLE_PAGES / 100
fragm_p = ------------------------------------------------ x 100
                            USED_PAGES

and the final query would be:
select cast(100 * (USED_PAGES - PCT_UTILIZED * USABLE_PAGES / 100) -
  / USED_PAGES as decimal(9,5)) as "Unused page parts [pages]" from db

Now the percentage shows the percentage of wasted space vs. used space. 0%
would mean database is fully populated with no holes, 100% are impossible
(as completely empty pages would not be counted, and 99+% mean each page
is filled with something small just to allocate it.

PART 2.
Beyond how much space is wasted inside pages we would be also interested
in how many empty pages we are losing due to partition-allocation scheme.
Again the math first. Same formula (but now in MB):

           used - needed
fragm_p = --------------- x 100
               used

Now needed space is derived from CAPACITY_MB field:

needed = PCT_UTILIZED x CAPACITY_MB / 100

while actual usage is the size to which we can reduce the DB:

used = CAPACITY_MB - MAX_REDUCTION_MB

the resulting formula would look like  (the lines may be split by
mailers):

           (CAPACITY_MB - MAX_REDUCTION_MB) - PCT_UTILIZED x CAPACITY_MB / 100
fragm_p =
--------------------------------------------------------------------- x
100
                            CAPACITY_MB - MAX_REDUCTION_MB

Division by zero cannot happen as TSM server does not allow us to reduce
the DB under one partition.
Now the query for this percentage would be:
select cast(((CAPACITY_MB - MAX_REDUCTION_MB) -
   - (PCT_UTILIZED * CAPACITY_MB / 100) ) -
   / (CAPACITY_MB - MAX_REDUCTION_MB) * 100 -
   as decimal(9,5)) as "Allocation waste [%]" from db

And the final big-big query would look like:
select cast(USED_PAGES - PCT_UTILIZED * USABLE_PAGES / 100 -
   as decimal (20,3)) as "Unused page parts [pages]", -
   cast(100 * (USED_PAGES - PCT_UTILIZED * USABLE_PAGES / 100) -
   / USED_PAGES as decimal(9,5)) as "Page fragmentation [%]", -
   cast( (CAPACITY_MB - MAX_REDUCTION_MB) -
   - (PCT_UTILIZED * CAPACITY_MB / 100) as decimal (10,2)) -
   as "Overallocated space [MB]", -
   cast(((CAPACITY_MB - MAX_REDUCTION_MB) -
   - (PCT_UTILIZED * CAPACITY_MB / 100) ) -
   / (CAPACITY_MB - MAX_REDUCTION_MB) * 100 -
   as decimal(9,5)) as "Allocation waste [%]" from db

If someone already invented these formulae, I would congratulate him/her.
Even if I am the first dared to do this hard work, there is no Nobel price
for mathematics :-))

Zlatko Krastev
IT Consultant






Remco Post <r.post AT SARA DOT NL>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
20.10.2003 13:30
Please respond to "ADSM: Dist Stor Manager"


        To:     ADSM-L AT VM.MARIST DOT EDU
        cc:
        Subject:        Re: Online DB Reorg


On Sat, 18 Oct 2003 14:35:09 -0400
"Talafous, John G." <Talafous AT TIMKEN DOT COM> wrote:


> Remco,
>   Would you be willing to share your SQL query that reports on DB
> fragmentation?
>

I was allready looking at Eric (he probably saved my thingy somewhere
usefull, I just saved it in my sent-mail folder), here it is...

select cast((100 - ( cast(MAX_REDUCTION_MB as float) * 256 ) / -
(cast(USABLE_PAGES as float) - cast(USED_PAGES as float) ) * 100) as -
decimal(4,2)) as percent_frag from db

Note that I still think this is one of the more useless queries I've ever
build...

> Thanks to all,
> John G. Talafous              IS Technical Principal
> The Timken Company            Global Software Support
> P.O. Box 6927                 Data Management
> 1835 Dueber Ave. S.W.         Phone: (330)-471-3390
> Canton, Ohio USA  44706-0927  Fax  : (330)-471-4034
> talafous AT timken DOT com           http://www.timken.com


--
Met vriendelijke groeten,

Remco Post

SARA - Reken- en Netwerkdiensten                      http://www.sara.nl
High Performance Computing  Tel. +31 20 592 8008    Fax. +31 20 668 3167

"I really didn't foresee the Internet. But then, neither did the computer
industry. Not that that tells us very much of course - the computer
industry
didn't even foresee that the century was going to end." -- Douglas Adams