ADSM-L

Subject: Re: Audit DB times
From: Helmut Richter <Helmut.Richter AT LRZ-MUENCHEN DOT DE>
Date: Thu, 25 Jan 1996 19:07:30 +0100
On Thu, 25 Jan 1996, Simon Travaglia wrote:

> An AUDITDB of this takes DAYS and DAYS to run.  And days.
>
> It's running on an RS6000 C10 with 128Mbyte (very little free) and
> appears to have strange usages in terms of cpu during the audit
>
> It started off quickly, got to 3 million entries quite fast.
> From 3 million to 7 million very slow
> From  7 million to 12 million now pretty fast.

You are pretty lucky that things are speeding up again. We saw the same
problem with 40 million entries in total: the first 13 million went quite
fast (1 day), 1 more million went very slowly (5 weeks), and then we gave
up. After 5 weeks of wall-clock time and 1.5 million seconds of CPU time we
decided that it would practically never terminate.
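
For illustration only, here is a small sketch of the kind of naive linear
extrapolation one is tempted to make from the figures above. It assumes that
the remaining entries are audited at the same rate as the most recent slow
phase, which is exactly the assumption in doubt. The function name and the
choice of the slow-phase rate are assumptions made for this example and are
not taken from any ADSM tool.

```python
def naive_remaining_weeks(total_entries, audited_entries, recent_entries, recent_weeks):
    """Linearly extrapolate the remaining audit time from the most recent rate.

    Hypothetical helper: assumes every remaining entry takes as long as the
    entries audited during the most recent observation period.
    """
    if recent_entries <= 0:
        raise ValueError("recent_entries must be positive")
    remaining = total_entries - audited_entries
    return remaining * recent_weeks / recent_entries


# Figures from this message: 40 million entries in total, 14 million audited
# (13 million fast plus 1 million slow), the last 1 million took about 5 weeks.
weeks = naive_remaining_weeks(40_000_000, 14_000_000, 1_000_000, 5)
print(f"{weeks:.0f} weeks, about {weeks / 52:.1f} years")
```

Such an estimate is only as good as the assumption of a constant rate, and
the rate evidently varies a great deal between phases of the audit.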

In mid-October we opened a PMR asking IBM to analyze why the auditdb run
failed to terminate. Immediately after we opened it, and before they had
looked at the data, we learned [quote]:

  1 - The estimate of 7 years is incorrect.  I understand what you are
  using to make that calculation, but the calculation is incorrect.  You
  can not go by the number of entries audited versus the number of
  entries loaded to get a time estimate.  From the point where we were
  auditing the audit was about 1/2 - 2/3 complete.

  2 - The customer does need a back up strategy for the data base.

  3 - There will be no changes made to AUDIT DB in the short term that
  will have any significant impact on the time it takes to run unless the
  log shows a correctable problem.

Item 1 is fascinating: for five weeks they were not able to make any
prediction of when it would or could possibly terminate, and now they
suddenly know. Item 2 is incomprehensible: we did make backups, otherwise
we would not have been able to restore and thus would not have been forced
to audit. Item 3 is depressing: the problem is not taken seriously, and you
have to live with the risk of losing all your data if something goes wrong
with the database.

Shortly after giving these answers they received a tape with the corrupt
database. This is the state of the analysis as of now.

> Is truckloads more memory [always a problem with this machine] going to help
> me?

We did not have the impression that memory shortage was a cause of the
problem.

> Any tips for next time would be cool.

Only tip: avoid a next time. Auditdb may work for small test cases, but it
simply does not work on a production-size database.

You might be interested in how we got the thing fixed. We decided to
discard most of the data and exported only the most valuable data to tape
(which worked although the database was corrupted). We then started from
scratch with an empty server and imported the data. We are aware that we
are sitting on a time bomb ready to explode at any time.

Sorry for the bad news. Best wishes for the future,

Helmut Richter

 ============================================================================
Dr. Helmut Richter
Leibniz-Rechenzentrum     X.400:  S=Richter;OU=lrz;P=lrz-muenchen;A=d400;C=de
Barer Str. 21            RFC822:  Helmut.Richter AT lrz-muenchen DOT de
D-80333 Muenchen           Tel.:  ++49-89-2105-8785
Germany                     Fax:  ++49-89-2809460
 ============================================================================