Wanda,
We're in a similar situation to Eric, and we were planning to do pretty much
what you suggested. We have some (apparently) benign corruption that we would
like to clean up before migrating the db to v6. Our DB is much larger than
Eric's, I think, because the audit is measured in days, not hours. I had
intended to use a technique similar to what you describe, but when we ran a
test of the server-to-server export/import process, it took too long to be
viable. Our plan had more steps than yours.
At the time the db is cloned (to the test server), all storage pools would be
marked readonly. New temporary storage pools would be used to handle newly
ingested data (during your step 2). This way, when step 3 happens and the db
on test is used as production, all of the db entries are still valid for the
readonly storage pools. All of the newly ingested data would also be isolated
and could (in theory anyway) be exported more quickly, since the temp storage
pools would be on non-collocated tape and as much spare disk as we could
find. Furthermore, I was planning NOT to switch test into production use
until after the export/imports had caught up and transferred all of the newly
ingested data to test. I thought that would be cleaner for end users
trying to do restores during the transition period (no worries about what data
had or had not yet been exported). But the problem we ran into is that our
ingest rate was faster than the export rate, so we could never catch up.
The long and short of it is that I don't think there's a way to avoid an
extended outage, at least in our situation. It might work with a smaller db
and a lower daily ingest rate.
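To put rough numbers on the catch-up problem: the export only ever finishes if
it drains data faster than new data arrives. Here is a minimal sketch of that
arithmetic. The function name, the units (TB and TB/day), and the sample
figures are all made up for illustration; they are not measurements from our
servers.

```python
from typing import Optional


def catch_up_days(backlog_tb: float,
                  ingest_tb_per_day: float,
                  export_tb_per_day: float) -> Optional[float]:
    """Days until the export has drained the backlog, or None if it never will.

    All three inputs are hypothetical planning figures, not TSM output.
    """
    if backlog_tb <= 0:
        return 0.0  # nothing waiting to be exported
    # Net rate at which the backlog shrinks while clients keep backing up.
    net_drain = export_tb_per_day - ingest_tb_per_day
    if net_drain <= 0:
        return None  # ingest keeps pace with or outruns export
    return backlog_tb / net_drain


# Our kind of situation: export slower than ingest, so no finish date exists.
print(catch_up_days(backlog_tb=10, ingest_tb_per_day=2.0, export_tb_per_day=1.5))
# A smaller site where export outruns ingest does converge.
print(catch_up_days(backlog_tb=10, ingest_tb_per_day=1.0, export_tb_per_day=3.0))
```

The only way to get a finite answer is to raise the export rate above the
ingest rate or to stop ingest altogether, which is the extended outage.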
..Paul
At 10:03 AM 11/18/2010, Prather, Wanda wrote:
>Audit DB is notoriously slow. Even if you improve performance by 20%, you'll
>still have a very long down time.
>Here's a different idea to think about:
>
>1) perform the audit db on the test server, take as long as you need.
>2) Let your clients continue backing up as usual to production
>3) when the db on test is ready, bring up that TSM and swap ip addresses with
>prod/test, so your clients are now backing up to test
>4) set up server-to-server communications
>5) export node server-to-server from oldprod to test, using fromdate-fromtime
>todate-totime merge=yes to pick up anything that you missed in that 17-hour+
>window
>
>Test is now prod with a clean db.
>Anybody think of a reason that won't work?
>
>W
>
>-----Original Message-----
>From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf
>Of Loon, EJ van - SPLXO
>Sent: Thursday, November 18, 2010 5:44 AM
>To: ADSM-L AT VM.MARIST DOT EDU
>Subject: [ADSM-L] Database audit performance
>
>Hi TSM-ers!
>We're having orphaned database entries, caused by a very old bug, fixed
>some server releases ago, but only recently discovered. I'm currently
>trying to find a way to speed up the auditdb performance.
>What I'm planning to do is this:
>1) backup the database on our production server
>2) stop the production server
>3) restore the production database on our test server which already used
>new disks, allocated on our new Vmax.
>4) perform an audit fix=yes on this database
>5) backup the fixed database and restore it on the production server
>I already tested the scenario above and it works, but the audit takes
>too long to finish (17 hours). Since we're backing up a lot of Oracle
>databases, TSM downtime will be too long, the Oracle recovery logs will
>fill up and the databases will stop.
>We are running an AIX TSM server with plenty of memory and multiple HBA
>to the SAN.
>Restoring the database runs ok, Topas is showing around 25 Mb/sec disk
>write speed. I have seen better performance on Vmax disks, but I can
>live with this.
>When I start the audit, Topas shows an average disk read and write speed of
>less than 1 Mb/sec. CPU average is around 50% and vmstat shows no page
>in and out.
>I tried everything: mounting the filespace with cio, dio, using RAW
>logical volumes, tuning read ahead through ioo. It doesn't make any
>difference, or even gets worse (when using RAW for instance).
>I'm really out of options here. Something is holding back the audit, but
>I can't find what!
>Does anybody have some tips for me?
>Thank you VERY much in advance!
>Kind regards,
>Eric van Loon
>KLM Royal Dutch Airlines
--
Paul Zarnowski Ph: 607-255-4757
Manager, Storage Services Fx: 607-255-8521
719 Rhodes Hall, Ithaca, NY 14853-3801 Em: psz1 AT cornell DOT edu