ADSM-L

Re: Auditdb timing - FYI

2003-05-27 11:35:26
Subject: Re: Auditdb timing - FYI
From: Gerhard Rentschler <g.rentschler AT RUS.UNI-STUTTGART DOT DE>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Tue, 27 May 2003 15:45:37 +0200
Mark,
I don't agree with your answer. Even if Gretchen splits her server into 6,
an audit will still take 1 day. This is far too long. There should be a more
intelligent way to repair a database than scanning the whole thing over
and over again.
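
As a rough check of the 1-day figure, here is a minimal sketch in Python that
works from the audit stats quoted below. It assumes audit time scales linearly
with the number of database entries and that each split server audits at the
same rate as the test server; neither assumption comes from support. The
function names are made up for illustration, and the year is taken from the
message date.

```python
from datetime import datetime, timedelta

# Figures quoted from Gretchen's post (year assumed from the message date).
AUDIT_START = datetime(2003, 5, 19, 9, 5)
AUDIT_END = datetime(2003, 5, 25, 19, 15)
ENTRIES_PROCESSED = 1_050_073_565


def audit_rate(start, end, entries):
    """Return the average number of database entries audited per second."""
    return entries / (end - start).total_seconds()


def estimated_audit_time(entries, rate, servers):
    """Estimate the per-server audit time if entries are split evenly.

    ASSUMPTION: audit time is linear in the entry count and every server
    audits at the same rate as the original test server.
    """
    return timedelta(seconds=round(entries / servers / rate))


elapsed = AUDIT_END - AUDIT_START
rate = audit_rate(AUDIT_START, AUDIT_END, ENTRIES_PROCESSED)
per_server = estimated_audit_time(ENTRIES_PROCESSED, rate, servers=6)

print(f"Elapsed:            {elapsed}")         # 6 days, 10:10:00
print(f"Entries per second: {rate:.0f}")        # 1892
print(f"Per server (of 6):  {per_server}")      # 1 day, 1:41:40
```

Under those assumptions each of the 6 servers would still be down for a little
over a day per audit.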
Regards
Gerhard

---
Gerhard Rentschler            email:g.rentschler AT rus.uni-stuttgart DOT de
Regional Computing Center     tel.   ++49/711/685 5806
University of Stuttgart       fax:   ++49/711/682357
Allmandring 30a
D 70550
Stuttgart
Germany



> -----Original Message-----
> From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU]On Behalf Of
> Stapleton, Mark
> Sent: Tuesday, May 27, 2003 3:31 PM
> To: ADSM-L AT VM.MARIST DOT EDU
> Subject: Re: Auditdb timing - FYI
>
>
> From: Gretchen L. Thiele [mailto:gretchen AT PRINCETON DOT EDU]
> > I've been plagued by a few problems when deleting accounts. So far,
> > it seems like Win2K or WinXP clients (SYSTEM OBJECTs are involved
> > again!) are prone to this error:
> >
> > 05/27/2003 09:00:56  ANR2017I Administrator XXXXXX issued command:
> > DELETE FILESPACE ZZZZZZ *
> > 05/27/2003 09:00:56  ANR0984I Process 185 for DELETE FILESPACE started
> > in the BACKGROUND at 09:00:56.
> > 05/27/2003 09:00:56  ANR0800I DELETE FILESPACE * (fsId=6) for node
> > ZZZZZZ started as process 185.
> > 05/27/2003 09:00:56  ANR0802I DELETE FILESPACE * (fsId=6)
> > (backup/archive data) for node ZZZZZZ started.
> > 05/27/2003 09:00:57  ANR0104E imutil.c(7761): Error 2
> > deleting row from table "Expiring.Objects".
> > 05/27/2003 09:00:57  ANR9999D imfsdel.c(1872): ThreadId<25> Error 19
> > deleting group leader 0 176658713.
> >
> > I've tried a number of things - renaming the filespace, moving the
> > node data and then auditing the tape, deleting the filespace
> > specifically - but it's really a database 'corruption' and can only
> > be fixed by an audit (per support).
> >
> > Over the course of the last two weeks, I recovered this database to a
> > test server and ran an audit. Here are the pertinent stats for your
> > reference:
> >
> > Server: H80, 4 way, 2 GB, AIX 4.3.3
> > TSM: v5.1.6.4
> > DB size: 179,544 MB - 74.4% full
> > Log size: 13,280 MB
> > Audit command: dsmserv auditdb fix=yes
> > Audit start: 5/19 09:05
> > Audit end: 5/25 19:15
> > Number of database entries: Processed 1050073565 database entries
> > (cumulative).
> > Elapsed time: 6 days 10 hours 10 minutes
> >
> > The audit was successful and did allow me to delete the problem node.
> > However, there really should be a way to go after the offending entry
> > and blast it (under adult supervision, of course!). I'm not really
> > going to be able to justify a down time of 7 days just to clean up an
> > account. It's now happened again on another server, so I will have to
> > do this test again to get a good estimate of the down time required
> > to clean that server up.
> >
> > I've pushed these accounts 'aside' by renaming them and changing the
> > contact info, but the clients would really like me to remove the data
> > (legal reasons). Having errors like this makes me wonder what else is
> > going on in the database.
>
> To tell the truth, the problem lies in the size of your TSM database
> (and probably also the speed of disk upon which said database resides
> and the speed of the processor used by the TSM server). You probably
> ought to consider breaking it up into several separate TSM servers. This
> gives the added ability to export nodes to alternate servers when a
> particular server is giving you fits.
>
> When you grow to enterprise size, you've got to start thinking in
> enterprise mode.
>
> --
> Mark Stapleton (mark.stapleton AT berbee DOT com)
> Berbee Information Networks
> Office 262.521.5627
