Re: [ADSM-L] GPFS file system backup problem

2008-01-30 10:16:29
From: "Allen S. Rout" <asr AT UFL DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Wed, 30 Jan 2008 10:16:02 -0500
>> On Tue, 29 Jan 2008 12:57:30 +0200, Michael Green <mishagreen AT GMAIL DOT COM> said:

> Bright minds,

> Some time ago a problem arose with one of the GPFS file systems
> that I happen to back up.

> Basically what happens is that the backup of that particular file
> system never completes; it cuts short with return code 12.

> 01/29/08   02:29:28 Normal File-->       684,135,874
> /srv/databases/unigeneU/Hs.retired.lst [Sent]
> 01/29/08   02:29:28 ANS1999E Incremental processing of '/srv' stopped.
> 01/29/08   02:29:28 --- SCHEDULEREC STATUS BEGIN


Is there an errorlog entry at "about" that time?  It could be offset
by as much as a few minutes.
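
To make the "offset by a few minutes" check less tedious, here is a
minimal Python sketch that pulls log lines stamped within a window
around the failure time.  It assumes the errorlog uses the same
"MM/DD/YY   HH:MM:SS" prefix as the schedule log quoted above; that
format depends on client settings, so adjust it if yours differs.  The
names parse_stamp and entries_near are made up for this illustration.

```python
from datetime import datetime, timedelta


def parse_stamp(line):
    """Return the leading 'MM/DD/YY HH:MM:SS' timestamp of a line, or None."""
    parts = line.split()
    if len(parts) < 2:
        return None
    try:
        # Assumed format, taken from the quoted schedule log lines.
        return datetime.strptime(parts[0] + " " + parts[1], "%m/%d/%y %H:%M:%S")
    except ValueError:
        # Continuation lines or lines in another format are skipped.
        return None


def entries_near(lines, when, window=timedelta(minutes=5)):
    """Return the lines whose timestamp lies within `window` of `when`."""
    hits = []
    for line in lines:
        stamp = parse_stamp(line)
        if stamp is not None and abs(stamp - when) <= window:
            hits.append(line.rstrip("\n"))
    return hits
```

For the failure quoted above you would call something like
entries_near(open(path_to_errorlog), datetime(2008, 1, 29, 2, 29, 28)),
where path_to_errorlog is wherever your client writes its error log.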

I have seen a variety of situations I group under "insane metadata"
which cause TSM to throw up its hands on a given filesystem.
Examples include negative MTIME, MTIME > MAXINT (a badly behaved
alternate filesystem mount on a NetApp), corrupt regions of the
directory inode (we think those are SAN-blip corruption), etc.

Often there's a particular error in the errorlog, and if you poke
around with ls -l in the place where it failed, you find something
odd.  Sorry to have nothing more specific than "odd"; that's as
precisely as I can group the cases I've seen.
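
If eyeballing ls -l output gets old, a rough Python sketch along these
lines can flag the two MTIME cases and any entry that cannot be
stat'ed at all.  It is only a sketch: it assumes "MAXINT" means the
signed 32-bit limit, the function names are invented for this example,
and it will not catch every kind of "odd".

```python
import os

# Assumption: "MAXINT" is taken to be the signed 32-bit maximum.
MAXINT = 2**31 - 1


def is_odd_mtime(mtime, maxint=MAXINT):
    """Return True for a negative mtime or one larger than maxint."""
    return mtime < 0 or mtime > maxint


def find_odd_entries(root):
    """Walk `root` and return (path, reason) pairs for suspicious entries."""
    problems = []

    def on_error(err):
        # os.walk calls this when it cannot list a directory.
        problems.append((err.filename, f"unreadable directory: {err.strerror}"))

    for dirpath, dirnames, filenames in os.walk(root, onerror=on_error):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so that dangling symlinks do not raise.
                st = os.lstat(path)
            except OSError as err:
                problems.append((path, f"lstat failed: {err.strerror}"))
                continue
            if is_odd_mtime(st.st_mtime):
                problems.append((path, f"odd mtime: {st.st_mtime}"))
    return problems
```

Point find_odd_entries at the directory where the backup stopped (in
the quoted log, somewhere around /srv/databases/unigeneU) and print
whatever comes back.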

- Allen S. Rout