ADSM-L

Re: [ADSM-L] Dealing with defunct filespaces.

2007-07-13 10:15:53
From: Lawrence Clark <Larry_Clark AT THRUWAY.STATE.NY DOT US>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Fri, 13 Jul 2007 10:01:23 -0400
A QUERY NODE (q node) would show those clients that have not accessed
the system in some time.

>>> shawn.drew AT AMERICAS.BNPPARIBAS DOT COM 07/13/2007 10:47:29 AM >>>
You can write a daily script that performs your select statement and
creates a report and a macro for deleting all the offending file
systems. Then, if all looks good for deletion, just run the macro.
Not too much effort.
-Shawn
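
A minimal sketch of the report-and-macro idea above, in Python. It
assumes the select output has already been captured as comma-delimited
text with one node_name,filespace_name,filespace_id row per line (for
example from the administrative client in comma-delimited mode). The
function name and the sample rows are hypothetical, and the generated
DELETE FILESPACE lines should be checked against your server level
before the macro is run.

```python
import csv
import io


def build_report_and_macro(select_output):
    """Turn comma-delimited SELECT output into a report and a delete macro.

    Each input row is expected to be: node_name,filespace_name,filespace_id
    """
    report_lines = []
    macro_lines = []
    for row in csv.reader(io.StringIO(select_output)):
        if len(row) != 3:
            continue  # skip blank lines or anything that is not a data row
        node, fs_name, fsid = (field.strip() for field in row)
        report_lines.append(f"{node}  {fs_name}  (FSID {fsid})")
        # Address the filespace by FSID so that odd characters in the
        # filespace name cannot break the generated command.
        macro_lines.append(f"delete filespace {node} {fsid} nametype=fsid")
    return "\n".join(report_lines), "\n".join(macro_lines)


if __name__ == "__main__":
    # Hypothetical sample rows; real input would come from the select.
    sample = "NODE_A,/data,4\nNODE_A,/Z/oracle,12\n"
    report, macro = build_report_and_macro(sample)
    print(report)
    print(macro)
```

The report is meant to be read by a person first; the macro text is only
written to a file and run once the report looks right.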

From: adsm AT CAROUSEL.ITS.MONASH.EDU DOT AU
Sent by: ADSM-L AT VM.MARIST DOT EDU
Date: 07/13/2007 02:34 AM
Please respond to: ADSM-L AT VM.MARIST DOT EDU
To: ADSM-L
Subject: [ADSM-L] Dealing with defunct filespaces.

Hi all.

Whilst investigating something else, we discovered a number of nodes
that have old filespaces still stored within TSM - eg:

                                       Node Name: (node name)
                                  Filespace Name: /data
                      Hexadecimal Filespace Name:
                                            FSID: 4
                                        Platform: SUN SOLARIS
                                  Filespace Type: UFS
                           Is Filespace Unicode?: No
                                   Capacity (MB): 129,733.3
                                        Pct Util: 92.1
                     Last Backup Start Date/Time: 06/09/05   20:03:56
                  Days Since Last Backup Started: 764
                Last Backup Completion Date/Time: 06/09/05   20:05:16
                Days Since Last Backup Completed: 764
Last Full NAS Image Backup Completion Date/Time:
Days Since Last Full NAS Image Backup Completed:

                                       Node Name: (node name)
                                  Filespace Name: /Z/oracle
                      Hexadecimal Filespace Name:
                                            FSID: 12
                                        Platform: SUN SOLARIS
                                  Filespace Type: UFS
                           Is Filespace Unicode?: No
                                   Capacity (MB): 119,642.2
                                        Pct Util: 31.5
                     Last Backup Start Date/Time: 08/26/05   01:03:08
                  Days Since Last Backup Started: 686
                Last Backup Completion Date/Time: 08/26/05   01:14:01
                Days Since Last Backup Completed: 686
Last Full NAS Image Backup Completion Date/Time:
Days Since Last Full NAS Image Backup Completed:

                                       Node Name: (node name)
                                  Filespace Name: /mnt
                      Hexadecimal Filespace Name:
                                            FSID: 15
                                        Platform: SUN SOLARIS
                                  Filespace Type: UFS
                           Is Filespace Unicode?: No
                                   Capacity (MB): 120,992.9
                                        Pct Util: 55.8
                     Last Backup Start Date/Time: 01/26/06   20:05:15
                  Days Since Last Backup Started: 533
                Last Backup Completion Date/Time: 01/26/06   20:06:34
                Days Since Last Backup Completed: 533
Last Full NAS Image Backup Completion Date/Time:
Days Since Last Full NAS Image Backup Completed:


These are all filesystems which existed at some time in the past, but
which were removed as part of an application upgrade (or system
rebuild, or ...), and hence no longer exist. It seems that TSM is
taking the attitude of "if I can't see the filesystem, I'll not do
anything about marking files in that filesystem inactive", so the
data never expires. I can understand the reasoning behind this
approach, but it does mean that there's a large amount of data
floating around that is no longer needed (a quick and dirty estimate
says around 83 TB across primary and copy pools, although some of
that needs to stay).

A delete filespace will clear them up quickly, obviously, but there's
a twist: how can we identify filesystems like this, short of going
around to each client node and doing a df or equivalent? Searching
the filespaces table gives us some 600 filespaces all up; I *know*
that several of these have to stay - eg, image backups don't update
the backup_end timestamp, and there are some filespaces that are
backed up exclusively with image backups.

At the moment, the best I can come up with is to:
   * use a SELECT statement on the filespaces table to get a "first
cut" (select node_name, filespace_name, filespace_id from filespaces
where backup_end < current_timestamp - N days);
   * use QUERY OCCUPANCY on each of the filespaces mentioned in the
first cut; if the total occupied space is below some threshold,
ignore it as not being worth the effort;
   * use a SELECT statement on the backups table to confirm that no
backups have come through in the past N days (select 1 from db where
exists (select object_id from backups where node_name='whatever' and
filespace_id=whatever and state='ACTIVE_VERSION' and current_timestamp
< backup_date + 90 days), here with N = 90). I use exists to try to
minimise the effort TSM needs to put into the query; I also have the
active_version check in there for the same reason (if there are only
inactive versions, they'll drop off the radar anyway in due course).
Hopefully TSM's SQL execution is optimised to stop in this case when
it finds one match rather than trying to find all matches ...
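
The three checks above can be combined into a single filter. The
following is only a sketch in Python, assuming the results of the
filespaces select, QUERY OCCUPANCY and the backups check have already
been gathered into one record per filespace; the Filespace class, its
field names and the thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Filespace:
    node: str
    name: str
    fsid: int
    days_since_backup: int    # from the filespaces "first cut" query
    occupied_mb: float        # total reported by QUERY OCCUPANCY
    has_recent_backup: bool   # result of the EXISTS check on backups


def deletion_candidates(filespaces, min_days, min_mb):
    """Return filespaces that pass all three checks, largest first."""
    picked = [
        fs for fs in filespaces
        if fs.days_since_backup > min_days   # step 1: stale backup_end
        and fs.occupied_mb >= min_mb         # step 2: worth the effort
        and not fs.has_recent_backup         # step 3: nothing new in backups
    ]
    return sorted(picked, key=lambda fs: fs.occupied_mb, reverse=True)


if __name__ == "__main__":
    # Made-up occupancy figures, purely for illustration.
    sample = [
        Filespace("NODE_A", "/data", 4, 764, 50000.0, False),
        Filespace("NODE_A", "/tmp", 7, 800, 10.0, False),
        Filespace("NODE_A", "/home", 2, 1, 70000.0, False),
    ]
    for fs in deletion_candidates(sample, min_days=90, min_mb=1024):
        print(fs.node, fs.name, fs.fsid, fs.occupied_mb)
```

Sorting by occupied space puts the filespaces that free the most storage
at the top of the list. Filespaces that are backed up only with image
backups would still need to be reviewed by hand, since a stale
backup_end does not mean they are defunct.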

Does anybody have any better ideas? Unfortunately, because of the
nature of Monash's organisation, simply having central policies
saying "you must do X when shuffling filesystems around" won't cut it
(and let's be honest here - how many sysadmins are likely to remember
such policies, given how infrequent such moves are?)

Yes, I have a call open with IBM support about this. :-) If there's
sufficient interest, I can summarise their eventual response to the
mailing list (so far, it's mostly been clarification of the call, and
a few pointers that match with what we've already done.)

Thanks,

Stuart.


