Looking up deleted files.

ebisoba

Just wondering if it's possible to look up only the deleted files stored on TSM. If yes, what command should I use? Thanks.
 
What are you trying to accomplish by showing deleted files? Do you need it for a single server, a single drive, a single folder... or for your entire environment?

As far as I know there isn't a command to show you deleted files.
 
I need it for the entire TSM environment. Currently we have a policy to keep deleted files for 7 years. We are wondering if it's possible to look up and move deleted files to tape and send it offsite. That way we can get rid of deleted files from the TSM storage pool.
 
You should look into TSM HSM for that.

I can see how a 7 year policy would be brutal and is almost insane. I understand the need for some inactive data to be retained for x number of years but that should be fairly limited and not handled by backups. Why keep everything for so long?

We retain our deleted files for 42 days. Our policy is that TSM is for disaster recovery only. It is not to be used as a document management system for all business products.

"IT does not control the retention policy or rules over business data. Business data retention and recovery requirements should be developed by functional areas, within any guidelines published by Corporate Legal. If a business area wants to enforce the retention of multiple versions of files technically rather than with their own processes and naming conventions, IT is prepared to propose technical solutions for doing so. These solutions will require changes to processes and technology, and, depending on the requirements, could involve additional cost. For documents stored using simple file shares, IT Infrastructure will continue to support the current practice in which data assets are backed up and remain available as long as those with write access to the information do not delete or modify the original document. In addition, files can be restored within a 42 day window of when they are modified or removed by notifying the Support Center. "
 
I think you can accomplish something like that with mountablenotinlib and an active data pool.

But I did not think very hard ..
 
Unfortunately identifying these files will be a nasty task.

The data you want is in the "backups" table. That table stores the state of an object (active/inactive) and the deactivate date. The problem is that, whilst the "active" state represents data which existed on the client at the time of the last backup, the "inactive" state represents both existing files with newer backup versions plus deleted files.

You might be best off selecting out the entire backups table to a CSV file, which you then import into a separate database (postgres, mysql, etc.) and run your SQL there. For a start, you can create additional indices to speed up your select...

Please don't run this against a production TSM server unless you'd like your select statement to run for a month and cause some chaos while it does so!

You're going to want to run something like...

Code:
select node_name, hl_name, ll_name
from backups
where
object_id not in (select object_id from backups where state='ACTIVE')
Disclaimer: I'm not sure whether the "object_id" is unique to a node, or across the whole TSM instance...can't run a test today, sorry about that.
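
To make the "export to CSV, load it somewhere else, add an index, then select" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. Everything about the export is an assumption on my part: the CSV layout, the filespace_name column, and the sample rows are hypothetical, and the state values follow the wording used in this thread (a real export may spell them differently). Because of the object_id doubt in the disclaimer above, the sketch matches versions on node, filespace and path names instead of object_id.

```python
import csv
import io
import sqlite3

# State values as written in this thread; adjust to whatever the export contains.
ACTIVE = "ACTIVE"
INACTIVE = "INACTIVE"

# Hypothetical CSV export of the backups table (made-up sample rows).
SAMPLE_CSV = """node_name,filespace_name,hl_name,ll_name,state
NODE1,/home,/docs/,report.txt,ACTIVE
NODE1,/home,/docs/,report.txt,INACTIVE
NODE1,/home,/docs/,old.txt,INACTIVE
NODE2,/data,/tmp/,scratch.dat,INACTIVE
"""

def load_backups(conn, csv_file):
    """Load exported rows into a local table and index the lookup columns."""
    conn.execute(
        "CREATE TABLE backups (node_name TEXT, filespace_name TEXT, "
        "hl_name TEXT, ll_name TEXT, state TEXT)"
    )
    rows = [
        (r["node_name"], r["filespace_name"], r["hl_name"], r["ll_name"], r["state"])
        for r in csv.DictReader(csv_file)
    ]
    conn.executemany("INSERT INTO backups VALUES (?, ?, ?, ?, ?)", rows)
    # The extra index is what makes the correlated subquery below bearable.
    conn.execute(
        "CREATE INDEX idx_obj ON backups "
        "(node_name, filespace_name, hl_name, ll_name, state)"
    )

def deleted_candidates(conn):
    """Return objects that have inactive versions but no active version."""
    query = """
        SELECT DISTINCT b.node_name, b.filespace_name, b.hl_name, b.ll_name
        FROM backups AS b
        WHERE b.state = ?
          AND NOT EXISTS (
              SELECT 1 FROM backups AS a
              WHERE a.node_name = b.node_name
                AND a.filespace_name = b.filespace_name
                AND a.hl_name = b.hl_name
                AND a.ll_name = b.ll_name
                AND a.state = ?)
        ORDER BY 1, 2, 3, 4
    """
    return conn.execute(query, (INACTIVE, ACTIVE)).fetchall()

conn = sqlite3.connect(":memory:")
load_backups(conn, io.StringIO(SAMPLE_CSV))
candidates = deleted_candidates(conn)
for row in candidates:
    print(row)
```

With the sample rows, report.txt drops out because it still has an active version, while old.txt and scratch.dat are reported as deletion candidates. Files excluded via inclexcl would show up the same way, so treat the result as a list of candidates rather than proof of deletion.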

That having been said...the select above can't actually be used to selectively (hur hur) move data around. It seems like you want to implement a long-term archive of some sort for ancient data. You might like to look at the MigDelay stgpool parameter...you'll need to tweak things a bit to ensure that all objects on a particular volume are capable of being migrated, but it's probably doable.

As Jeff has mentioned, backup isn't really the way to go for a long-term archive. You might like to try archiving the data (if there is a procedural option available to execute the archive immediately before the purge).

Cheers,

T
 
Versions Data Deleted totals

I too am in this predicament: I am trying to set new company policies that will store our backup data most efficiently while not keeping junk we don't need. So...

I need to know how I can find the DELETED files kept in TSM, but NOT expired data. In my case, I have a management class that keeps the last copy of a file forever, plus 1 version of a deleted file. So it will keep 1 version of the deleted file forever, right?

Can someone provide a script to query how many of these DELETED files are stored in TSM? I suspect they are not marked as expired.

Rod
 
If you are setting a new policy you need to forget the idea of storing all deleted documents forever. That is simply not a reasonable request. There is no reason to retain every deleted file forever. I hope that's not the default!

You need to set a reasonable time-to-live for purged data. I understand that some data has a legal requirement for long-term retention, but that requirement doesn't extend to entire servers. Usually it's a specific database (or the like). That type of data needs to be identified and assigned its own management class.

Your default management class needs to expire data via a realistic rule. In my opinion it shouldn't extend beyond 60 days. As mentioned above, our default is 16/42/16/42, and I feel that is excessive. At least all deleted or versioned data is gone in 42 days, which gives a client 16 versions or, if the file isn't changing, 42 days to detect a problem.

Data that requires long term retention of one year or more should really backup to a different instance dedicated to long term backups using inexpensive media.
 
Company policies

You are right about the retention policy. However, I have already saved the company millions restoring files from years back due to legal requirements. This is what I am talking about, not databases and mp3 files. Legal documents, that I am not allowed to delete. I have more than 1 management class, and some are set to 2 days. So I appreciate your comments, but don't think I have NO policies in place. This specific mgmt class needs to be reviewed via some type of script, and I sure hope someone can code me one.

R
 
Then I'd again say that HSM is the hot ticket.

If a manual process was all you could get away with .... I guess you could do a move node data and condense all its data on to a specific set of tapes. Other options would be to collocate the data for that node to its own media. Or... you could deploy an active data pool and send everything else to tape.
 
On the general topic of segregating the "deleted" versions, you can set up an aged migration from one tape pool to another after a set time has elapsed, say a year or two (MIGDELAY). It can even be set up as a monthly scheduled job. This assumes that you do not have any other data that will last as long as whatever delay you choose, so that anything matching the migdelay criteria is all that gets migrated.

Those tapes can be ejected and placed someplace else until such time as you delete them or the data expires.
 
@rwhtmv

It's important to recognise that TSM classifies each file object into one of two states: ACTIVE and INACTIVE. When a file is delivered into the stgpool hierarchy (as a result of a backup) it is classified as ACTIVE. Files only move into the INACTIVE state as a result of a subsequent backup, and there are two ways for that to happen. First, a later backup of the same file object is taken (because it changed, for example, or as a result of a selective rather than an incremental backup), superseding the current copy. Second, a later backup detects that the file has either been deleted from the client or excluded by an inclexcl statement.

If you are able to filter out the special cases (e.g. a modification to an inclexcl statement), you should be able to find them easily by acquiring a list of filesystem objects which have at least one database row in an "INACTIVE" state but none in an "ACTIVE" state.
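
For the "how many" question, the rule above (at least one INACTIVE row, no ACTIVE row, special cases filtered out) can be sketched as a small grouping pass. This is only an illustration: the (node, path, state) row shape, the sample data, and the ignore_prefixes parameter for paths known to be affected by inclexcl changes are all hypothetical.

```python
from collections import defaultdict

def count_deleted_by_node(rows, ignore_prefixes=()):
    """Count objects per node that have INACTIVE rows but no ACTIVE row.

    rows: iterable of (node, path, state) tuples (hypothetical shape).
    ignore_prefixes: path prefixes to skip, e.g. directories that were
    excluded via inclexcl and would otherwise look like deletions.
    """
    # Collect every state seen for each (node, path) object.
    states = defaultdict(set)
    for node, path, state in rows:
        states[(node, path)].add(state)

    counts = defaultdict(int)
    for (node, path), seen in states.items():
        if path.startswith(tuple(ignore_prefixes)):
            continue  # special case filtered out
        if "INACTIVE" in seen and "ACTIVE" not in seen:
            counts[node] += 1
    return dict(counts)

# Made-up sample rows for illustration only.
rows = [
    ("NODE1", "/docs/a.txt", "ACTIVE"),
    ("NODE1", "/docs/a.txt", "INACTIVE"),
    ("NODE1", "/docs/b.txt", "INACTIVE"),
    ("NODE1", "/cache/c.tmp", "INACTIVE"),
    ("NODE2", "/data/d.dat", "INACTIVE"),
]
result = count_deleted_by_node(rows, ignore_prefixes=("/cache/",))
print(result)
```

Here a.txt still has an active version, c.tmp is skipped as a known exclusion, and b.txt and d.dat are each counted once for their node.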

Cheers,

T
 
Well with RWHTMV we are really only talking about the very last version of a deleted item because active data should always be there and expired data is gone.

This has HSM written all over it.
 
One of my customers asked for the same thing. What he does is look in the dsmsched.log for each client, pick out the entries with the expiring/deleted tag, and collect them in a separate file.
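
A rough sketch of that log-scraping approach, in case it helps. The line layout is an assumption: I am taking the tag to look like "Expiring-->" followed by a size, the path, and "[Sent]", and the sample lines are made up. Check a real dsmsched.log from your client version before relying on the pattern.

```python
import re

# Assumed line shape: "<date> <time> Expiring-->  <size> <path> [Sent]".
# The actual dsmsched.log layout may differ between client versions.
EXPIRING = re.compile(r"Expiring-->\s+[\d,]+\s+(?P<path>.+?)\s+\[Sent\]\s*$")

def expired_paths(lines):
    """Return the paths from lines that carry the assumed Expiring tag."""
    found = []
    for line in lines:
        match = EXPIRING.search(line)
        if match:
            found.append(match.group("path"))
    return found

# Hypothetical log lines for illustration only.
sample_log = [
    "07/08/2009 22:01:14 Normal File-->         1,024 /home/a/report.txt [Sent]",
    "07/08/2009 22:01:15 Expiring-->                0 /home/a/old notes.txt [Sent]",
    "07/08/2009 22:01:16 Expiring-->            2,048 /home/b/draft.doc [Sent]",
]
paths = expired_paths(sample_log)
for path in paths:
    print(path)
```

In practice you would feed it the lines of each client's log file and append the results to one collection file per node.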
 