Legal hold / eDiscovery over ... next steps.

QueueBall

I've been using this forum for quite a while, but never had the opportunity/need to post. Most of my answers were found lurking in the background and searching previous threads. I haven't seen much about my "issue", so I thought I would post and see what feedback I could get.

After 4 years of a legal hold / eDiscovery mandate, our need to retain everything has been lifted. Here is a bit of background information.

Environment - Hardware
AIX 5.3.12 SP4 running TSM 6.2.2.
Library - L700 LTO-2 (phased out but still connected)
SpectraLogic T50e LTO-5 (fully configured and managing all data to tape transactions)
DataDomain 890 for VTL (recently configured and managing entire environment)

TSM - Notable information

Single TSM database @ ~560 TB.
1000+ nodes
8 PB occupancy pool
7-8 TB backed up nightly
~90 AIX hosts (active)
~500 Windows hosts (active)

With so much data residing in one database, we can't make a drastic change like updating the retention limits on an entire policy; it would kill the server. Instead, we are looking at taking smaller bites of the apple. These include (but are not limited to) the following:

Remove nodes that have been decommissioned
Mothball the environment and start over
Start 2nd instance of TSM and share resources. Migrate nodes to new server.
Create all new policy groups and migrate hosts to it 1 by 1.

Has anybody else faced the same scenario, and if so, how did you resolve it? At this point, we are in the planning phase of how we will attack this giant problem. We know and understand it will take several months to complete. I've never managed TSM in a "normal" environment, so going through 400 tapes per week and working all day to keep pools low enough to get through the next day's backups is all I know.

Any help from people who have gone through a similar scenario would be appreciated.

Regards,

QueueBall
 
You have an interesting problem.

First of all, how many policy domains do you have? How many management classes? How many nodes have been decommissioned?

I would 'attack' the issue this way:

- delete the filespaces of the decommissioned nodes, as you have already mentioned; finish deleting all of the decommissioned nodes' filespaces, running expiration as needed
- if you have lots of policy domains and lots of management classes within these policy domains, reduce retention slowly, one or two at a time, and run expiration
- continue with the above until you have reached the desired level
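
For illustration only, here is a rough Python sketch of the "reduce retention slowly and run expiration" idea. It only builds the admin command text for one management class at a time; it does not talk to the server. The domain, policy set, and management class names and the day counts are made up, and it assumes the copy group is named STANDARD and that retention is driven by RETEXTRA/RETONLY (version limits such as VEREXISTS are not handled). Check the command syntax against the Administrator's Reference for your server level before running anything.

```python
def retention_steps(current_days, target_days, step_days):
    """Return the successive retention values from current down to target."""
    if step_days <= 0:
        raise ValueError("step_days must be positive")
    if target_days > current_days:
        raise ValueError("target_days must not exceed current_days")
    steps = []
    value = current_days
    while value > target_days:
        # Never step below the target retention.
        value = max(target_days, value - step_days)
        steps.append(value)
    return steps


def commands_for_step(domain, policy_set, mgmt_class, days):
    """Build the admin commands for one reduction of one management class.

    All names passed in are placeholders chosen by the caller.
    """
    return [
        f"UPDATE COPYGROUP {domain} {policy_set} {mgmt_class} STANDARD "
        f"TYPE=BACKUP RETEXTRA={days} RETONLY={days}",
        f"VALIDATE POLICYSET {domain} {policy_set}",
        f"ACTIVATE POLICYSET {domain} {policy_set}",
        # Limit expiration to the domain that was just changed.
        f"EXPIRE INVENTORY DOMAIN={domain} WAIT=YES",
    ]


if __name__ == "__main__":
    # Hypothetical example: walk one class from 4 years down to 1 year.
    for days in retention_steps(1460, 365, 365):
        for cmd in commands_for_step("DOM1", "PS1", "MC1", days):
            print(cmd)
        print("/* check DB, log and expiration status before the next step */")
```

Running each step on a separate day, and watching log usage in between, keeps every expiration run small.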
 
Thanks for the reply, moon-buddy.

We have ~35 policy sets (we will remove 5-7 of these as they are no longer used).
We have ~175 management classes (we will remove 30-40 of those as they are no longer used).
We have ~450 nodes that need to be removed from TSM due to decommission.

A question I have: when I start to remove filespace(s) from a server that had a 2 TB database on it (full backup nightly for 4 years), will it slow TSM to a halt as it recursively goes through the DB2 database and removes "stuff"? I'm concerned that the impact to TSM will cause daily processing to degrade or even cease. Also, since I work closely with our SAN team, should I increase my active/archive log file sizes? I'm concerned that I will fill our logs quite quickly and will end up running a full backup every few hours just to clear the logs.
 
This should be manageable.

However, as you said, increase the log space; you may also need to run frequent DB backups.

As for deleting filespaces, if a node has multiple filespaces, delete them one after the other. Don't delete them all at the same time.

Alternatively, reduce retention policies slowly and as granularly as you can, thereby minimizing the impact on the system.
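
As a rough sketch of the "one filespace after the other" approach, the Python below writes out a macro for a single decommissioned node. The node and filespace names are invented, and the script only produces text for you to review. It assumes the macro is run from the administrative client, where WAIT=YES makes each DELETE FILESPACE finish before the next command starts, and it puts REMOVE NODE last because a node cannot be removed while it still owns filespaces.

```python
def quote_if_needed(name):
    """Wrap a filespace name in double quotes if it contains whitespace."""
    return f'"{name}"' if any(ch.isspace() for ch in name) else name


def decommission_macro(node, filespaces):
    """Return macro lines that delete a node's filespaces serially.

    node and filespaces are supplied by the caller; nothing is looked up
    on the server, so the list must come from your own inventory query.
    """
    lines = [
        f"DELETE FILESPACE {node} {quote_if_needed(fs)} WAIT=YES"
        for fs in filespaces
    ]
    # Only valid once every filespace above has been deleted.
    lines.append(f"REMOVE NODE {node}")
    return lines


if __name__ == "__main__":
    # Hypothetical node with three filespaces.
    for line in decommission_macro("OLDHOST01", ["/", "/home", "/data"]):
        print(line)
```

Feeding in a handful of nodes per day, rather than all of them at once, makes it easier to back off if the logs start filling.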
 
Thank you again for the feedback, moon-buddy.

We have a meeting with management, legal, and technical services to determine what we can and can't do. I'll post here and see if anybody else wants to contribute.

Thanks again!
QueueBall
 