Large NTFS File System, 88+ million files and growing...

bbrown00

I have just been informed that a document imaging system that went into production this week has data on several optical jukeboxes that will be migrating. We planned for the capacity on spinning disk, but the number of files was not known until today. There are 88+ million files already, not including growth. The files will be migrating over the next 18 months. The system is a W2K3 MSCS cluster with NTFS. I have a few questions about how to deal with this...

I have 2 TSM servers: one with 51,327,206 files on it (per 'select sum(num_files) from occupancy') and a 20 GB used DB, the other with 81,796,176 files on it and a 29 GB used DB.
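
For reference, I pull those counts from an admin command line; the same query grouped gives the per-node breakdown (the admin ID and password here are placeholders):

dsmadmc -id=admin -password=xxxxx "select node_name, sum(num_files) from occupancy group by node_name"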

1. If only 1 version is kept of the images, then between primary and copy pools I should have 177+ million objects for just the existing data, not including growth over the next year. I get roughly a 60-65 GB TSM DB for the existing data alone (my rough math is at the end of this post). Somewhere I read that you shouldn't go over 80 GB for a TSM database... is that true?

2. Has anybody ever used TSM journaling with a W2K3 NTFS system with 88+ million files? Judging from some of the posts I have read, that would be treading new water.

3. Any architecture suggestions, e.g. a completely separate TSM server, library, journaling config, etc.? The current architecture is 2 W2K3 TSM servers connected to an IBM 3584 library logically split down the middle, with 10 fibre-attached LTO3 drives (5 per server).
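
For what it's worth, here is the rough math behind the 60-65 GB figure in question 1, using the oft-quoted ballpark of a few hundred bytes of DB space per stored object (the exact per-object figures vary by version, so treat these as placeholders):

88.5M primary objects x ~500 bytes = ~44 GB
88.5M copy pool objects x ~200 bytes = ~18 GB
total = ~62 GB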
 
1 - There is a supposed limit of 300 GB, but I read somewhere that one site was over it in production. The problems you will see are with the BACKUP DB and expiration processes, though TSM says that DB growth has no impact on performance.

2 - I will not say that it will work, but I'd suggest you split these files across as many filespaces (Windows partitions) as possible. That's what we do here for our three data servers with 10M files each, and we've never had a problem with journaling.

You can also split your server into two different nodes, so that you can create 2 journaling services (see the example configs at the end of this post).

3 - The best answer to your problem is to consider HSM; it has been supported by TSM since version 5.3. You can also have a look at the image backup solution, but as of today you cannot restore a single file from it, so it's not an everyday solution. It would be very helpful for DR in your case, though...
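
For the two-node idea in point 2, each node would get its own options file and its own journal config, something like this (node names and drive letters are only examples):

* dsm_node1.opt
nodename IMAGING_1
domain e: f:

* dsm_node2.opt
nodename IMAGING_2
domain g: h:

and one tsmjbbd.ini per journal service with a matching filesystem list:

[JournaledFileSystemSettings]
JournaledFileSystems=e: f: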
 
I have a database that is over 150 GB that handles multiple servers with 5+ million files each (split them up so they're easier to handle).

If you can move data off the server onto tape via HSM, that's your best option. Otherwise, you may need to enable collocation on the tape-based backup pools so that a restore doesn't take weeks.
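
For the collocation, something like this on the primary tape pool (the pool name is just an example; on 5.3 you can also collocate by group):

update stgpool TAPEPOOL collocate=filespace

Collocating by filespace keeps each filespace's files on as few tapes as possible, which is what you want when a restore means millions of small files.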

-Aaron
 
bbrown00,

I have 2 document imaging systems (one by Siemens and one by IDX). They are currently 77 million files and 20 million files. The larger one is expected to at least double in the next year or two. I tried journaling (I had about 15 disks with 4-8 million files each), but decided to do imaging instead. Journaling, when it worked, was great. When it did not, it required a one-time regular incremental backup, which took almost a week to complete; then journaling would start working again. The problem was that journaling would stop about once a month for some reason, or it was time for a TSM upgrade, and then I had to repeat the entire thing.

Doing TSM's snapshot imaging was the solution. Just make sure that you leave some free space on each disk. I have had to set my snapshotcachesize between 10 and 20 (that is, 10% to 20% of the drive). I have set up a separate backup job for each disk using the command:

DSMC Backup Image x: -optfile=x:\tivoli\dsm.opt

where x is the drive letter. I found that backups finish faster if you have multiple smaller backups running as opposed to one big backup. You just have to spend more time up front setting it up.
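
The per-drive option files only need a few lines, roughly like this (node and server names here are placeholders):

* x:\tivoli\dsm.opt
nodename          IMAGING_X
tcpserveraddress  tsmsrv1
snapshotcachesize 15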
 
The problems you will see are with the BACKUP DB and expiration processes...

By the way, expiration performance is mostly related to how many inactive files you have. If 99% of your 88 million files just sit there without changing or being deleted, they'll be active versions and won't slow down expiration.

They'll certainly slow down your DB backup and audits and unloads/loads though ;)
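
If you want to see where a node stands, something like this shows the active/inactive split (fair warning: a select against the BACKUPS table takes a while on a DB this size, and the node name below is a placeholder):

select state, count(*) from backups where node_name='IMAGENODE' group by state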
 
I wouldn't consider HSM, especially on such a client. It will not affect incremental backup in any positive way, since the number of objects won't shrink on the client and will even increase on the server. The best shot, imho, is the "multiple node" approach: split the server into smaller portions and let TSM take 20-50 concurrent bites at the apple. The server shouldn't have a problem (unless you've got the entire DB on a single disk or something like that). If restore performance is a concern, consider image backup and/or keeping the primary pools for incrementals on disk.

PJ

P.S. If your imaging application reliably doesn't shift data around but simply adds files to an otherwise static filesystem, you can run "incremental by date" and throw in a standard incr for expiration once a week or so.
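
Something like a nightly

dsmc incremental x: -incrbydate

per drive, with a plain 'dsmc incremental x:' on the weekend to pick up deletions and let expiration do its work (drive letter is just an example).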
 
I have worked on TSM servers with databases over 250 GB; with sufficient processing power they have not proved to be an issue for expiration, etc.
 
Per Tivoli recommendation...

Ah well. Just forgive them. The development guys struggle along on hardware from the last century - what do they know about a DB restore from LTO3 or 4 when all they've got is an old P40 with an internal 4mm drive ;)

PJ
 