Large number of files

Mita201 (ADSM.ORG Senior Member, Beograd, Serbia):
I am facing a common problem with a large number of files, but I am not very experienced with the solutions to it:

I have a jfs2 filesystem that at the moment holds about 7 million relatively small files (up to a few hundred KB each). About 20,000 - 50,000 new files are generated daily at the moment, which will slowly decrease over the next year or two to about 5,000 files daily. I have journal-based backup working, the data goes to a disk pool, and I did a bit of TXNGROUPMAX optimization, but the incremental backup still takes about 18 hours. I expect that filesystem to eventually hold about 400 million files, maybe more. Is there any kind of optimization I can count on, or should I think about solving the issue some other way?
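For reference, a minimal sketch of the transaction-tuning knobs usually tried for this kind of workload. The option names (RESOURCEUTILIZATION, MEMORYEFFICIENTBACKUP, TXNBYTELIMIT, TXNGROUPMAX) are standard TSM options, but the values below are purely illustrative, not a recommendation for any particular box:

    * dsm.sys (client side) - illustrative values only
    RESOURCEUTILIZATION   10       * more producer/consumer sessions
    MEMORYEFFICIENTBACKUP yes      * scan one directory at a time
    TXNBYTELIMIT          25600    * larger client transactions, in KB

    * server side, to match the larger client transactions
    setopt TXNGROUPMAX 4096

Since journal-based backup already spares you the full filesystem scan, the per-file transaction overhead on the server is usually the next bottleneck, which is what the transaction options above address.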
 
If they are small files, you might want to look into making an image of the filespace.

One image is a single object in the TSM database instead of millions of pointers... thus keeping your TSM database smaller.

Be aware that you cannot restore one file from the image; you would have to restore the whole image. But if they are small files, it might work for your situation.
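For what it's worth, a sketch of how that looks with the backup-archive client (the filesystem name is hypothetical):

    dsmc backup image /data        (one image object for the whole filesystem)
    dsmc restore image /data       (all-or-nothing, as noted above)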
 
Well, it is one possible solution, but the files are not that small; as I said, some of them are several hundred kilobytes, so I think that filesystem will grow to terabytes soon, and backing up a multi-TB image still takes a lot of hours.
 
OK, this happened to me all the time at Honeywell, and I cannot tell you how many applications get built by boneheads who don't understand filesystem limitations. I had one situation where the application put so many files in a filesystem that the OS could not return an ls listing. This is very bad, since TSM relies on the same functionality to read the filesystem and scan the files for backup requirements. The crazy thing is that the problem caused the client to crash during backup but return a status of COMPLETED to the TSM server, so we didn't know it was failing before finishing the rest of the server... until a restore request came in. This was many versions of TSM ago (4.1), so that bug is fixed, but the underlying problem can be found in many environments.
 
Mmm, thanks, people. As I can see from your answers, I'll have to handle this some other way rather than finding an optimal way to back up that huge number of files per filesystem.
I was just thinking at the beginning that since it is a pretty good, new box (AIX 5.3 TL7), jfs2 should, at least according to the documentation, be able to handle that many files, but it won't work this way, as I can see.
Thank you once again.
 
Mita
Long time no speak -

You honestly have three options.

1. Segment your backups by alphanumerics
2. Break down your data by month/year and do archives
3. Incremental by date - also segment by alphanumerics

Segment your backups: [a-f], [g-n], [o-z].
Run twice daily until you are caught up.
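To make options 1 and 3 concrete, a minimal sketch assuming the data sits under a hypothetical /data filesystem with per-letter top-level directories; dsmc incremental, -subdir and -incrbydate are standard client features, but the paths and the split are illustrative only:

    # option 1: segment by leading letter, one schedule per group
    dsmc incremental -subdir=yes /data/a /data/b /data/c    # ... through f
    dsmc incremental -subdir=yes /data/g /data/h /data/i    # ... through n
    dsmc incremental -subdir=yes /data/o /data/p /data/q    # ... through z

    # option 3: incremental by date - skips the attribute comparison,
    # but misses deletions and attribute-only changes, so still run a
    # normal full incremental periodically
    dsmc incremental -incrbydate /data

The segmentation only pays off if each group runs as its own schedule or session, so the groups can overlap instead of running one after another.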

Steve
 
You should take a look at GPFS. It has a better chance of managing the number of files you are looking at, and the backup process is integrated into the file system.
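If you go down that road: GPFS ships an mmbackup command that drives the TSM backup-archive client for you and, in newer GPFS levels, can use the policy engine's fast inode scan instead of walking the directory tree, which is exactly the part that hurts at hundreds of millions of files. A minimal sketch, with a hypothetical filesystem name; check the mmbackup man page at your GPFS level:

    mmbackup /gpfs/data -t incremental

It still needs a working dsm.sys/dsm.opt on the node it runs from; it only replaces the scan and drives the client for you.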
 