fsck caused .BFS files to get moved to lost+found. How to find the file names?

alexp36

ADSM.ORG Member
Joined
Jun 14, 2018
Messages
14
Reaction score
0
Points
0
Need some help recovering from an AIX filesystem issue.
We unmounted the filesystem where our disk storage pool sits.
When trying to re-mount the filesystem, it wouldn't mount, and needed us to run an fsck.

After the fsck, all the TSM disk volume files are gone, and it looks like they've been moved to the lost+found directory.
All the file names have been changed to 2-3 digit numbers, like "65", or "100".

So, my understanding is we should be able to just rename them, and move them back out of the lost+found directory.
But, there are around 140 of them, and no way of knowing which lost file was which .BFS file.
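
If the original names can be established (for example with help from IBM support), the rename-and-move step itself is simple. Below is a minimal Python sketch, assuming a hand-built mapping from lost+found names to original volume names. The mapping entries, directory paths and function names are hypothetical placeholders, not values from this system.

```python
import os
import shutil


def plan_renames(lost_dir, target_dir, mapping):
    """Build (source, destination) pairs from a {lost_name: original_name} mapping.

    Raises instead of guessing if a source is missing or a destination exists.
    """
    plan = []
    for lost_name, original_name in sorted(mapping.items()):
        src = os.path.join(lost_dir, lost_name)
        dst = os.path.join(target_dir, original_name)
        if not os.path.isfile(src):
            raise FileNotFoundError(src)
        if os.path.exists(dst):
            raise FileExistsError(dst)
        plan.append((src, dst))
    return plan


def apply_renames(plan, dry_run=True):
    """Print each planned move; only touch the filesystem when dry_run is False."""
    for src, dst in plan:
        print("would move" if dry_run else "moving", src, "->", dst)
        if not dry_run:
            shutil.move(src, dst)
```

Running with `dry_run=True` first prints the plan without moving anything, which seems the safer default when the mapping was worked out by hand.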

Is there any way of reading a .BFS file at the AIX level? Any way of finding out what the filenames were?
We have a call open with IBM about this also, but so far they haven't been very helpful.

Thanks for any help or suggestions.
 

RecoveryOne

ADSM.ORG Senior Member
Joined
Mar 15, 2017
Messages
327
Reaction score
75
Points
0
Likely not what you are looking for but I would look at restoring your storage pool from copy pool.
lost+found is exactly that. It's where fsck puts files or fragments of files when it has no idea where they belong.
The numbers you are seeing should represent the inode in which those files/fragments lived.
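
To check whether the names really are inode numbers, one option is to list each entry next to its inode and size. This is a rough Python sketch; the function name and the idea of sorting by size are my own assumptions, and the inventory only helps if you have some record of the original volumes to compare against.

```python
import os


def list_lost_found(lost_dir):
    """Return one dict per regular file in lost_dir: name, inode, size, mtime."""
    rows = []
    for entry in os.scandir(lost_dir):
        if not entry.is_file(follow_symlinks=False):
            continue  # skip directories and anything that is not a regular file
        st = entry.stat(follow_symlinks=False)
        rows.append({
            "name": entry.name,
            "inode": st.st_ino,
            "size": st.st_size,
            "mtime": st.st_mtime,
            # True when fsck named the file after its inode number
            "name_is_inode": entry.name == str(st.st_ino),
        })
    rows.sort(key=lambda row: row["size"])
    return rows
```

Something like `for row in list_lost_found("/path/to/lost+found"): print(row)` would print the inventory; the path is a placeholder.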

If IBM could somehow read the headers of the file-based backup files, they might be able to determine which was what. As far as I know, working out what was what with the standard tools (strings, file, od, lquerypv) is next to impossible.
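
For what it's worth, peeking at the start of each file in the style of od and strings can be sketched in Python as below. It only shows the inspection mechanics. I don't know the layout of a .BFS volume, so whether anything identifiable shows up in the first bytes is an open question, and the helper names are made up for illustration.

```python
import re


def read_head(path, count=256):
    """Read the first `count` bytes of a file."""
    with open(path, "rb") as handle:
        return handle.read(count)


def printable_runs(data, min_len=4):
    """Return runs of printable ASCII at least min_len long, like `strings`."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [match.decode("ascii") for match in re.findall(pattern, data)]


def hex_dump(data, width=16):
    """Format bytes as offset, hex bytes and ASCII columns, similar to `od`."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{byte:02x}" for byte in chunk)
        text = "".join(chr(byte) if 32 <= byte < 127 else "." for byte in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3}} {text}")
    return "\n".join(lines)
```

Calling `print(hex_dump(read_head(path)))` on a few of the numbered files would at least show whether they share a recognizable header.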

I'm assuming the .bfs files are file-based volumes. To see what is damaged, I think you'd need to run an audit.

Since you said you've engaged with IBM support, I'd lean on their expertise.

I'd also be concerned about what caused the inconsistencies. Something such as https://www.ibm.com/support/pages/apar/IJ21577 could be a cause (just the first APAR that came to mind). Once you've identified the cause, effort should be made to correct it.

In my short time of being an AIX admin (8 years now) I've only had four events where I had to fsck a jfs2 filesystem. Three were on redundant vios after running an updateios or upgradeios command. One was on a production HA cluster that we actually had a SAN storage issue with and had to restore data from backup. Long story short, old NetApp had WAFL issues and the whole pool was lost.
 

alexp36

Thanks for that, your ideas all tally with what we have found. Unfortunately no copy pool in this instance, for, ahh, "reasons". My colleague has worked through it with IBM and managed to get the volumes back online now.

We pretty well know what caused the inconsistencies - there was a SAN disk issue about a week ago, which took out the disk for about 6 hours.
Oddly there had been an automatic filesystem recovery by AIX after the SAN issue was sorted last week, and all appeared to be okay after that.

It wasn't until we unmounted the filesystem for an unrelated reason, and tried to re-mount that we found we had problems.

I've had to run fsck's many times (around 20 years in AIX :) ), and never once actually "lost" any files.
I was fairly well stumped initially. fsck appeared to complete successfully, great, filesystem mounted okay, awesome, and then... no files. uhoh.

I'm actually on holiday right now, so I didn't get too involved, but I'll find out in detail what the procedure was when I'm back on Monday, and post some info here.

Cheers.
 

RecoveryOne

Ahh love "reasons".
SAN issues are a pain sometimes. And yeah, the only time fsck didn't work for me was when the underlying storage was just so messed up.
Glad everything got sorted.

Enjoy your holiday! Stop thinking about work.
 
