Windows node suddenly failing with open-file errors. Please help

huttivoli

TSM server: 5.5.1

Windows 2003 (cluster): TSM client 5.3.6.2, with TDP for Oracle as well

Last week the Wintel team patched the node with the updates below.


MS08-068: Vulnerability in SMB could allow remote code execution (957097)
MS08-069: Vulnerabilities in Microsoft XML Core Services could allow remote code execution (955218)
MS08-071: Vulnerabilities in GDI could allow remote code execution (956802)
MS08-073: Cumulative Security Update for Internet Explorer (958215)
MS08-078: Security Update for Internet Explorer (960714)- Critical

After this, the backup started failing with open-file errors.
Code:
12/31/2008 02:10:18 ANS1228E Sending of object '\\sg01act02abx525\c$\WINDOWS\Cluster\CLUSDB' failed
12/31/2008 02:10:18 ANS4987E Error processing '\\sg01act02abx525\c$\WINDOWS\Cluster\CLUSDB': the object is in use by another process
12/31/2008 02:10:21 ANS1228E Sending of object '\\sg01act02abx525\c$\WINDOWS\Cluster\CLUSDB.LOG' failed
12/31/2008 02:10:21 ANS4987E Error processing '\\sg01act02abx525\c$\WINDOWS\Cluster\CLUSDB.LOG': the object is in use by another process
12/31/2008 02:11:09 ANS1228E Sending of object '\\sg01act02abx525\c$\WINDOWS\system32\trace_tmnt_providers.log' failed
12/31/2008 02:11:09 ANS4037E Object '\\sg01act02abx525\c$\WINDOWS\system32\trace_tmnt_providers.log' changed during processing.  Object skipped.
12/31/2008 02:11:12 ANS1228E Sending of object '\\sg01act02abx525\c$\WINDOWS\system32\tssesdir\edb.log' failed
12/31/2008 02:11:12 ANS4987E Error processing '\\sg01act02abx525\c$\WINDOWS\system32\tssesdir\edb.log': the object is in use by another process
12/31/2008 02:11:12 ANS1228E Sending of object '\\sg01act02abx525\c$\WINDOWS\system32\tssesdir\edbtmp.log' failed
12/31/2008 02:11:12 ANS4987E Error processing '\\sg01act02abx525\c$\WINDOWS\system32\tssesdir\edbtmp.log': the object is in use by another process
12/31/2008 02:49:52 ANS1228E Sending of object '\\sg01act02abx525\c$\WINDOWS\system32\wbem\Logs\FrameWork.log' failed
12/31/2008 02:49:52 ANS4037E Object '\\sg01act02abx525\c$\WINDOWS\system32\wbem\Logs\FrameWork.log' changed during processing.  Object skipped.
12/31/2008 02:49:54 ANS1802E Incremental backup of '\\sg01act02abx525\c$' finished with 6 failure

12/31/2008 02:49:54 ANS1802E Incremental backup of '\\sg01act02abx525\c$' finished with 6 failure

12/31/2008 02:56:59 ANS1228E Sending of object '\\sg01act02abx525\e$\Oracle\ora92\network\log\fslshared1.log' failed
12/31/2008 02:56:59 ANS4037E Object '\\sg01act02abx525\e$\Oracle\ora92\network\log\fslshared1.log' changed during processing.  Object skipped.
12/31/2008 02:56:59 ANS1802E Incremental backup of '\\sg01act02abx525\e$' finished with 1 failure

12/31/2008 02:56:59 ANS1802E Incremental backup of '\\sg01act02abx525\e$' finished with 1 failure

12/31/2008 05:52:10 ANS1512E Scheduled event 'ABX_BACKUP_0200' failed.  Return code = 12.

According to the Wintel admins, these files are important and are required to restore the server.

The failures began only after the patches were applied.

Note:
The 6 open files listed above were created long ago, but all of them were last modified on the date the patches were applied.

At the moment I have no idea what to do, since these files can't be excluded.

I am planning to upgrade the client to the latest version, 5.5.1. If that does not work, I will install the LVSA utility (open file support).

Please suggest what steps I can follow to solve the issue.

Please help, it's very critical.
 
If you upgrade the client this should resolve the situation, but until then you need to exclude the CLUSDB and any other files that could be part of the system state that are giving you errors.
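
As a rough sketch of that interim exclude, lines like the ones below could go in the client options file (dsm.opt). The paths are taken from the error log in the first post, written with the local drive letter; adjust them to whatever your own log reports. This only stops the file-level backup attempt. Whether the files are still captured through system state / system services depends on the client level, so treat that part as an assumption to verify with a test restore.

```
* Interim excludes for cluster database files that fail with ANS4987E (in use)
EXCLUDE "C:\WINDOWS\Cluster\CLUSDB"
EXCLUDE "C:\WINDOWS\Cluster\CLUSDB.LOG"
```

After editing dsm.opt, restart the scheduler service so it rereads the options, and run `dsmc query inclexcl` to confirm the excludes appear in the list the client is actually using.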
 
The BA client should exclude the CLUSDB files (hard-coded) and back them up as part of the system services (Windows 2003).

The 5.3.6.x version is a special build to support Windows 2000 clients. It might not be appropriate for your Windows 2003 servers.

As ChadSmall recommended, test upgrading your BA client.

Rudy
 
Chad, can you clarify this for me? I had the same issue on an MSCS cluster, and every node had the "file in use" error message for CLUSDB. I read the Microsoft cluster configuration recommendations in Appendix D of the BA user guide and found nothing saying to exclude it. My Google search turned up little to nothing, except for a post on the old adsm.org site, so I configured the clients for open file support and had no further errors with it, but I am obviously concerned about file integrity.

I did read plenty of posts on various sites stating that during a DR test the TSM engineer was unable to recover the cluster nodes because the drive and system state restore did not recover CLUSDB and CLUSDB.LOG.

Any feedback/references would be great!

TSM Server: 5.4.1.2
BA Client: 5.5.0.6

Thanks
~Rick
 
Thanks for the answer.


Just want to clarify one more thing. In this situation, do I have to restart the client node after upgrading the TSM version? (Usually it's not required.) Also, how effective is the open file support utility (LVSA)? I read somewhere that you have to restart the client after installing LVSA. Is that true?
 
Thanks for the answer.


Just want to clarify one more thing. In this situation, do I have to restart the client node after upgrading the TSM version? (Usually it's not required.)

No, provided you stop all schedulers and CADs and you are not upgrading or adding LVSA support. Always test in a non-production environment.

Also, how effective is the open file support utility (LVSA)? I read somewhere that you have to restart the client after installing LVSA. Is that true?
There were bugs in the past, but with the latest MS patches and the latest BA client version it seems stable now. However, in most cases I do not see the value of backing up files that are in an inconsistent (open) state. Also, they are generally (though not always) temporary files.

You have to restart the system after installing LVSA.

Rudy
 
I have upgraded the client to 5.5.1 but am getting the same errors. I think these files need to be excluded. I am not sure how the backup was running fine before the Wintel team applied the MS patches to the system.


01/07/2009 20:17:08 ANS1038S Invalid option specified
01/07/2009 20:28:02 ANS1228E Sending of object '\\sg01act02abx525\c$\WINDOWS\Cluster\CLUSDB' failed
01/07/2009 20:28:02 ANS4987E Error processing '\\sg01act02abx525\c$\WINDOWS\Cluster\CLUSDB': the object is in use by another process
01/07/2009 20:28:05 ANS1228E Sending of object '\\sg01act02abx525\c$\WINDOWS\Cluster\CLUSDB.LOG' failed
01/07/2009 20:28:05 ANS4987E Error processing '\\sg01act02abx525\c$\WINDOWS\Cluster\CLUSDB.LOG': the object is in use by another process
01/07/2009 20:32:11 ANS1228E Sending of object '\\sg01act02abx525\c$\WINDOWS\system32\tssesdir\edb.log' failed
01/07/2009 20:32:11 ANS4987E Error processing '\\sg01act02abx525\c$\WINDOWS\system32\tssesdir\edb.log': the object is in use by another process
01/07/2009 20:32:13 ANS1228E Sending of object '\\sg01act02abx525\c$\WINDOWS\system32\tssesdir\edbtmp.log' failed
01/07/2009 20:32:13 ANS4987E Error processing '\\sg01act02abx525\c$\WINDOWS\system32\tssesdir\edbtmp.log': the object is in use by another process
01/07/2009 21:01:36 ANS1802E Incremental backup of '\\sg01act02abx525\c$' finished with 4 failure


Now I am not able to start the scheduler service. I tried reinstalling the scheduler too, but I get the same error.

Error message attached.

Please help
 

Attachment: schedule error.doc (107.1 KB)

Did you test it in a non-production environment?

The patches you posted seem to have nothing to do with this issue.

Let's see:
1. Is this failing backup the backup for one of your nodes or for a cluster group?
As you should know, a separate config should exist for each node and for each cluster group.
2. In which config (node or cluster group) are you using the 'clusternode' option for the BA client?
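
To illustrate what is meant by separate configs, here is a minimal sketch of the two options files. The node names and drive letters are hypothetical placeholders, not taken from this environment; only the option names (NODENAME, DOMAIN, CLUSTERNODE, PASSWORDACCESS) are real BA client options. Check the cluster appendix of the BA client guide for your client level before relying on it.

```
* dsm.opt for the physical node: local disks only (hypothetical values)
NODENAME        NODE_A
DOMAIN          C:
CLUSTERNODE     NO
PASSWORDACCESS  GENERATE

* Separate dsm.opt for the cluster group: shared disks only (hypothetical values)
NODENAME        CLUSTER_GROUP_1
DOMAIN          E:
CLUSTERNODE     YES
PASSWORDACCESS  GENERATE
```

The idea is that each physical node backs up its own local drives under its own node name, while each cluster group has its own scheduler and options file that follows the shared disks on failover.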

If your config is correct you might want to open a case with IBM support.

Rudy
 
Excluding CLUSDB and CLUSDB.log

I have just carried out a successful DR test of two MSCS environments, one on a v5.4.2 client and one on a 5.5.1.6 client. In both cases CLUSDB and CLUSDB.LOG are somehow excluded automatically by the client and backed up as part of the system state by TSM/VSS. I also have 2 other clusters in our environment that do not seem to exclude these files automatically and give the same error. The 2 that are erroring and failing to back up CLUSDB and CLUSDB.LOG are single-node clusters with a local quorum, and I just wondered if that is the reason why.

Following our successful DR test, I was going to put a manual exclude on these 2 nodes.
 