TSM for VE & snapshot causing VM to crash

mart2000

Joined Nov 30, 2012
PREDATAR Control23

We are running the following TSM for VE environment:
TSM server version 6.2.4.0 running on AIX 6.1 TL7 SP3.
The datamover server is Windows 2003 Enterprise SP2 with TSM client version 6.3.0 and TSM for VE version 6.3.0.

We have this configuration backing up VMs in 3 different VMware clusters. We are experiencing an intermittent issue in only one of these clusters. After the backup completes and TSM for VE is deleting the snapshot, the VM powers off and cannot be powered back on. One of the virtual disks is listed as having a size of zero bytes.

A call to VMware found that the descriptor file had changed and was no longer referencing the correct disk. Once this file is edited to point to the correct disk, we can power on the VM, take a quick snapshot, and then delete all snapshots to clean up. VMware says it looks like this file was either changed manually (unlikely) or updated by third-party software (probable, since it always happens after the TSM for VE backup completes). IBM so far has not found anything in the logs that points to the cause.

This issue has happened intermittently on a couple of test VMs over the past couple of months. It happened on a production VM 2 nights ago, and the server was down for 2 hours while we troubleshot with VMware. The cluster in question is used only for Oracle VMs, so all the VMs are running Windows 2003 with Oracle. Has anybody else experienced similar issues?
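For anyone who wants to check a descriptor file for the mismatch described above, here is a rough Python sketch. It assumes the usual text layout of a VMDK descriptor, where each extent line names its backing file in quotes. The function names, the sample descriptor, and the `datastore_files` mapping (file name to size in bytes, e.g. taken from a listing of the VM folder) are all made up for illustration; this is not something VMware or IBM supplied.

```python
import re

# Extent lines in a VMDK descriptor typically look like:
#   RW 41943040 VMFS "myvm-flat.vmdk"
EXTENT_RE = re.compile(r'^(RW|RDONLY|NOACCESS)\s+(\d+)\s+(\w+)\s+"([^"]+)"')


def parse_extents(descriptor_text):
    """Return (access, sectors, extent_type, filename) for each extent line."""
    extents = []
    for line in descriptor_text.splitlines():
        match = EXTENT_RE.match(line.strip())
        if match:
            access, sectors, extent_type, filename = match.groups()
            extents.append((access, int(sectors), extent_type, filename))
    return extents


def find_suspect_extents(descriptor_text, datastore_files):
    """List extents whose backing file is missing or zero bytes.

    datastore_files is a hypothetical dict of file name -> size in bytes.
    """
    problems = []
    for _access, _sectors, _type, filename in parse_extents(descriptor_text):
        size = datastore_files.get(filename)
        if size is None:
            problems.append((filename, "missing"))
        elif size == 0:
            problems.append((filename, "zero bytes"))
    return problems


# Hypothetical descriptor contents, for illustration only.
SAMPLE = '''# Disk DescriptorFile
version=1
CID=fffffffe
parentCID=ffffffff
createType="vmfs"

# Extent description
RW 41943040 VMFS "oravm01-flat.vmdk"
'''

if __name__ == "__main__":
    # The descriptor points at a file that is not in the folder listing.
    print(find_suspect_extents(SAMPLE, {"oravm01-000001-delta.vmdk": 1024}))
```

A check like this only flags a descriptor that references a missing or empty file. It does not say what changed the descriptor, and any manual correction should still be done with VMware support.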
 

Hello, I've had the same problem. Can anybody propose a solution?
 

I don't know exactly what is going on here, but in my opinion you should not use TSM for VE on guest VMs running Oracle.

I also question, and again this is just my two cents, why you would even run Oracle on guest VMs, and on Windows! My gut tells me that when TSM for VE takes that snapshot, the Oracle environment shuts down, bringing the entire system down with it. I believe this is a compatibility issue.

If I were to implement a backup for this type of environment, I would go with traditional TDP for Oracle, BA client backup and VDK disk image backup.
 

I know this is an old question, but I may have run into the same thing. I was testing TSM for VE in our test environment and got my hand slapped for crashing an Oracle cluster. Apparently there is something called the "multi-writer flag" in ESX for that cluster configuration that makes it incompatible with snapshots.
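If you want to see which disks carry that flag before pointing a snapshot-based backup at a VM, a small Python sketch like the one below can scan the text of a .vmx file. It assumes the flag shows up as a `scsiX:Y.sharing = "multi-writer"` entry, which is how it is commonly set for shared Oracle disks. The function name and the sample .vmx lines are hypothetical.

```python
import re

# Matches entries such as:  scsi1:0.sharing = "multi-writer"
# Only SCSI devices are considered here; that is an assumption of this sketch.
SHARING_RE = re.compile(
    r'^\s*(scsi\d+:\d+)\.sharing\s*=\s*"([^"]*)"', re.IGNORECASE
)


def multi_writer_disks(vmx_text):
    """Return the device IDs (e.g. 'scsi1:0') flagged as multi-writer."""
    disks = []
    for line in vmx_text.splitlines():
        match = SHARING_RE.match(line)
        if match and match.group(2).lower() == "multi-writer":
            disks.append(match.group(1))
    return disks


# Hypothetical .vmx excerpt, for illustration only.
SAMPLE_VMX = '''scsi0:0.fileName = "oravm01.vmdk"
scsi1:0.fileName = "shared-data.vmdk"
scsi1:0.sharing = "multi-writer"
'''

if __name__ == "__main__":
    print(multi_writer_disks(SAMPLE_VMX))
```

Any VM that returns a non-empty list is a candidate for exclusion from snapshot-based backups until the configuration has been checked with VMware.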
 