Evaluating Virtual Environment Backup Solution

rowl

I am looking for a solution to back up virtual machines. It needs to scale to at least 10,000 guests in the short term, with up to 20% annual growth. Ideally it should provide both VM and file level recovery, with file level recovery not being significantly different from the procedure we have been giving our operations group for years (open GUI, find file, click restore). We have both TSM and Avamar backup solutions in house, and both have offerings in this space that we are looking at. I don't want to limit my search to these two products just because we happen to already have them for other reasons.

When looking for your solution in this space, what products did you look at and why did you choose TSM over the rest? When I Google search for VM backup solutions there are a lot of hits, but other than the IBM, EMC, Veritas, Commvault names the rest are unknown to me. Are there other serious contenders?

Thanks,
-Rowl
 
I went through this process two years ago, when IBM initially introduced TSM for VE.

The POC, Test, PROD and DR cycles were all completed and it was a painful experience from a DR perspective. This was due to a defective implementation process with restore. IBM gave me a workaround then, and later resolved the issue. (search for my old posts about this)

Another lesson learned is the number of nodes to back up per TSM server (again, I have posted about this). It is detrimental to back up more than 300 or so nodes per TSM server. For 10,000 guests, that means about 34 TSM servers (10,000 / 300, rounded up).
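The sizing arithmetic above can be sketched in a few lines. This is only a rough illustration: the 300-node ceiling and the 10,000-guest figure come from this thread, the 20% growth is the upper bound from the original question, and the function names and the 500-node alternative are assumptions made for the example, not anything TSM provides.

```python
def servers_needed(guests: int, nodes_per_server: int) -> int:
    """Ceiling division: a partly filled server still has to exist."""
    return -(-guests // nodes_per_server)


def project(guests: int, growth_pct: int, years: int, nodes_per_server: int):
    """Yield (year, guests, servers) under compound annual growth.

    Integer arithmetic is used so the guest counts are exact.
    """
    for year in range(years + 1):
        # guests * (1 + growth)^year, rounded up to a whole guest
        current = -(-guests * (100 + growth_pct) ** year // 100 ** year)
        yield year, current, servers_needed(current, nodes_per_server)


if __name__ == "__main__":
    # 300 is the conservative figure from the thread; 500 is a hypothetical
    # looser limit to show how sensitive the server count is to it.
    for limit in (300, 500):
        print(f"--- {limit} nodes per TSM server ---")
        for year, guests, servers in project(10_000, 20, 3, limit):
            print(f"year {year}: {guests} guests -> {servers} servers")
```

Whether a given per-server limit is realistic still has to be checked against the RPO and RTO numbers; the sketch only shows how quickly the server count grows with the guest count.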

An EMC solution is also not feasible from a cost perspective. Likewise, a Commvault solution is akin to an IBM solution.

The way we run VM backups is to do the traditional BA client backup, do regular VMDK backups (via VMware Tools), and back up the VMDK images to TSM. This is a hybrid solution.

It is a lot of admin work, but it is a better and faster way for DR.

Of course, you have to define your priorities: 1) fast backup and slow recovery, 2) slow backup and fast recovery, or 3) equal backup and recovery speeds.
 
Thanks for the feedback. One of the mandates I have been given is to eliminate any in-guest clients. I have reservations about that, but it is what it is. The 300 nodes per TSM server figure is a bit alarming; I will have to look up your other threads on that one. Using in-guest clients today, I am able to back up this environment with 10 servers.
 
Still trying to wrap my mind around this architecture. So I am assuming you had to set up multiple TSM4VE vCenterBackup Servers and collections of Data Movers in a single ESX vCenter to accomplish this? As the environment has been explained to me here, we have a single vCenter defined for the environment in each data center. So ~5000 guests spread out across multiple ESX clusters managed by a single vCenter per data center.

That 300 node limit is concerning. Was that a limitation of the hardware at the time? I can't seem to find your post on that.

Thanks,
-Rowl
 

Yes.

The issue is more pronounced in environments that use tapes for online and DR pools. The figure of 300 may have been very conservative at the time I did my work. You can try to increase this to 500 as long as you can meet your RPO and RTO numbers.

Since my original work on TSM for VE back in Canada, I have not moved forward with the TSM 'way' but, as I mentioned, with a hybrid solution. It has proven to meet my RPO and RTO numbers.
 
I understand this is an older post, but it is relevant to me now. My environment is now 99% virtualized, and therefore I am looking into TDP for VMware.
However, my SAN team is hesitant to move forward with this architecture and would rather leave everything as it has been, with a BA client on every host (treating each as a physical server). Their reasoning: they think it will be intrusive to their environment, as an agent will be installed on each ESX server. It will also mean taking two backups, one at the VMware/image level and another at the application level (we have TDP for Domino, etc.).
But from what I have read, TDP for VMware is installed on the vStorage backup server, which is another physical or virtual machine that acts as a data mover. So there is nothing installed on each ESX server. (Am I correct?)
 
The newer TSM 7.1.1 (as claimed by IBM) has improved the TSM for VE backup 'experience' significantly.

The numbers cited for incremental backups (in a presentation given to our company) are something like 500 virtual nodes in 15 minutes - amazing! I still have to see this for myself. I am running a POC for this right now.

As for Domino backup, I would go with the hybrid solution: BA and TDP on the virtual node. I am hoping that you don't run Domino on all of your virtual servers! If that is the case, there is no need to run TSM for VE.

Exclude the Domino servers from the VE list during a TSM for VE backup.

Yes, you need Proxy servers for Windows and Linux - one for each OS platform. I believe one Proxy can handle 300 nodes 'safely'. I don't know the maximum it can handle before bringing up another Proxy.

Moreover, the new TSM for VE 7.1.1 can do file level restores from the image backups taken by TSM for VE. This is where OS-bound Proxies are needed: a Windows Proxy to restore Windows files and a Linux Proxy to restore Linux files.
 
Good info, moon-buddy!! So I guess I would need about 2-3 proxy servers (as I have 600+ VMs), and it is good to know that no agent will be installed on the ESX machines. Now I can tell my SAN team to back off!!

So my TSM environment is about 600+ VMs, currently protected by a single instance of TSM server running on Linux, with a DataDomain appliance (primary storage pool) in NFS mode. The copy pool is LTO that goes offsite to IM, as I don't have a second DataDomain to replicate data to.
I am doing about 6 TB a day. The DataDomain deduplication ratio is 3.5:1, and of course when I back up the storage pool to LTO, the data is rehydrated onto tape, which again amounts to 6 TB.
All TSM maintenance activity, along with client backups, fits in my 24/7 cycle.
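To put rough numbers on that daily flow, here is a small sketch. The 6 TB/day and the 3.5:1 ratio are the figures quoted above; the function names, the decimal units (1 TB = 1,000,000 MB), and the 8-hour tape window are assumptions picked purely for illustration.

```python
def daily_volumes(ingest_tb: float, dedup_ratio: float) -> dict:
    """Split one day's backup data into what lands where."""
    return {
        "front_end_tb": ingest_tb,                  # data sent by clients
        "data_domain_tb": ingest_tb / dedup_ratio,  # stored after deduplication
        "tape_copy_tb": ingest_tb,                  # rehydrated, so full size again
    }


def tape_throughput_mb_s(tape_copy_tb: float, window_hours: float) -> float:
    """Sustained rate needed to write the tape copy inside the window.

    Uses decimal units (1 TB = 1,000,000 MB); window_hours is hypothetical.
    """
    return tape_copy_tb * 1_000_000 / (window_hours * 3600)


if __name__ == "__main__":
    vols = daily_volumes(ingest_tb=6.0, dedup_ratio=3.5)
    for name, tb in vols.items():
        print(f"{name}: {tb:.2f} TB/day")
    rate = tape_throughput_mb_s(vols["tape_copy_tb"], window_hours=8)
    print(f"tape copy in an 8 h window needs about {rate:.0f} MB/s sustained")
```

The point of the split is that deduplication only helps the disk pool: the tape copy stays at the full front-end size, so any extra data from a change in backup method shows up in full on the LTO side.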

Now, before I move to TSM for VE, I want to make sure that using VMware VADP with CBT for incremental forever backups will not add any more data or lengthen the backup time; otherwise it will jeopardize fitting in all the TSM activity.

Also, the RPO for mail is currently set at 1 hour. I achieve that by running fulls on the weekend and then archiving transaction logs every hour, so that if a restore is needed I can roll back and reapply the logs up to the hour. As my Domino environment is also virtual, I would continue with the same TDP for Domino and BA backups and leave out the VE protection, as you mentioned.
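The full-plus-hourly-logs scheme can be sketched as follows, to show how many log archives a restore has to reapply and how much data is exposed at the restore point. The dates and the function name are hypothetical, and the sketch assumes the hourly archive schedule is aligned to the time of the full backup.

```python
from datetime import datetime, timedelta


def logs_to_replay(last_full: datetime, target: datetime,
                   interval: timedelta = timedelta(hours=1)) -> list:
    """Timestamps of the log archives taken after last_full, up to target."""
    if target < last_full:
        raise ValueError("restore target predates the last full backup")
    count = (target - last_full) // interval  # whole intervals elapsed
    return [last_full + interval * i for i in range(1, count + 1)]


if __name__ == "__main__":
    last_full = datetime(2015, 1, 3, 0, 0)   # hypothetical weekend full
    target = datetime(2015, 1, 7, 14, 30)    # hypothetical restore point
    logs = logs_to_replay(last_full, target)
    exposure = target - logs[-1] if logs else target - last_full
    print(f"{len(logs)} log archives to reapply, last one at {logs[-1]}")
    print(f"changes not covered by an archived log: {exposure}")
```

With hourly archives the uncovered window is always under one hour, which is what keeps the 1 hour RPO; the cost is that a restore late in the week has to reapply a long chain of logs.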

* Honestly, my management is against keeping TSM around any longer, as they hate IBM's poor customer relationship, and even more the PVU licensing hell. Each time the ESX hosts get upgraded the PVU number goes up, with no added benefit to backups. The only thing that prevents another backup technology, Veeam for instance, from replacing TSM is Domino: their MS-VSS integration is not compatible with it, and therefore they can only do a "crash-consistent" backup.
I have an Avamar deployment too, but that is for smaller satellite sites, as it is too expensive to replace TSM.
(Sorry for crying. I love TSM. It is superior to any other traditional backup tech like Networker, Simpana, Netbackup.)
 
The plan sounds reasonable. Go ahead and test the process, and if all is OK, move to PROD.

As for IBM's poor customer "relationship", I would say that IBM really has gone downhill in the past few years. Their size speaks to it: a huge employee base with shrinking sales drives them to cut corners. Competitors are so customer focused that they treat customers to lunch almost bi-weekly, if not weekly!

As for PVU pricing, why not look at per TB pricing? This may save you tons of money, and you would not have to worry about individual licenses like TSM for VE, TDP for Domino, etc.

Having said all of this, I am still a solid IBM solution guy. I have seen and used other products and I am still convinced that TSM is 'the product' to use. Period.
 