backup vm lan-free to tape performance issue

fhignz

Hi there,

our VMware environment consists of six ESX servers hosting about 100 VMs, all located on SAN storage.
Installed within this environment is an off-host W2K8 Server as a backup proxy, which has access to the VMware SAN storage.
It also has a second SAN connection to the backup SAN, which consists of a couple of LTO-4 tape drives in a tape library.
Installed software on the proxy server:
Windows B/A client 6.3.0 with VMware component, TSM for VE 6.3.0, TSM Server 6.1.4.3, IBMTape 6.2.1.8

Then I perform a "backup vm" with the Win B/A GUI to a node defined on the locally installed TSM Server, VM size 200 GB:
1* when writing directly to tape I get network rates of around 140 MB/sec but an aggregate of only 20 MB/sec
2* when writing to a disk volume I get network rates of 100 MB/sec and an aggregate of 90 MB/sec
3* migrating the disk volume to tape is done with an average rate of 80 MB/sec

ad 1*: during the backup I notice a sort of "saw tooth" curve: data is read from disk at around 140 MB/sec for one second (peak), followed by 2-3 seconds of zero transfer (valley), then another one-second peak of 140 MB/sec, and so on.
ad 2*: during this backup I notice a smooth backup rate with no peaks and valleys. The TSM Server disk volume is located on the same physical disk array as is the VM.
ad 3*: during the migration from disk to tape I notice a saw tooth curve again, but the peaks are much wider, e.g. a transfer rate of 120 MB/sec for approx. 10 seconds, followed by a valley of 2-3 seconds, and so on.
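
As a rough sanity check on those saw tooth curves, here is a minimal Python sketch (not part of TSM; the function name is made up for illustration). It averages a repeating peak/valley pattern, assuming the transfer rate is exactly zero in the valleys and that the eyeballed durations are accurate:

```python
def sawtooth_average(peak_mb_s, peak_seconds, valley_seconds):
    """Average rate of a repeating pattern: full rate during the peak,
    zero transfer during the valley (an assumption, not a measurement)."""
    return peak_mb_s * peak_seconds / (peak_seconds + valley_seconds)


for valley in (2, 3):
    direct = sawtooth_average(140, 1, valley)    # case 1*: backup directly to tape
    migrate = sawtooth_average(120, 10, valley)  # case 3*: disk-to-tape migration
    print(f"valley {valley}s: direct ~{direct:.0f} MB/sec, migration ~{migrate:.0f} MB/sec")
```

Under these assumptions the model gives roughly 35-47 MB/sec for the direct-to-tape case and 92-100 MB/sec for the migration. Both are above the observed 20 and 80 MB/sec, so the real valleys are probably somewhat longer (or the peaks shorter) than they look on the graph.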

Any ideas or hints of what to look at are highly welcome.

Friedrich
 
Welcome to my former nightmare!

I have seen this problem, worked with IBM on it, and found a workaround.

Disclaimer: This was on TSM BA 6.2.3 and TSM for VE 6.2.

I thought they had addressed this in later versions of TSM for VE and the BA client, but it looks like it is still there. Here is what I have learned (and understood) in the 3 months of working on this problem.

The issue is the way IBM 'chunks' the data file - the image portion of the VMware data. The image is chunked into 135 MB pieces and, according to R&D, this is not 'tunable'. Each chunk is accompanied by a small CTL or control file. This file tells TSM how the image chunk is organized in relation to the rest of the chunks.

The image file, or DAT file if you will, is then paired with the CTL file and sent off to tape or disk. The performance hit will not be seen when writing to disk. The hit comes when you back up directly to tape and when you restore from tape. The images are stored like this (or something similar): CTL-DAT-CTL-DAT--- etc. My understanding at that time was that the algorithm that does this needs tweaking.
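
To get a feel for how many CTL/DAT pairs are involved, here is a small back-of-the-envelope sketch in Python. It is only an illustration: the function names are made up, it assumes 1 GB = 1024 MB, and it uses the figures from the first post (200 GB VM, 140 MB/sec peak, 20 MB/sec aggregate). The "stall per chunk" is simply the extra elapsed time spread evenly over all chunks, which is an assumption about where the time goes, not something IBM confirmed.

```python
import math

CHUNK_MB = 135    # chunk size quoted above
MB_PER_GB = 1024  # assumption: binary gigabytes


def chunk_count(vm_size_gb):
    """Number of 135 MB DAT chunks (each with its own CTL file) for a VM image."""
    return math.ceil(vm_size_gb * MB_PER_GB / CHUNK_MB)


def implied_stall_per_chunk(vm_size_gb, streaming_mb_s, aggregate_mb_s):
    """Seconds lost per chunk if all the time beyond pure streaming
    is attributed evenly to per-chunk overhead (an assumption)."""
    total_mb = vm_size_gb * MB_PER_GB
    elapsed = total_mb / aggregate_mb_s
    streaming = total_mb / streaming_mb_s
    return (elapsed - streaming) / chunk_count(vm_size_gb)


print(chunk_count(200))
print(round(implied_stall_per_chunk(200, 140, 20), 1))
```

For the 200 GB VM this comes to about 1,500 CTL/DAT pairs, so even a few seconds of overhead per pair adds up to well over an hour.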

The resolution:

Store separately the CTL and DAT files!

How was it resolved, following IBM's suggestion?

In the dsm.opt file, a parameter is introduced: VMCTLMC, which points to an MC (management class) of its own whose data is stored on DISK (ONLY!) and backed up to tape. For DR, that data must be returned to disk for a fast restore; for a normal restore, it can be left on disk or primary tape.
 
Yes, we are using two MCs in dsm.opt. One is VMDATAMC for backing up VM data to tape, and the other is VMCTLMC for storing data on DISK.
But the LAN-free speed is not satisfactory. The LAN-free transfer speed is about 1 GB per minute, as shown in the BA GUI.
This afternoon we ran one backup test.

Backing up one VM of about 30 GB took 25 minutes.

Does anybody have real-world transfer speeds to share with us?
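
For comparison with the MB/sec figures quoted earlier in the thread, the numbers above can be converted with a few lines of Python. This is just arithmetic on the values in this post; the function name is made up, and 1 GB = 1024 MB is an assumption.

```python
MB_PER_GB = 1024  # assumption: binary gigabytes


def mb_per_sec(size_gb, minutes):
    """Convert 'size in GB over a number of minutes' into MB/sec."""
    return size_gb * MB_PER_GB / (minutes * 60)


print(f"{mb_per_sec(30, 25):.1f} MB/sec")  # the 30 GB / 25 minute test
print(f"{mb_per_sec(1, 1):.1f} MB/sec")    # 1 GB per minute, as shown in the GUI
```

That works out to roughly 17-20 MB/sec, the same ballpark as the 20 MB/sec aggregate reported in the first post.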



We are using TSM Server/BA client 6.3 and TSM VE 6.3.
We use one proxy server on Windows 2008, which has two HBAs. One is for reading data from VMware storage and the other is for writing data to the tape library.
 
I think the real issue now is with LAN Free.

LAN Free does not work fast with small files. In this case, I still consider 135 MB chunks to be small files. If TSM chunked the VMware images into bigger chunks, I believe the LAN Free backup would run faster.
 
Thanks for the replies. We are also using two MCs, one for the tape path and the other for the ctl-files going to and staying on DISK.

A small improvement was achieved by increasing txnbytelimit to 2g on the client side. A higher value is not possible with our server version (6.1.x), though. With 6.2 servers, 10g is possible and should be set, and with 6.3 clients AND 6.3 servers in combination, txnbytelimit=30g will be possible.
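
To illustrate why a larger txnbytelimit might help, here is a rough Python sketch of how many 135 MB chunks could fit into one transaction at each limit. It is a simplification under stated assumptions: the function name is made up, the small CTL files are ignored, 1g is taken as 1024 MB, and other limits (for example the server's TXNGROUPMAX setting) are not considered.

```python
CHUNK_MB = 135  # chunk size mentioned earlier in this thread


def chunks_per_transaction(txnbytelimit_gb):
    """Whole 135 MB chunks that fit under the byte limit (CTL files ignored)."""
    return (txnbytelimit_gb * 1024) // CHUNK_MB


for limit_gb in (2, 10, 30):
    print(f"txnbytelimit={limit_gb}g -> up to {chunks_per_transaction(limit_gb)} chunks per transaction")
```

Going from 2g to 30g would raise that from about 15 to about 227 chunks per transaction, which means far fewer commits for the same VM, assuming the per-transaction overhead is what hurts on tape.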

I am again using a storage agent on the Windows server box, "hotadd"-ing the VMware image, and get data rates of around 40 MB/sec (which of course is better than mentioned above but still poor ...). VMware copies VMs within one disk storage system at rates of 400 MB/sec, and LTO-4 tapes can write 120 MB/sec uncompressed, so where is the bottleneck?
 
Yes, if we use the VBS server to back up files in LAN-Free mode, it's fast.
But when backing up a VM, it's slow, so where is the bottleneck?
I'm looking forward to the answers, thanks.
 