Subject: Re: [Veritas-bu] VMware/SAN backups - Speed Results
From: "Marelas, Peter" <Peter.Marelas AT team.telstra DOT com>
To: Jorge Fábregas <jorge.fabregas AT gmail DOT com>, "Veritas-bu AT mailman.eng.auburn DOT edu" <Veritas-bu AT mailman.eng.auburn DOT edu>
Date: Mon, 7 Mar 2011 08:29:54 +1100
I would be careful with your testing strategy. When performing this type of
test you need to avoid buffer bias. This is where, on subsequent runs, data is
read from memory/filesystem buffers rather than from disk, and that caching is
what accounts for the apparent performance gains.
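
A quick way to see the effect on the backup host itself (a rough sketch of my
own, in Python; it only exercises the OS/filesystem cache, array and ESX caches
are a separate matter): read the same large file twice and compare the rates.

# Read a large test file twice; a much faster second pass means the data came
# from cache rather than disk, and repeat-run benchmark numbers will be inflated.
import sys, time

def read_rate(path, block=1024 * 1024):
    """Stream the file once and return the rate in MB/s."""
    start = time.monotonic()
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block)
            if not chunk:
                break
            total += len(chunk)
    return total / (1024 * 1024) / (time.monotonic() - start)

path = sys.argv[1]      # any large file, e.g. a copy of the test VMDK
cold = read_rate(path)  # first pass: hopefully from disk
warm = read_rate(path)  # second pass: likely served from cache
print(f"cold: {cold:.0f} MB/s, warm: {warm:.0f} MB/s")
if warm > cold * 1.5:
    print("second pass is much faster -- repeat runs are probably cache-biased")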

Regards
Peter Marelas

-----Original Message-----
From: veritas-bu-bounces AT mailman.eng.auburn DOT edu 
[mailto:veritas-bu-bounces AT mailman.eng.auburn DOT edu] On Behalf Of Jorge 
Fábregas
Sent: Monday, 7 March 2011 3:49 AM
To: Veritas-bu AT mailman.eng.auburn DOT edu
Subject: [Veritas-bu] VMware/SAN backups - Speed Results

Hi everyone,

I recently finished doing a hardware refresh of our master/media server.
The main reason for the refresh was that our previous server was really
old (with 1Gbps HBAs) and we wanted to start doing VMware backups thru
the SAN (thru our 8Gbps SAN fabric).  I just finished doing some tests
after tweaking NUMBER_DATA_BUFFERS and SIZE_DATA_BUFFERS and wanted to
share the results with you, but first some quick facts:

1) The master server is also the media server, running Win2008 R2 with NBU
7.1 (from the FA program), so it is also our VMware backup host.

2) Tape drives are LTO4 (HP StorageWorks 1840)

3) One 8Gbps HBA is zoned to the tape drives and the other 8Gbps HBA is
zoned to the disk array.  Each HBA goes to a different fabric.  Basically
I'm reading from disk with one HBA and writing to tape with the other HBA.

4) I'm not using MPIO on the server; the LUNs (VMFS) are presented to
the server thru only one target, so the server just sees a single path for
each LUN presented to it.  Yes, we have no redundancy for our paths.

5) Backup payload:  1 VM with a single VMDK of 22.3 GB.

6) The tests were performed on an isolated datastore (no other VMs were
running there).  I know this is not a realistic scenario (where you have
dozens of VMs running on a datastore). My main purpose in this test was
to push the limits of our tape drives (and not to introduce any external
factors).

7) We're using vSphere 4.1

8) The storage unit has a fragment size of 50GB.  This is a storage unit
dedicated to VMware backups (full VM backups).  For our regular backups
(file backups) of our physical servers we use a 2GB fragment size.
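
For reference, each run below just changes the two touch files before kicking
off the backup.  A minimal sketch in Python (it assumes the default Windows
install path for the media server; adjust config_dir for your host):

# Write the NUMBER_DATA_BUFFERS / SIZE_DATA_BUFFERS touch files that bptm reads
# when a job starts.  SIZE_DATA_BUFFERS is in bytes, NUMBER_DATA_BUFFERS is a count.
from pathlib import Path

config_dir = Path(r"C:\Program Files\Veritas\NetBackup\db\config")  # assumed default
config_dir.mkdir(parents=True, exist_ok=True)

def set_buffers(number, size_bytes):
    (config_dir / "NUMBER_DATA_BUFFERS").write_text(f"{number}\n")
    (config_dir / "SIZE_DATA_BUFFERS").write_text(f"{size_bytes}\n")

set_buffers(32, 64 * 1024)   # first combination in the results below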


Here are the results:

-----------------------------------------------------------------
NUMBER_DATA_BUFFERS: 32, SIZE_DATA_BUFFERS: 64k ----> 62 MB/sec

bptm   waited for full buffer 10926 times, delayed 22060 times.
bpbkar waited 10187 times for empty buffer, delayed 10520 times.

write time: 00:05:40

----------------------------------------------------------------
NUMBER_DATA_BUFFERS: 32, SIZE_DATA_BUFFERS: 256k ---> 144 MB/sec

bptm   waited for full buffer 882 times, delayed 2173 times.
bpbkar waited 2345 times for empty buffer, delayed 2509 times.

write time: 00:02:59

----------------------------------------------------------------
NUMBER_DATA_BUFFERS: 32, SIZE_DATA_BUFFERS: 512k ---> 151 MB/sec

bptm   waited for full buffer 187 times, delayed 1230 times
bpbkar waited 2942 times for empty buffer, delayed 3121 times.

write time: 00:02:44

---------------------------------------------------------------
NUMBER_DATA_BUFFERS: 32, SIZE_DATA_BUFFERS: 1MB ----->  N/A

I got the following error:

3/6/2011 9:49:25 AM - Error bptm(pid=3284) The tape device at index -1
has a maximum block size of 524288 bytes, a buffer size of 1048576
cannot be used


OK, I reached the maximum block size for my tape drive, so from now on
I'll only change the number of data buffers and leave
SIZE_DATA_BUFFERS at 512k.
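
Just to make that constraint explicit, a rough sketch (Python) of the check the
error above is enforcing, using the 524288-byte maximum my drive reported:

# SIZE_DATA_BUFFERS must fit within the drive's maximum block size; the usual
# tuning guidance also keeps it a multiple of 1024.
DRIVE_MAX_BLOCK = 524288   # from the bptm error message above

def usable(size_bytes):
    return size_bytes <= DRIVE_MAX_BLOCK and size_bytes % 1024 == 0

for candidate in (64 * 1024, 256 * 1024, 512 * 1024, 1024 * 1024):
    print(f"SIZE_DATA_BUFFERS={candidate}: {'ok' if usable(candidate) else 'rejected'}")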

---------------------------------------------------------------
NUMBER_DATA_BUFFERS: 64, SIZE_DATA_BUFFERS: 512k ----> 152 MB/sec

bptm   waited for full buffer 184 times, delayed 1241 times
bpbkar waited 2614 times for empty buffer, delayed 2792 times.

write time: 00:02:43

----------------------------------------------------------------
NUMBER_DATA_BUFFERS: 96, SIZE_DATA_BUFFERS: 512k ----> 153 MB/sec

bptm   waited for full buffer 157 times, delayed 1037 times
bpbkar waited 2667 times for empty buffer, delayed 2874 times.

write time: 00:02:45

-----------------------------------------------------------------
NUMBER_DATA_BUFFERS: 256, SIZE_DATA_BUFFERS: 512k ---> 169 MB/sec

bptm   waited for full buffer 18 times, delayed 63 times
bpbkar waited 2712 times for empty buffer, delayed 2925 times.

write time: 00:02:29

------------------------------------------------------------------
NUMBER_DATA_BUFFERS: 512, SIZE_DATA_BUFFERS: 512k ---> 170 MB/sec

bptm   waited for full buffer 4 times, delayed 9 times
bpbkar waited 2637 times for empty buffer, delayed 2847 times.

write time: 00:02:28

-------------------------------------------------------------------
NUMBER_DATA_BUFFERS: 576, SIZE_DATA_BUFFERS: 512k --> 169 MB/sec

bptm   waited for full buffer 1 times, delayed 9 times
bpbkar waited 2632 times for empty buffer, delayed 2870 times.

write time: 00:02:29

-------------------------------------------------------------------
NUMBER_DATA_BUFFERS: 608, SIZE_DATA_BUFFERS: 512k --> 168 MB/sec

bptm   waited for full buffer 0 times, delayed 0 times
bpbkar waited 2593 times for empty buffer, delayed 2857 times.

write time: 00:02:29

-----------------------------------------------------------
NUMBER_DATA_BUFFERS: 640, SIZE_DATA_BUFFERS: 512k --> 169 MB/sec

bptm   waited for full buffer 0 times, delayed 0 times
bpbkar waited 2678 times for empty buffer, delayed 2904 times.

write time: 00:02:29

---------------------------------------------------------------------
NUMBER_DATA_BUFFERS: 768, SIZE_DATA_BUFFERS: 512k --> 174 MB/sec

bptm   waited for full buffer 0 times, delayed 0 times
bpbkar waited 2514 times for empty buffer, delayed 2646 times.

write time: 00:02:23

---------------------------------------------------
REPEATING PREVIOUS TEST:
NUMBER_DATA_BUFFERS: 768, SIZE_DATA_BUFFERS: 512k --> 169 MB/sec

bptm   waited for full buffer 0 times, delayed 0 times
bpbkar waited 2618 times for empty buffer, delayed 2844 times.

write time: 00:02:29

------------------------------------------------------------------------
NUMBER_DATA_BUFFERS: 1024, SIZE_DATA_BUFFERS: 512k --> 169 MB/sec

bptm   waited for full buffer 0 times, delayed 0 times
bpbkar waited 2683 times for empty buffer, delayed 2905 times.

write time: 00:02:28
------------------------------------------------------------------------------

As you can see, I reached my sweet spot with NUMBER_DATA_BUFFERS: 608
(and 512k block-size) where bptm (parent) didn't have to wait for full
buffers to pass along to the tape device driver.
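
Reading those counters the usual way: bptm waiting for full buffers means the
read side can't keep up, while bpbkar waiting for empty buffers means the
write/tape side is the limit.  A rough sketch (Python, my own parsing, not a
NetBackup tool) of pulling the counts out of a pair of log lines:

# Extract the "waited ... times" / "delayed ... times" counts and report which
# side of the shared buffer pool is doing the waiting.
import re

def waits(line):
    nums = [int(n) for n in re.findall(r"(\d+) times", line)]
    return {"waited": nums[0], "delayed": nums[1]}

bptm   = waits("bptm   waited for full buffer 0 times, delayed 0 times")
bpbkar = waits("bpbkar waited 2593 times for empty buffer, delayed 2857 times.")

if bptm["waited"] > bpbkar["waited"]:
    print("bptm starves for data -> the read (disk/VMware) side is the bottleneck")
elif bpbkar["waited"] > bptm["waited"]:
    print("bpbkar waits on empty buffers -> the write (tape) side is the bottleneck")
else:
    print("both sides wait about equally")

With the 608-buffer run above, bptm never waits while bpbkar still waits around
2,600 times, which suggests the tape side, not the disk reads, is what caps the
job at roughly 170 MB/sec.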

This was my very first time tweaking any performance parameter at all in
NetBackup and, as you can imagine, I'm really surprised/amazed at the
speeds I'm getting.  I never thought I could reach 170 MB/sec!  I'm
really looking forward to starting the VMware backups (in just a
couple of days).

Now, I need to do some restore tests :)

All the best,
Jorge

_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu