Subject: Re: [Networker] 10 GbE performance tuning on Solaris client/Storage Nodes
From: Magnus Berglund <belmagnus AT GMAIL DOT COM>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Thu, 12 Nov 2009 09:56:43 +0100
Sorry for the late reply. The info below comes from our documentation, based on lab
testing on Solaris 10 U6. Since I'm not a UNIX technician, I can't tell exactly what
each value does and/or whether some of the values have the potential of breaking
your system.

Most of the info was collected from various tuning guides for Legato NetWorker
and Data Domain when using Solaris 10 and NFS hard mounts.

*Jumbo frames/MTU 9000*

Two changes are needed to get this to work: enabling jumbo frames in the OS
and setting the MTU on the network interface.

In the file /kernel/drv/nxge.conf, add (or uncomment) the following line:
accept_jumbo = 1;

And in the matching /etc/hostname.* file, add the MTU after the hostname:
hostname mtu 9000
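
As a side note (this is my assumption, not from our documentation): the MTU can also
be applied to a running interface and checked without a reboot. Assuming the interface
is nxge0 (adjust to your own device name):

# ifconfig nxge0 mtu 9000
# ifconfig nxge0

The second command should report mtu 9000 for the interface once the change has taken
effect.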

*TCP/IP and kernel parameters for 10GbE*
A lot of parameters needed tuning to get decent performance from 10 GbE.

The file /etc/init.d/nddconfig had the following changes:
tcp_max_buf=L:2097152
tcp_cwnd_max=L:2097152
tcp_xmit_hiwat=L:400000
tcp_recv_hiwat=L:400000
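
The same buffer values can also be applied and read back at runtime with ndd; a sketch
assuming the stock Solaris 10 TCP driver at /dev/tcp and the values listed above:

# ndd -set /dev/tcp tcp_max_buf 2097152
# ndd -set /dev/tcp tcp_cwnd_max 2097152
# ndd -set /dev/tcp tcp_xmit_hiwat 400000
# ndd -set /dev/tcp tcp_recv_hiwat 400000
# ndd -get /dev/tcp tcp_recv_hiwat

Values set with ndd on the command line do not survive a reboot, which is why they are
kept in the nddconfig startup script.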

The file /etc/system had the following kernel parameters set:
set ip:ip_soft_rings_cnt=16
set nxge:nxge_jumbo_enable=1
set nxge:nxge_bcopy_thresh=1024
set ip:tcp_squeue_wput=1
set ddi_msix_alloc_limit=8
set hires_tick=1
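
These /etc/system settings only take effect after a reboot. If you want to confirm a
value on the running kernel afterwards, one way (my assumption, not from our docs) is
to read the kernel variable back with mdb, for example:

# echo "hires_tick/D" | mdb -k
# echo "ddi_msix_alloc_limit/D" | mdb -k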

Changes to /etc/system kernel parameters to enable a 1 MB block size in NFS:
set nfs:nfs3_bsize=1048576
set nfs:nfs3_max_transfer_size_cots=1048576
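
On the client side, the NFS mount of the Data Domain export should then use matching
1 MB read/write sizes and a hard mount. A sketch with made-up names (dd690:/backup and
/nsr/dd690 are placeholders for your own export and mount point):

# mount -F nfs -o hard,vers=3,rsize=1048576,wsize=1048576 dd690:/backup /nsr/dd690

The NetWorker adv_file device is then created on top of that mount point.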

On Mon, Nov 9, 2009 at 8:44 PM, Ray Pengelly <pengelly AT queensu DOT ca> wrote:

> Could you share any ndd or /etc/system tunings you have done?
>
> I have discussed this topic with a colleague who is on the HPC side of
> things with M9000 servers, and he also stated that jumbo frames give the biggest
> speedup.
>
> Since this is a direct connection, jumbo frames shouldn't be an issue for me
> either.
>
> Ray
>
> Magnus Berglund wrote:
>
>> I have had some luck getting better throughput after enabling jumbo
>> frames and setting the MTU size to 9000. According to our UNIX techs, 9000 is not
>> an optimal value for the network card in our box, but we use it for a number of
>> different reasons.
>>
>> We have also done some other tuning, but jumbo frames and the MTU made the
>> big difference in our environment in getting the speed above 100 MB/s.
>>
>> Sun T5140, Solaris 10 (U7) with 10 GbE, 2 x 10 GbE in failover
>> (active/passive)
>> Using a DD690 as an adv_file type device (NFS), NW 4 x 1 GbE (all active)
>>
>> During testing we saw that we had to use a number of mount points to get the
>> best performance, since each "NFS stream" topped out at "1 Gb speed" even though
>> the link between the storage node and the target system is 10 GbE.
>>
>> At the moment I can write from a Legato storage node at around 380 MB/s against
>> the DD box, and that is more or less what it is able to receive over its
>> current 4 x 1 GbE interfaces using NFS.
>>
>> //Magnus
>>
>> On Tue, Nov 3, 2009 at 8:54 PM, Ray Pengelly <pengelly AT queensu DOT ca> 
>> wrote:
>>
>>
>>
>>> Hey everyone,
>>>
>>> I currently have a Sun M5000 with an ixgbe 10 GbE card directly
>>> connected to a Sun X4540 system with an nxge 10 GbE card.
>>>
>>> I am able to read from disk at roughly 930 MB/s using the uasm tool:
>>>
>>> # time uasm -s ./testfile10g >/dev/null
>>>
>>> real    0m11.785s
>>> user    0m0.382s
>>> sys     0m11.396s
>>>
>>> If I do this over NFS I am only able to get about 104 MB/s:
>>>
>>> # time uasm -s ./testfile10g >/mnt/usam-out
>>>
>>> real    1m38.888s
>>> user    0m0.598s
>>> sys     0m44.980s
>>>
>>> Using NetWorker I see roughly the same numbers with the X4540 acting as a
>>> Storage Node using adv_file devices on a zpool. I know that neither the client
>>> nor the server filesystem is the bottleneck.
>>>
>>> Both links show up as 10000 full duplex via dladm show-dev.
>>>
>>> Has anyone been through performance tuning 10 GbE on Solaris 10? Any
>>> notes/recipes?
>>>
>>> Anyone gotten better throughput than this?
>>>
>>> Ray
>>>
>>>
>>>
>>>
>>>
>>> --
>>> Ray Pengelly
>>> Technical Specialist
>>> Queen's University - IT Services
>>> pengelly AT queensu DOT ca
>>> (613) 533-2034
>>>
>>
>
>
> --
> Ray Pengelly
> Technical Specialist
> Queen's University - IT Services
> pengelly AT queensu DOT ca
> (613) 533-2034
>
>

To sign off this list, send email to listserv AT listserv.temple DOT edu and 
type "signoff networker" in the body of the email. Please write to 
networker-request AT listserv.temple DOT edu if you have any problems with this 
list. You can access the archives at 
http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER