Subject: Re: [ADSM-L] getting performance from nfs storage over 10 gb link
From: "Lee, Gary" <glee AT BSU DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Thu, 4 Feb 2016 20:22:21 +0000
For Ken:

Can you send me the pdf itself?
Can't seem to get it to display or download here.



-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
Mueller, Ken
Sent: Wednesday, February 03, 2016 3:14 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] getting performance from nfs storage over 10 gb link

I will throw this document into the ring as well. 

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.641.1965&rep=rep1&type=pdf#page=169

Even though it's a bit dated, it hits on a lot of the elements that go
into getting the most performance out of your 10G network adapter. 

As previously noted, the ability of your NFS server to deliver data
anywhere close to filling a 10G pipe will depend heavily on its disk
configuration and I/O access patterns.  That said, you mentioned the
performance is worse at 10G than with a 1G connection.  Have you looked
at your CPU load while your migration/reclamation processes are
running?  You stated you can get about 7 Gb/s throughput using iperf3
(what options? what target?) but only 900 Mb/s running
migration/reclamation.  Iperf's sole job is to hammer packets across the
network, so it behaves differently on the local machine with respect to
buffering/context switching/offload processing than your production
workload does.  If the tools show your network can transit data at the
desired speed, that would seem to point to the local stack.

Based on the couple of observations you've provided, I would look into
whether the buffering is sufficient across the board: TCP window,
send/receive buffers, tx/rx queue length, etc.  Remember, packets are
theoretically coming in 10 times faster than before; if your CPUs are
busy doing TSM things, their caches aren't primed for packet processing.
If a packet gets dropped anywhere in the pipeline due to overrun, that
flow (TCP/NFS session) will come to a halt until the missing packet gets
resent, and that will really kill throughput.  (That's also why those
offload settings can help: some workloads seem to benefit from them,
others not so much.  Testing with your production workload is the key.)
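If you want a concrete starting point, here's a minimal sketch of the
buffer-related knobs (assuming eth0 is the 10G interface; the values are
test points to measure against your production workload, not
recommendations):

# Current socket buffer limits and TCP autotuning ranges
sysctl net.core.rmem_max net.core.wmem_max
sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem

# Raise the ceilings (example: 16 MB maximums)
sysctl -w net.core.rmem_max=16777216
sysctl -w net.core.wmem_max=16777216
sysctl -w net.ipv4.tcp_rmem="4096 87380 16777216"
sysctl -w net.ipv4.tcp_wmem="4096 65536 16777216"

# Deepen the transmit queue and check the NIC ring buffer sizes
ip link set eth0 txqueuelen 10000
ethtool -g eth0

# Watch for drops/overruns at the interface while migration runs
ip -s link show eth0

Then re-run your iperf3 test; adding -P 4 (parallel streams) gets you
closer to several concurrent migration/reclamation processes than a
single stream does.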

Also, it's helpful to look at a packet trace.  Observe the TCP window
sizes: if they're hitting zero, or if you're seeing retransmissions,
that can clue you in to what's going on.  And remember, the problem
could be on the remote end as well.
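For a concrete sketch (eth0 assumed again; the host name comes from your
mount entry), capture during a migration run and then filter for the
interesting events:

tcpdump -i eth0 -s 128 -w nfs-trace.pcap host nxback-pool02.servers.bsu.edu
tshark -r nfs-trace.pcap -Y "tcp.analysis.zero_window || tcp.analysis.retransmission"

The -s 128 keeps just the headers so the capture stays manageable, and
the tshark filter pulls out zero-window and retransmission events.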

Good luck and let us know how you make out!
-Ken


-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Lee, Gary
Sent: Wednesday, February 03, 2016 8:15 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: getting performance from nfs storage over 10 gb link

To Mike:

Thanks for the link, hadn't seen that.

Also, the NFS server is a dedicated SAN head end, so I have little to no
control over its NFS parameters.

To Skylar:

I will check into those options.

To all:

I can say that TSM performance was better when the same storage was
mounted with the same mount options using a 1 Gb adapter.
Strange, I know, but that's the riddle I'm dealing with.

Thanks for everything so far. Will keep you posted on results.
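In the meantime I'll double-check the basics on the new card (commands
below assume the interface is eth0):

ethtool eth0       # confirm it negotiated Speed: 10000Mb/s, full duplex
ethtool -a eth0    # see whether pause (flow control) frames are enabled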

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Skylar Thompson
Sent: Tuesday, February 02, 2016 4:43 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] getting performance from nfs storage over 10 gb link

Hi Gary,

We don't use NFS for our TSM servers, but we have been struggling with
NFS over 10GbE in other areas. While not a universal solution, we've
gotten significant performance improvements by disabling the following
NIC offload
options:

gro
lro
rxvlan
txvlan
rxhash

For instance, you can disable gro with

ethtool --offload eth0 gro off

(assuming eth0 is your NIC)
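If you want to flip all five at once, -K is the short form of --offload
(same eth0 assumption):

ethtool -K eth0 gro off lro off rxvlan off txvlan off rxhash off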

There's a bunch more we haven't had a chance to play with, but hopefully
that's a starting point.
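Also worth a look: the NFS client keeps per-mount counters, so you can
spot RPC retransmissions and per-operation latency without a packet
capture.  A quick sketch using the mount point from your message
(mountstats ships with nfs-utils):

nfsstat -c                      # client-side RPC call and retransmission totals
mountstats /tsminst1/storage    # per-operation latency/throughput for that mount

If the retrans counts climb while migration runs, that's another sign
packets are being dropped somewhere in the path.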

On Tue, Feb 02, 2016 at 05:54:46PM +0000, Lee, Gary wrote:
> TSM server 6.4.3
> RHEL 6.7
> 4 Dual-Core AMD Opteron(tm) Processor 8220 SE CPUs
> 128 GB memory
>
> I recently installed an Intel 10 Gb Ethernet card.
> Iperf3 test shown below gives around 7 Gb/s throughput.
> However, when running multiple migration and reclamation processes,
> and watching network traffic through an independent tool, I cannot get
> over the 900 Mb/s threshold.
>
> The tsm storage pools are on a file system mounted as follows:
>
> Nxback-Pool02.servers.bsu.edu:/volumes/Pool02/TSMBackup/tsm02 /tsminst1/storage nfs rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys 0 0
>
> I am running out of options.
>
> I don't expect to see the full throughput, as disk speeds will have a
> good deal of impact.
>
> Any ideas would be helpful.

--
-- Skylar Thompson (skylar2 AT u.washington DOT edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine