Networker

[Networker] Performance problems (Excessive Solaris 10 pollsys calls with nsrmmd).

2010-10-11 20:23:07
Subject: [Networker] Performance problems (Excessive Solaris 10 pollsys calls with nsrmmd).
From: Yaron Zabary <yaron AT ARISTO.TAU.AC DOT IL>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Tue, 12 Oct 2010 02:20:56 +0200
Hi all,

My setup is a Sun T1000 running Solaris 10 U4 with recommended patches from 2/2010 (Generic_142900-03). Networker 7.5.2. FC connected Overland NEO8000 with 4 LTO-4 drives. One storage node running on x4500 Solaris 10 U8 with 30TB of AFTD on ZFS. I have been struggling with performance problems when staging from the x4500 to the tape drives on the T1000. Recently I found out the following:
# sar -c 1 10

SunOS legato.tau.ac.il 5.10 Generic_142900-03 sun4v    10/12/2010

02:15:02 scall/s sread/s swrit/s  fork/s  exec/s rchar/s wchar/s
02:15:03   21500    6207     187    0.00    0.00 73438304  540399
02:15:04   21160    6132      46    0.00    0.00 67140728  107171
02:15:05   17720    5203      37    0.00    0.00 45364104   56006
02:15:06   20566    6036      37    0.00    0.00 61809376   61430
02:15:07   28335    8159     110    0.00    0.00 93447008  519992
02:15:09   25388    7337      35    0.00    0.00 89129144   47190
02:15:10   27083    7837      71    0.00    0.00 91929992  141210
02:15:11   26471    7691      29    0.00    0.00 91041920   43699
02:15:12   26439    7656      47    0.00    0.00 90561864  111247
02:15:13   26514    7653     203    0.00    0.00 93589112  988914

Average    24111    6989      80    0.00    0.00 79720984  260102


  With the following activity of nsrmmd:

# dtrace -n 'syscall:::entry /execname == "nsrmmd"/ {@[probefunc] = count(); } tick-5sec {printa(@); clear(@);}'
dtrace: description 'syscall:::entry ' matched 234 probes
CPU     ID                    FUNCTION:NAME
  6  55495                       :tick-5sec
  writev                                                            4
  kaio                                                           6732
  readv                                                         21647
  pollsys                                                       24447
  gtime                                                         24451

  6  55495                       :tick-5sec
  writev                                                            2
  kaio                                                           3588
  readv                                                         11020
  pollsys                                                       12482
  gtime                                                         12484

 The output of truss -v pollsys -p 25398 goes like that:

/1:     readv(11, 0xFFFFFFFF7FFE3660, 2)                = 57828
/1:     pollsys(0x100A3A088, 3, 0xFFFFFFFF7FFFC040, 0x00000000) = 1
/1:             fd=6  ev=POLLIN|POLLRDNORM rev=0
/1:             fd=8  ev=POLLIN|POLLRDNORM rev=0
/1:             fd=11 ev=POLLIN|POLLRDNORM rev=POLLIN|POLLRDNORM
/1:             timeout: 5.000000000 sec
/1:     readv(11, 0xFFFFFFFF7FFE3660, 2)                = 71576
/1: kaio(9, 0x100CDE1D0, 0x00020000, 0x00000000, 0x4FCB6952, 0x100B715B0) = 0 /1: kaio(12, 0x100CDDFF0, 0x00000000, 0x00000002, 0x00000096, 0x0000000A) = 0
/1:     time()                                          = 1286842696
/1:     pollsys(0x100A3A088, 3, 0xFFFFFFFF7FFFC040, 0x00000000) = 1
/1:             fd=6  ev=POLLIN|POLLRDNORM rev=0
/1:             fd=8  ev=POLLIN|POLLRDNORM rev=0
/1:             fd=11 ev=POLLIN|POLLRDNORM rev=POLLIN|POLLRDNORM
/1:             timeout: 5.000000000 sec
/1:     readv(11, 0xFFFFFFFF7FFEBBA0, 2)                = 59612
/1:     time()                                          = 1286842696
/1:     pollsys(0x100A3A088, 3, 0xFFFFFFFF7FFFC040, 0x00000000) = 1
/1:             fd=6  ev=POLLIN|POLLRDNORM rev=0
/1:             fd=8  ev=POLLIN|POLLRDNORM rev=0
/1:             fd=11 ev=POLLIN|POLLRDNORM rev=POLLIN|POLLRDNORM
/1:             timeout: 5.000000000 sec
/1:     readv(11, 0xFFFFFFFF7FFE3660, 2)                = 72968
/1: kaio(9, 0x100CDDFF0, 0x00020000, 0x00000000, 0x4FCB6952, 0x100B715B0) = 0 /1: kaio(12, 0x100CDE090, 0x00000000, 0x00000002, 0x00000096, 0x0000000A) = 0
/1:     time()                                          = 1286842696
/1:     readv(11, 0xFFFFFFFF7FFE3660, 2)                = 52764


  where fd 6 8 and 11 are sockets.


Is anyone aware of any Solaris/Networker bug or tuning that can help with this situation ?

--

-- Yaron.

To sign off this list, send email to listserv AT listserv.temple DOT edu and type 
"signoff networker" in the body of the email. Please write to networker-request 
AT listserv.temple DOT edu if you have any problems with this list. You can access the 
archives at http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER

<Prev in Thread] Current Thread [Next in Thread>
  • [Networker] Performance problems (Excessive Solaris 10 pollsys calls with nsrmmd)., Yaron Zabary <=