ADSM-L

Re: [ADSM-L] NFS as a storage pool volume SOLVED!!!!

2009-12-29 13:51:15
From: Gary Bowers <gbowers AT ITRUS DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Tue, 29 Dec 2009 12:50:14 -0600
OK, so while searching for solutions, I ran across a similar problem
reported on GPFS.  It turns out that DIRECTIO in TSM causes severe
degradation for GPFS, NFS, and iSCSI volumes.

I added the undocumented "DIRECTIO no" parameter to dsmserv.opt, and
the audit is now running at 60+ MB/s, as expected.  Hope this helps
someone out there.  There is a downside, though: database performance
seems to suffer, as would be expected, and startup time for TSM doubled.
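
For anyone wanting to try the same change, the fragment below is a minimal sketch of what the line in dsmserv.opt might look like.  The option name and value are taken from this post; the comment lines (assuming the usual leading-asterisk comment syntax of TSM option files) are only illustrative.  Since the option is undocumented, treat it as something to test rather than a supported setting.

```
* dsmserv.opt (TSM server options file)
* Undocumented option, as reported in this thread for NFS/iSCSI/GPFS volumes.
DIRECTIO no
```

The server reads dsmserv.opt at startup, so the change would presumably take effect only after a restart.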

Gary
Itrus Technologies

On Dec 29, 2009, at 11:17 AM, Wanda Prather wrote:

You don't say what kind of beast this network attached storage hardware
actually is - are we talking NetApp, EMC, other?

You need to run the performance tools that are available with it and
look at how busy its NIC is and what kind of performance you are getting
from its cache.

I have seen this kind of behavior before from network attached storage
when the backstore is SATA disk (relatively slow), and the cache is not
large enough or cannot keep up with the demands on it.

Especially with SATA disk, it makes a very big difference whether you
are reading/writing relatively small blocks that get a large percentage
of cache hits, or long sequential streams that have to read every byte
from a single disk with none of the data coming from cache.

And given it's iSCSI, you'll also have to look and see if there is any
strange behavior on whatever switches the I/O is going through.

Summary:  The performance problem may very well be outboard, rather than
in TSM.

W





On Tue, Dec 29, 2009 at 11:22 AM, Gary Bowers <gbowers AT itrus DOT com> wrote:

I have a strange performance issue that I am trying to work out
involving network attached storage being used for TSM stgpool
volumes.

The TSM server is AIX 5.3, and the network is a dedicated Gbit.  We
started out using iSCSI for the storage pool volumes creating 10 X
250GB volumes and placing a single logical volume per 250GB physical
volume, and letting TSM do the load balancing.  We are a small shop,
and the 30-40 MB/s performance that we were seeing in backups was
acceptable.  That is, until we had to audit some volumes.

For the audit, we are seeing abysmal performance.  Approximately 3-5
MB/s per volume.  Adding volumes increases the total throughput, but
the performance per volume remains around 3-5 MB/s.

From the command line we can get 30-60 MB/s using dd to and from an
iSCSI volume.  So we did some testing on NFS.
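
As a rough illustration of the kind of dd-style test described above, here is a minimal Python sketch that writes and then reads a file sequentially and reports MB/s for each direction.  It is not the command that was actually run; the function name, the 64 MB default size, and the 256 KB block size are assumptions chosen for illustration, and the read figure can be inflated by the OS file cache if the file is small.

```python
import os
import time


def sequential_throughput(path, total_mb=64, block_kb=256):
    """Write, then read, total_mb MB at path; return (write_mbps, read_mbps)."""
    block = b"\0" * (block_kb * 1024)
    blocks = (total_mb * 1024) // block_kb

    # Sequential write, forced to storage with fsync so the timing
    # is not just measuring buffered writes.
    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())
    write_s = time.perf_counter() - start

    # Sequential read of the same file (may be served from cache).
    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(len(block)):
            pass
    read_s = time.perf_counter() - start

    return total_mb / write_s, total_mb / read_s
```

Pointing `path` at a file on the iSCSI or NFS mount and comparing the result with what TSM reports for the same volume gives the same kind of OS-versus-TSM comparison made in this post.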

Using NFS, we are able to get 60-80 MB/s from the AIX OS using dd both
read and write.  So we decided to create volumes on the NFS mount.  The
"define vol" command ran for 10+ hours on a 100 GB volume at 2 MB/s.
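
As a quick sanity check on these figures, the sketch below works out how long a 100 GB volume takes at a few of the rates mentioned in this thread.  It assumes a constant rate and 1 GB = 1024 MB; the function name is purely illustrative.

```python
def hours_to_process(volume_gb, rate_mb_per_s):
    """Hours needed to stream volume_gb at a constant rate_mb_per_s."""
    # Assumption: 1 GB = 1024 MB.
    seconds = (volume_gb * 1024) / rate_mb_per_s
    return seconds / 3600


for rate in (2, 5, 60):
    print(f"100 GB at {rate:>2} MB/s: {hours_to_process(100, rate):.1f} h")
```

At 2 MB/s this comes to roughly 14 hours, in line with the 10+ hours observed, while at the 60 MB/s seen with dd the same volume would take about half an hour.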

Thinking it was just the def vol that was a problem, I ran a move data
to the new volume, and it ran at 3-5 MB/s.  I then ran an audit on the
volume, and again I am only getting 3-5 MB/s.

This seems like a tuning issue inside TSM, but I could not tell you
what parameter would cause such a slowdown.  I have done my homework
on this, and have not found any relevant posts.  If anyone has some
suggestions, I would love to hear them.

As a side note, defining a volume on the local root drive runs at
16-20 MB/s.

tsm: TSM>q option

Server Option               Option Setting
------------------------    --------------------
CommTimeOut                 60
IdleTimeOut                 15
BufPoolSize                 32768
LogPoolSize                 512
MessageFormat               1
Language                    AMENG
Alias Halt                  HALT
MaxSessions                 25
ExpInterval                 24
ExpQuiet                    No
EventServer                 Yes
ReportRetrieve              No
DISPLAYLFINFO               No
MirrorRead DB               Normal
MirrorRead LOG              Normal
MirrorWrite DB              Sequential
MirrorWrite LOG             Parallel
TxnGroupMax                 256
MoveBatchSize               1000
MoveSizeThresh              2048
RestoreInterval             1,440
DisableScheds               No
NOBUFPREfetch               No
AuditStorage                Yes
REQSYSauthoutfile           Yes
SELFTUNEBUFpoolsize         No
DBPAGEShadow                No
DBPAGESHADOWFile            dbpgshdw.bdt
MsgStackTrace               On
QueryAuth                   None
LogWarnFullPerCent          90
ThroughPutDataThreshold     0
ThroughPutTimeThreshold     0
NOPREEMPT                   ( No )
Resource Timeout            60
TEC UTF8 Events             No
AdminOnClientPort           Yes
NORETRIEVEDATE              No
IMPORTMERGEUsed             Yes
DNSLOOKUP                   Yes
NDMPControlPort             10,000
NDMPPortRange               0,0
SHREDding                   Automatic
SanRefreshTime              0
TCPPort                     1500
TcpAdminport                1500
HTTPPort                    1580
TCPWindowsize               64512
TCPBufsize                  32768
TCPNoDelay                  Yes
CommMethod                  TCPIP
MsgInterval                 1
ShmPort                     1510
FileExit
UserExit
FileTextExit
AssistVCRRecovery           Yes
AcsAccessId
AcsTimeoutX                 1
AcsLockDrive                No
AcsQuickInit                Yes
SNMPSubagentPort            1521
SNMPSubagentHost            127.0.0.1
SNMPHeartBeatInt            5
TECHost
TECPort                     0
UNIQUETECevents             No
UNIQUETDPTECevents          No
Async I/O                   No
SHAREDLIBIDLE               No
3494Shared                  No
CheckTrailerOnFree          On
SANdiscovery                On
SSLTCPPort
SSLTCPADMINPort
SANDISCOVERYTIME-           15

                         Server Name:
      Server host name or IP address:
           Server TCP/IP port number: 1500
                         Crossdefine: On
                 Server Password Set: Yes
       Server Installation Date/Time: 11/11/08   15:01:10
            Server Restart Date/Time: 12/28/09   11:50:39
                      Authentication: On
          Password Expiration Period: 9,999 Day(s)
       Invalid Sign-on Attempt Limit: 0
             Minimum Password Length: 0
                        Registration: Closed
                      Subfile Backup: No
                        Availability: Enabled
                          Accounting: Off
              Activity Log Retention: 5 Day(s)
      Activity Log Number of Records: 9243
                   Activity Log Size: 1 M
   Activity Summary Retention Period: 30 Day(s)
                License Audit Period: 1 Day(s)
                  Last License Audit: 12/28/09   23:50:44
           Server License Compliance: Valid
                   Central Scheduler: Active
                    Maximum Sessions: 25
          Maximum Scheduled Sessions: 12
       Event Record Retention Period: 30 Day(s)
              Client Action Duration: 5 Day(s)
   Schedule Randomization Percentage: 25
               Query Schedule Period: Client
             Maximum Command Retries: Client
                        Retry Period: Client
                    Scheduling Modes: Any
                            Log Mode: Normal
             Database Backup Trigger: Disabled
                         BufPoolSize: 32,768 K
                    Active Receivers: CONSOLE ACTLOG
              Configuration manager?: Off
                    Refresh interval: 60
              Last refresh date/time:
                   Context Messaging: Off
Table of Contents (TOC) Load Retention: 120 Minute(s)
          Machine Globally Unique ID: 00.00.00.00.b0.33.11.dd.b0.f9.08.63.01.02.03.02
        Archive Retention Protection: Off
                 Encryption Strength: AES