You don't say what kind of beast this network-attached storage hardware
actually is. Are we talking NetApp, EMC, or something else?
You need to run the performance tools that come with it and look at
how busy its NIC is and what kind of performance you are getting from
its cache.
I have seen this kind of behavior before from network-attached storage when
the back-end storage is SATA disk (relatively slow) and the cache is not
large enough or cannot keep up with the demands on it.
Especially with SATA disk, it makes a very big difference whether you are
reading/writing relatively small blocks that get a large percentage of cache
hits, or long sequential streams that have to read every byte from a single
disk with none of the data coming from cache.
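
One rough way to see the block-size effect from the host side is to read the
same volume file at several block sizes and compare the throughput. The sketch
below is only an illustration in Python, not anything taken from TSM or from a
storage vendor's tools: the function names and the block sizes are assumptions
made for the example. Note that a repeat pass over the same file may be served
from the host's own file cache instead of the NAS, so use a file much larger
than memory or remount between runs.

```python
import time


def read_throughput(path, block_size, max_bytes=None):
    """Read `path` sequentially in `block_size` chunks.

    Returns (bytes_read, MB_per_second). `max_bytes` optionally stops the
    test early so a very large volume does not have to be read in full.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 turns off Python-level buffering, so each read() call
    # is passed to the operating system with the requested size.
    with open(path, "rb", buffering=0) as f:
        while max_bytes is None or total < max_bytes:
            chunk = f.read(block_size)
            if not chunk:  # end of file
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    mb_per_s = (total / (1024 * 1024)) / max(elapsed, 1e-9)
    return total, mb_per_s


def compare_block_sizes(path, block_sizes=(4096, 65536, 262144, 1048576),
                        max_bytes=None):
    """Return {block_size: MB_per_second} for each size (sizes are arbitrary)."""
    return {
        bs: read_throughput(path, bs, max_bytes)[1]
        for bs in block_sizes
    }
```

Calling compare_block_sizes() on one of the storage pool volume files (the
path is whatever your volume is named) gives a block size to MB/s mapping. If
throughput falls off sharply at some sizes, that points at the storage or the
network path rather than at TSM.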
And given that it's iSCSI, you'll also have to check for any
strange behavior on whatever switches the I/O is going through.
Summary: The performance problem may very well be outboard, rather than in
TSM.
W
On Tue, Dec 29, 2009 at 11:22 AM, Gary Bowers <gbowers AT itrus DOT com> wrote:
> I have a strange performance issue that I am trying to work out
> involving network attached storage being used for TSM stgpool volumes.
>
> The TSM server is AIX 5.3, and the network is a dedicated Gbit. We
> started out using iSCSI for the storage pool volumes creating 10 X
> 250GB volumes and placing a single logical volume per 250GB physical
> volume, and letting TSM do the load balancing. We are a small shop,
> and the 30-40 MB/s performance that we were seeing in backups was
> acceptable. That is until we had to audit some volumes.
>
> For the audit, we are seeing abysmal performance. Approximately 3-5
> MB/s per volume. Adding volumes increases the total throughput, but
> the performance per volume remains around 3-5 MB/s.
>
> From the command line we can get 30-60 MB/s using dd to and from an
> iSCSI volume. So we did some testing on NFS.
>
> Using NFS, we are able to get 60-80 MB/s from the AIX OS using dd both
> read and write. So we decided to create volumes on the NFS mount.
> The "define vol" command ran for 10+ hours on a 100 GB volume at 2 MB/s.
>
> Thinking it was just the def vol that was a problem, I ran a move data
> to the new volume, and it ran at 3-5 MB/s. I then ran an audit on the
> volume, and again I am only getting 3-5 MB/s.
>
> This seems like a tuning issue inside TSM, but I could not tell you
> what parameter would cause such a slow down. I have done my homework
> on this, and have not found any relevant posts. If anyone has some
> suggestions, I would love to hear them.
>
> As a side note, defining a volume on the local root drive runs at
> 16-20 MB/s.
>
> tsm: TSM>q option
>
> Server Option           Option Setting Server Option           Option Setting
> ----------------------- -------------- ----------------------- --------------
> CommTimeOut             60             IdleTimeOut             15
> BufPoolSize             32768          LogPoolSize             512
> MessageFormat           1              Language                AMENG
> Alias Halt              HALT           MaxSessions             25
> ExpInterval             24             ExpQuiet                No
> EventServer             Yes            ReportRetrieve          No
> DISPLAYLFINFO           No             MirrorRead DB           Normal
> MirrorRead LOG          Normal         MirrorWrite DB          Sequential
> MirrorWrite LOG         Parallel       TxnGroupMax             256
> MoveBatchSize           1000           MoveSizeThresh          2048
> RestoreInterval         1,440          DisableScheds           No
> NOBUFPREfetch           No             AuditStorage            Yes
> REQSYSauthoutfile       Yes            SELFTUNEBUFpoolsize     No
> DBPAGEShadow            No             DBPAGESHADOWFile        dbpgshdw.bdt
> MsgStackTrace           On             QueryAuth               None
> LogWarnFullPerCent      90             ThroughPutDataThreshold 0
> ThroughPutTimeThreshold 0              NOPREEMPT               ( No )
> Resource Timeout        60             TEC UTF8 Events         No
> AdminOnClientPort       Yes            NORETRIEVEDATE          No
> IMPORTMERGEUsed         Yes            DNSLOOKUP               Yes
> NDMPControlPort         10,000         NDMPPortRange           0,0
> SHREDding               Automatic      SanRefreshTime          0
> TCPPort                 1500           TcpAdminport            1500
> HTTPPort                1580           TCPWindowsize           64512
> TCPBufsize              32768          TCPNoDelay              Yes
> CommMethod              TCPIP          MsgInterval             1
> ShmPort                 1510           FileExit
> UserExit                               FileTextExit
> AssistVCRRecovery       Yes            AcsAccessId
> AcsTimeoutX             1              AcsLockDrive            No
> AcsQuickInit            Yes            SNMPSubagentPort        1521
> SNMPSubagentHost        127.0.0.1      SNMPHeartBeatInt        5
> TECHost                                TECPort                 0
> UNIQUETECevents         No             UNIQUETDPTECevents      No
> Async I/O               No             SHAREDLIBIDLE           No
> 3494Shared              No             CheckTrailerOnFree      On
> SANdiscovery            On             SSLTCPPort
> SSLTCPADMINPort                        SANDISCOVERYTIME-       15
>
> Server Name:
> Server host name or IP address:
> Server TCP/IP port number: 1500
> Crossdefine: On
> Server Password Set: Yes
> Server Installation Date/Time: 11/11/08 15:01:10
> Server Restart Date/Time: 12/28/09 11:50:39
> Authentication: On
> Password Expiration Period: 9,999 Day(s)
> Invalid Sign-on Attempt Limit: 0
> Minimum Password Length: 0
> Registration: Closed
> Subfile Backup: No
> Availability: Enabled
> Accounting: Off
> Activity Log Retention: 5 Day(s)
> Activity Log Number of Records: 9243
> Activity Log Size: 1 M
> Activity Summary Retention Period: 30 Day(s)
> License Audit Period: 1 Day(s)
> Last License Audit: 12/28/09 23:50:44
> Server License Compliance: Valid
> Central Scheduler: Active
> Maximum Sessions: 25
> Maximum Scheduled Sessions: 12
> Event Record Retention Period: 30 Day(s)
> Client Action Duration: 5 Day(s)
> Schedule Randomization Percentage: 25
> Query Schedule Period: Client
> Maximum Command Retries: Client
> Retry Period: Client
> Scheduling Modes: Any
> Log Mode: Normal
> Database Backup Trigger: Disabled
> BufPoolSize: 32,768 K
> Active Receivers: CONSOLE ACTLOG
> Configuration manager?: Off
> Refresh interval: 60
> Last refresh date/time:
> Context Messaging: Off
> Table of Contents (TOC) Load Retention: 120 Minute(s)
> Machine Globally Unique ID: 00.00.00.00.b0.33.11.dd.b0.f9.08.63.01.02.03.02
> Archive Retention Protection: Off
> Encryption Strength: AES
>