Re: [ADSM-L] NFS as a storage pool volume SOLVED!!!!
2009-12-29 13:51:15
OK, so while searching for solutions, I ran across a similar problem on
GPFS. It turns out that DIRECTIO in TSM causes severe degradation on
GPFS, NFS, and iSCSI volumes.

I added the undocumented "DIRECTIO no" parameter to dsmserv.opt, and
the audit is now running at 60+ MB/s as expected. Hope this helps someone
out there. There is a downside, though: database performance seems to
suffer, as would be expected, and startup time for TSM doubled.
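For anyone else trying this, the change amounts to a single line in the server options file. The comment syntax below follows the usual dsmserv.opt convention (lines starting with an asterisk); since the option is undocumented, behavior may vary by TSM version, so test it on your own server first:

```
* dsmserv.opt -- DIRECTIO is undocumented; verify on a test server first
DIRECTIO NO
```

The options file is read at server startup, so the server has to be halted and restarted for the change to take effect.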
Gary
Itrus Technologies
On Dec 29, 2009, at 11:17 AM, Wanda Prather wrote:
You don't say what kind of beast this network attached storage hardware
actually is - are we talking NetApp, EMC, other?

You need to run the performance tools that are available with it and look
at how busy its NIC is and what kind of performance you are getting from
its cache.

I have seen this kind of behavior before from network attached storage
when the backstore is SATA disk (relatively slow) and the cache is not
large enough or cannot keep up with the demands on it.

Especially with SATA disk, it makes a very big difference whether you are
reading/writing relatively small blocks that get a large percentage of
cache hits, or long sequential streams that have to read every byte from
a single disk with none of the data coming from cache.

And given it's iSCSI, you'll also have to look and see if there is any
strange behavior on whatever switches the I/O is going through.

Summary: the performance problem may very well be outboard, rather than
in TSM.

W
On Tue, Dec 29, 2009 at 11:22 AM, Gary Bowers <gbowers AT itrus DOT com> wrote:
I have a strange performance issue that I am trying to work out
involving network attached storage being used for TSM stgpool volumes.

The TSM server is AIX 5.3, and the network is a dedicated Gbit link. We
started out using iSCSI for the storage pool volumes, creating 10 x
250 GB volumes, placing a single logical volume per 250 GB physical
volume, and letting TSM do the load balancing. We are a small shop, and
the 30-40 MB/s performance that we were seeing in backups was
acceptable. That is, until we had to audit some volumes.
For the audit, we are seeing abysmal performance: approximately 3-5
MB/s per volume. Adding volumes increases the total throughput, but the
performance per volume remains around 3-5 MB/s.

From the command line we can get 30-60 MB/s using dd to and from an
iSCSI volume. So we did some testing on NFS. Using NFS, we are able to
get 60-80 MB/s from the AIX OS using dd, both read and write. So we
decided to create volumes on the NFS mount.
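The dd checks described above can be reproduced with something like the following. This is a sketch: the mount point is an assumption, so point MNT at your own iSCSI or NFS filesystem, and note that dd only measures raw sequential throughput, not TSM's I/O pattern.

```shell
#!/bin/sh
# Sequential-throughput sanity check with dd.
# MNT is an assumed mount point -- override it with your storage path.
MNT=${MNT:-/tmp}
F="$MNT/dd_testfile"

# Write test: stream 256 MB of zeros to the storage in 1 MB blocks.
dd if=/dev/zero of="$F" bs=1048576 count=256

# Read test: stream the file back, discarding the data.
dd if="$F" of=/dev/null bs=1048576

# Clean up the test file.
rm -f "$F"
```

Comparing these numbers against what TSM achieves on the same filesystem is what separates an OS/network problem from a TSM-internal one.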
The "define vol" command ran for 10+ hours on a 100 GB volume at 2 MB/s.
Thinking it was just the define vol that was the problem, I ran a move
data to the new volume, and it ran at 3-5 MB/s. I then ran an audit on
the volume, and again I am only getting 3-5 MB/s.
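For context, the three operations above were of this general form (the pool name and volume path here are made up for illustration; adjust sizes and paths to your environment):

```
tsm: TSM>define volume NFSPOOL /nfs/stgpool/vol01.dsm formatsize=102400
tsm: TSM>move data /nfs/stgpool/vol01.dsm
tsm: TSM>audit volume /nfs/stgpool/vol01.dsm fix=no
```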
This seems like a tuning issue inside TSM, but I could not tell you what
parameter would cause such a slowdown. I have done my homework on this
and have not found any relevant posts. If anyone has suggestions, I
would love to hear them.

As a side note, defining a volume on the local root drive runs at
16-20 MB/s.
tsm: TSM>q option

Server Option            Option Setting
-----------------        --------------------
CommTimeOut              60
IdleTimeOut              15
BufPoolSize              32768
LogPoolSize              512
MessageFormat            1
Language                 AMENG
Alias Halt               HALT
MaxSessions              25
ExpInterval              24
ExpQuiet                 No
EventServer              Yes
ReportRetrieve           No
DISPLAYLFINFO            No
MirrorRead DB            Normal
MirrorRead LOG           Normal
MirrorWrite DB           Sequential
MirrorWrite LOG          Parallel
TxnGroupMax              256
MoveBatchSize            1000
MoveSizeThresh           2048
RestoreInterval          1,440
DisableScheds            No
NOBUFPREfetch            No
AuditStorage             Yes
REQSYSauthoutfile        Yes
SELFTUNEBUFpoolsize      No
DBPAGEShadow             No
DBPAGESHADOWFile         dbpgshdw.bdt
MsgStackTrace            On
QueryAuth                None
LogWarnFullPerCent       90
ThroughPutDataThreshold  0
ThroughPutTimeThreshold  0
NOPREEMPT                ( No )
Resource Timeout         60
TEC UTF8 Events          No
AdminOnClientPort        Yes
NORETRIEVEDATE           No
IMPORTMERGEUsed          Yes
DNSLOOKUP                Yes
NDMPControlPort          10,000
NDMPPortRange            0,0
SHREDding                Automatic
SanRefreshTime           0
TCPPort                  1500
TcpAdminport             1500
HTTPPort                 1580
TCPWindowsize            64512
TCPBufsize               32768
TCPNoDelay               Yes
CommMethod               TCPIP
MsgInterval              1
ShmPort                  1510
FileExit
UserExit
FileTextExit
AssistVCRRecovery        Yes
AcsAccessId
AcsTimeoutX              1
AcsLockDrive             No
AcsQuickInit             Yes
SNMPSubagentPort         1521
SNMPSubagentHost         127.0.0.1
SNMPHeartBeatInt         5
TECHost
TECPort                  0
UNIQUETECevents          No
UNIQUETDPTECevents       No
Async I/O                No
SHAREDLIBIDLE            No
3494Shared               No
CheckTrailerOnFree       On
SANdiscovery             On
SSLTCPPort
SSLTCPADMINPort
SANDISCOVERYTIMEOUT      15
Server Name:
Server host name or IP address:
Server TCP/IP port number: 1500
Crossdefine: On
Server Password Set: Yes
Server Installation Date/Time: 11/11/08 15:01:10
Server Restart Date/Time: 12/28/09 11:50:39
Authentication: On
Password Expiration Period: 9,999 Day(s)
Invalid Sign-on Attempt Limit: 0
Minimum Password Length: 0
Registration: Closed
Subfile Backup: No
Availability: Enabled
Accounting: Off
Activity Log Retention: 5 Day(s)
Activity Log Number of Records: 9243
Activity Log Size: 1 M
Activity Summary Retention Period: 30 Day(s)
License Audit Period: 1 Day(s)
Last License Audit: 12/28/09 23:50:44
Server License Compliance: Valid
Central Scheduler: Active
Maximum Sessions: 25
Maximum Scheduled Sessions: 12
Event Record Retention Period: 30 Day(s)
Client Action Duration: 5 Day(s)
Schedule Randomization Percentage: 25
Query Schedule Period: Client
Maximum Command Retries: Client
Retry Period: Client
Scheduling Modes: Any
Log Mode: Normal
Database Backup Trigger: Disabled
BufPoolSize: 32,768 K
Active Receivers: CONSOLE ACTLOG
Configuration manager?: Off
Refresh interval: 60
Last refresh date/time:
Context Messaging: Off
Table of Contents (TOC) Load Retention: 120 Minute(s)
Machine Globally Unique ID:
00.00.00.00.b0.33.11.dd.b0.f9.08.63.01.02.03.02
Archive Retention Protection: Off
Encryption Strength: AES