If you have any more questions about my system layout, contact me:
1. The server is a Sun 4500 with four 400 MHz SPARC III processors and
4 GB of RAM. Attached to the system is one D100 disk array that holds the
OS. All of the TSM volumes are held on three A5200 Fibre Channel arrays
containing 66 disk drives in total.
2. tsm: I02SV1000>q dbvol f=d

Volume Name      Copy    Volume Name       Copy    Volume Name  Copy       Available   Allocated
(Copy 1)         Status  (Copy 2)          Status  (Copy 3)     Status     Space (MB)  Space (MB)
---------------  ------  ----------------  ------  -----------  ---------  ----------  ----------
/adsmdb2/db1a    Sync'd  /adsmdb2m/db1am   Sync'd               Undefined       8,352       8,352
/adsmdb3/db1a    Sync'd  /adsmdb3m/db1am   Sync'd               Undefined       8,500       8,352
/adsmdb4/db1a    Sync'd  /adsmdb4m/db1am   Sync'd               Undefined       8,352       8,352
/adsmdb5/db1a    Sync'd  /adsmdb5m/db1am   Sync'd               Undefined       8,352       8,352
/adsmdb10/db1a   Sync'd  /adsmdb10m/db1am  Sync'd               Undefined       8,352       8,352
/adsmdb9/db1a    Sync'd  /adsmdb9m/db1am   Sync'd               Undefined       8,352       8,352
/adsmdb8/db1a    Sync'd  /adsmdb8m/db1am   Sync'd               Undefined       8,352       8,352
/adsmdb7/db1a    Sync'd  /adsmdb7m/db1am   Sync'd               Undefined       8,352       8,352
/adsmdb6/db1a    Sync'd  /adsmdb6m/db1am   Sync'd               Undefined       8,352       8,352
/adsmdb1/db1a    Sync'd  /adsmdb1m/db1am   Sync'd               Undefined       8,352       8,352
tsm: I02SV1000>q logvol f=d

Volume Name      Copy    Volume Name       Copy       Volume Name  Copy       Available   Allocated
(Copy 1)         Status  (Copy 2)          Status     (Copy 3)     Status     Space (MB)  Space (MB)
---------------  ------  ----------------  ---------  -----------  ---------  ----------  ----------
/adsmlog2/log4   Sync'd  /adsmlog2m/log4m  Sync'd                  Undefined         300         300
/adsmlog1/log2   Sync'd  /adsmlog1m/log2m  Sync'd                  Undefined         100         100
/adsmlog1/log1   Sync'd  /adsmlog1m/log1m  Sync'd                  Undefined       4,096       4,096
/adsmlog1/log4   Sync'd                    Undefined               Undefined         200         200
tsm: I02SV1000>q vol

Volume Name                    Storage     Device      Estimated  Pct    Volume
                               Pool Name   Class Name  Capacity   Util   Status
                                                       (MB)
-----------------------------  ----------  ----------  ---------  -----  --------
/dev/vx/rdsk/datadg/adsmdata1  BACKUPPOOL  DISK         36,864.0   64.1  On-Line
/dev/vx/rdsk/datadg/adsmdata2  BACKUPPOOL  DISK         36,864.0   72.0  On-Line
/dev/vx/rdsk/datadg/adsmdata3  BACKUPPOOL  DISK         36,864.0   47.3  On-Line
/dev/vx/rdsk/datadg/adsmdata4  BACKUPPOOL  DISK         36,864.0   56.4  On-Line
/dev/vx/rdsk/datadg/adsmdata5  BACKUPPOOL  DISK         36,864.0   49.7  On-Line
/dev/vx/rdsk/datadg/ads-       BACKUPPOOL  DISK         36,864.0   79.0  On-Line
3. File system is UFS and VXFS (for all TSM volumes).
4. RAID is software-based, using Veritas Volume Manager.
5. Each DB and log volume is in its own disk group. The unfortunate
problem is that they sit on mounted file system partitions instead of
raw volumes; in benchmark testing, this had a dramatic effect on backup
performance. The main bottleneck seemed to be the disk pool volumes,
which were easy to convert to raw, at least compared to the DB volumes,
so I already did that several weeks ago. I plan to convert the DB and
log volumes soon. I did not see any performance gain on backups of many
small files, just on medium and large ones.
6. Yes, the volumes are separated by type.
7. Most volumes appear to be spread over two physical disks. I am not
sure how many spindles sit behind each volume; the drives are mostly
9 GB and 18 GB FC disks.
8. tsm: I02SV1000>q db f=d
Available Space (MB): 83,520
Assigned Capacity (MB): 83,520
Maximum Extension (MB): 0
Maximum Reduction (MB): 14,832
Page Size (bytes): 4,096
Total Usable Pages: 21,381,120
Used Pages: 8,748,928
Pct Util: 40.9
Max. Pct Util: 41.2
Physical Volumes: 20
Buffer Pool Pages: 393,216
Total Buffer Requests: 10,118,152
Cache Hit Pct.: 96.81
Cache Wait Pct.: 0.00
9. Disk caching is being used as far as I can tell, though I don't know
exactly what to look for on Solaris. The TSM buffer pool is set to:
BufPoolSize: 1,572,864 K
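As a quick sanity check, the figures in answers 2 and 8 above are internally consistent; the short sketch below reproduces them. Every input value comes from the q dbvol, q db f=d, and BufPoolSize output already shown, and only the arithmetic is added here.

```python
# Cross-check the TSM figures quoted above; only the arithmetic is new.

# Ten DB volumes, each with 8,352 MB allocated, should match the assigned
# DB capacity reported by "q db f=d".
db_vol_alloc_mb = [8_352] * 10
print(sum(db_vol_alloc_mb))          # 83520 -- matches "Assigned Capacity (MB): 83,520"

# DB utilization is used pages over total usable pages.
used_pages = 8_748_928
total_usable_pages = 21_381_120
print(round(100 * used_pages / total_usable_pages, 1))  # 40.9 -- matches "Pct Util: 40.9"

# Buffer pool: 393,216 pages of 4 KB each equals the BufPoolSize setting.
page_size_bytes = 4_096
buffer_pool_pages = 393_216
print(buffer_pool_pages * page_size_bytes // 1024)  # 1572864 -- matches "BufPoolSize: 1,572,864 K"
```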
Michael French
Savvis Communications
IDS01 Santa Clara, CA
(408)450-7812 -- desk
(408)239-9913 -- mobile
-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Dave Canan
Sent: Tuesday, November 04, 2003 6:24 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: Help with TSM server being hammered by clients
Can you provide a little more information on the layout of the disk
subsystem for the TSM DB, recovery log, and storage pools? I recently
closed another customer PMR that was very similar to this. After redoing
the layout for optimal performance for the DB and log, the problem went
away. For example:
1. What is the disk subsystem?
2. How many DBVOLS? LOGVOLS, STGPOOL VOLS?
3. Filesystem format - JFS/RLV/???
4. Is it RAID? What type?
5. Are the TSM volumes sharing space with other data?
6. Are the TSM volumes separated by type?
7. How many physical spindles do you have for the TSM volumes?
8. What is your cache hit %?
9. Are you using disk caching? (On both the client and server)
At 01:51 PM 11/4/2003 -0600, you wrote:
> TSM Server 4.2.4.1 (Solaris)
> TSM Client 4.2.3 (Solaris)
>
> TSM DB 83GB (40% util)
> TSM Log 4.6GB
>
> I am having a serious problem with 4 Solaris clients hammering
> the server during their backups. Each client has a lot of files held
> in TSM, about 4-5 million per node and growing, though not much data,
> only a couple of hundred GB per node. The server is very responsive
> when these clients are not backing up; all other backups run without
> lagging the server. I have about 120 nodes backing up to this server
> daily.
> What could be causing this performance problem? Here is a
> show logpin from last night's backup:
>
>tsm: I02SV1000>show logpin
>Dirty page Lsn=4675033.188.3116, Last DB backup Lsn=4677956.167.3489,
>Transaction table Lsn=4677883.231.3853, Running DB backup Lsn=0.0.0,
>Log truncation Lsn=4675033.188.3116
>Lsn=4675033.188.3116, Owner=DB, Length=128
>Type=Update, Flags=C2, Action=ExtDelete, Page=6110475, Tsn=0:180594521,
>PrevLsn=4675033.180.2739,
>UndoNextLsn=0.0.0, UpdtLsn=4675033.176.827 ===> ObjName=AF.Bitfiles,
>Index=12, RootAddr=29,
>PartKeyLen=1, NonPartKeyLen=7, DataLen=20
>The recovery log is pinned by a dirty page in the data base buffer pool.
>Check the buffer pool statistics. If the associated transaction is still
>active then more information will be displayed about that transaction.
>Database buffer pool global variables:
>CkptId=25232, NumClean=269056, MinClean=393192, NumTempClean=393216,
>MinTempClean=196596,
>BufPoolSize=393216, BufDescCount=432537, BufDescMaxFree=432537,
>DpTableSize=393216, DpCount=124149, DpDirty=124149, DpCkptId=21890,
>DpCursor=92805,
>NumEmergency=0 CumEmergency=0, MaxEmergency=0.
>BuffersXlatched=0, xLatchesStopped=False, FullFlushWaiting=False.
>
>Is the large number of DpDirty pages bad? I think so, but I don't know
>the technical details behind this value. The log is at 0% util when
>backups start during the evening, and by midnight last night, the log
>was up to 80% and climbing rapidly. Once I cancel these 4 clients from
>backing up, the log stops filling so rapidly. Does anyone else have
>problems with clients that have large numbers of small files? How do
>you handle backing them up? It seems like these nodes take 8-10 hours
>apiece, which seems very slow.
>
>Thanks in advance for any assistance that you can provide!
>
>Michael French
>Savvis Communications
>IDS01 Santa Clara, CA
>(408)450-7812 -- desk
>(408)239-9913 -- mobile
>
Dave Canan
TSM Performance
IBM Advanced Technical Support
ddcanan AT us.ibm DOT com