Resource utilization, maxnummp, mmbackup

lipi

Hi,

I have 23 LTO-4 drives attached to my TSM server and a filesystem with 500 TiB of used storage under GPFS. I also have 2 servers dedicated to backing up the data. All 3 servers are connected through 10 Gbps links (one each).
On the TSM server I have local RAID 6 disk storage, which delivers 6 Gbps or less (the main bottleneck). On the drive side I get about 1.4 Gbps per drive (there the bottleneck is the drive, not the network).

I am using mmbackup for the first time, and I want to perform a backup of the full filesystem. I tried different parameters but during execution I get the following errors:

a) 03/21/16 11:29:42 ANS1311E Server out of data storage space
b) 03/22/16 08:05:23 ANS0326E This node has exceeded its maximum number of mount points.

My main bottleneck is the disk. It is a storage pool of 12 TiB; when it fills up, I receive the ANS1311E. I always have a migration process running. I see that when this happens, maxnummp (16 for my client) is reached and data begins to go directly to the drives (which is a good thing).

For a), I suspect that at the point where all sessions are writing directly to a drive, if one session finishes and the migration process has managed to free some space (e.g. the disk storage pool is at 99% rather than 100%), a new session starts, begins to write into the disk storage pool, and receives ANS1311E a few minutes later.

For b), I don't know why I get it.

Tried:
- Total dsmc threads = 4, RESOURCEUTILIZATION=10, nummp=16
- Total dsmc threads = 8, RESOURCEUTILIZATION=6, nummp=16 (27 sessions seen; current)
- Total dsmc threads = 12, RESOURCEUTILIZATION=10, nummp=16
- Total dsmc threads = 24, RESOURCEUTILIZATION=10, nummp=12 (125 sessions seen; first try)

maxsessions=255
maxnummp for client=16

In the current try, I assumed that 8 threads * 2 consumer sessions = max. 16 mount points used, but I still see message b).


If I reduce the number of dsmc threads, I lose performance. What I want is for everything to go directly to the 16 drives, skipping the disk if possible.

Or, what would be your approach in this case to maximize performance while avoiding these errors? Right now I get between 6 and 8 Gbps (pretty good).
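
For reference, this is roughly what I have configured on my side (the values are the ones mentioned above; treat the exact names and layout as illustrative):

Code:
# dsm.sys stanza on each backup node (illustrative)
SErvername           TSM2
  COMMMethod          TCPip
  TCPServeraddress    tsmserver
  ASNODENAME          STG_PROJECTS_PROXY
  RESOURCEUTILIZATION 6

# on the TSM server: mount point limit for the target node, and a way to check it
update node STG_PROJECTS_PROXY maxnummp=16
query node STG_PROJECTS_PROXY format=detailed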
 
There isn't any information about how many nodes you have and are trying to back up at the same time.

For the ANS0326E error, set RESOURCEUTILIZATION to 10, a maximum of 2 dsmc threads, and MAXNUMMP to 20. If you fire up two dsmc sessions, each will take 8 drives and you should not see the ANS0326E message.
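
Something along these lines (substitute your own node name, and this assumes the dsmc process count on your side is driven by mmbackup's --backup-threads option):

Code:
# client side, in dsm.sys on each backup node
RESOURCEUTILIZATION 10

# server side: raise the mount point limit for the node the backups run under
update node YOUR_PROXY_NODE maxnummp=20

# and start mmbackup with fewer dsmc processes, e.g.
/usr/lpp/mmfs/bin/mmbackup /dev/gpfs_projects --tsm-servers tsmserver --backup-threads 2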

However, I would not approach the solution for faster backup this way. Rather, I would optimize the disk pool, dump the backup data there, and offload to the tape drives, setting the number of migration processes as high as possible (though not as many as the total tape drive count).

The migration trigger point should be set at a value such that the utilization level of the disk pool stays more or less constant as data is ingested and written out to tape.
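
For the migration side, something like this (the pool name is a placeholder; pick thresholds that match your ingest rate):

Code:
# keep the pool draining early and with several parallel processes
update stgpool YOUR_DISK_POOL highmig=50 lowmig=0 migprocess=8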

One good practice is to divide the disk pool into smaller chunks and assign these to different storage pools, which in turn are used by the various domains that you have. Thus, if you have four domains such as Windows, SQL, UNIX, and Oracle, divide the disk pool by four and assign one piece to each domain's storage pool.
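
Roughly (pool, domain, and path names here are just examples):

Code:
# one staging pool per domain, each on its own slice of the disk
define stgpool DISKPOOL_WIN disk description="Windows staging"
define volume  DISKPOOL_WIN /tsmdisk_win/vol01.dsm formatsize=524288
update copygroup WINDOWS STANDARD STANDARD standard type=backup destination=DISKPOOL_WIN
activate policyset WINDOWS STANDARD
# repeat for the SQL, UNIX and Oracle domains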
 
However, I would not approach the solution for faster backup this way. Rather, I would optimize the disk pool, dump the backup data there.
That's where I would put my focus too. Speed up the backup to disk.

One good practice is to divide the disk pool into smaller chunks and assign these to different storage pools, which in turn are used by the various domains that you have. Thus, if you have four domains such as Windows, SQL, UNIX, and Oracle, divide the disk pool by four and assign one piece to each domain's storage pool.
Or, if he decides to just use one large disk pool, spread it across as many filesystems as possible on fast disks. That offers similar benefits to having many storage pools, performance-wise, because more disks are spinning.
 
Thank you for your answers.

I must clarify that I have one strong constraint, a.k.a. not enough resources (money): I cannot increase the disk storage pool performance. I have what I have; it does not depend on me.

Regarding your comments:

There isn't any information about how many nodes you have and are trying to back up at the same time.

In the description of the environment I say: I also have 2 servers dedicated to backing up the data. All 3 servers are connected through 10 Gbps links (one each).
So: 1 TSM server, plus 2 nodes acting as TSM clients (configured under 1 node using the proxy feature).

For the ANS0326E error, set RESOURCEUTILIZATION to 10, a maximum of 2 dsmc threads, and MAXNUMMP to 20. If you fire up two dsmc sessions, each will take 8 drives and you should not see the ANS0326E message.

Maybe I have a misconception between threads and dsmc processes, but the option I can specify in mmbackup is [--backup-threads BackupThreads], and each backup thread seems to map to one dsmc process. With the command line below I get 4 dsmc processes on each client, which between them show 22 LWPs (so 22 - 4 = 18 additional threads):

/usr/lpp/mmfs/bin/mmbackup /dev/gpfs_projects --tsm-servers tsmserver -L 2 -N dbackup1,dbackup3 --quote -a 16 --expire-threads 24 --backup-threads 4 -n 24 -g /gpfs/projects/.bsc_mmbackup -t full

During the backup phase, the thread and process counts:
dbackup3:/opt/mmbackup # ps -eLf|grep dsmc|grep -v grep|wc -l
22
dbackup3:/opt/mmbackup # ps aux|grep dsmc|grep -v grep | wc -l
4
 
I don't think you can achieve a high throughput without increasing your disk pool.

If you cannot increase your disk pool size, then you must lower your dsmc threads.
 
I agree with Moon-buddy on this; you can't squeeze blood out of a stone.
One little tweak that may help a bit: since you use your disk solely as a short-term staging area, going with RAID-5 will give you more disk space and slightly better performance. Use multiple file systems so that TSM can spread the load and use more write queues.
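
Something like this, assuming the disk is carved into several smaller file systems (pool name and paths are placeholders):

Code:
# one pre-formatted volume per file system, so writes spread across more spindles and queues
# FORMATSIZE is in MB (here roughly 1 TiB per volume)
define volume YOUR_DISK_POOL /tsmdisk1/vol01.dsm formatsize=1048576
define volume YOUR_DISK_POOL /tsmdisk2/vol02.dsm formatsize=1048576
define volume YOUR_DISK_POOL /tsmdisk3/vol03.dsm formatsize=1048576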
 
Anyway, I got 56 TiB/day while the disk storage pool had space available.

When the disk gets full and the clients start writing directly to the drives, I get 14 TiB/day. Here there is a problem: I have 4 threads per backup node but get only 1 Gbps per node (with 2 nodes, 2 Gbps in total to the TSM server), when I should be able to get up to 10 Gbps to the TSM server.

I inspected the dsmc processes with strace -p and I see that only one process on each node is writing files, while the others are waiting (futex syscall).

This means that only two tape drives are used at a time, one for each backup node, despite there being 16 drives mounted and 8 dsmc processes.

Resource utilization is 6 on each node, and I see 16 sessions in TSM.


Any hints on where to look to understand why only two dsmc processes are writing (one on each node) while the others are in a waiting state?


tsm: TSM2>q mount f=d
ANR8488I LTO volume 009432 is mounted R/W in drive LTO4_0_1_1_8 (/dev/sl8500/by-bay/drv_46) -- owning server: TSM2, status: IN USE (session: 0, process: 2).
ANR8488I LTO volume 002534 is mounted R/W in drive LTO4_0_3_1_4 (/dev/sl8500/by-bay/drv_15) -- owning server: TSM2, status: IN USE (session: 568463, process: 0).
ANR8489I LTO volume 002926 is mounted R/W in drive LTO4_0_1_1_12 (/dev/sl8500/by-bay/drv_45) -- owning server: TSM2, status: IDLE (session: 0, process: 0).
ANR8488I LTO volume 005813 is mounted R/W in drive LTO4_0_2_1_11 (/dev/sl8500/by-bay/drv_18) -- owning server: TSM2, status: IN USE (session: 568459, process: 0).
ANR8488I LTO volume 009435 is mounted R/W in drive LTO4_0_0_1_11 (/dev/sl8500/by-bay/drv_50) -- owning server: TSM2, status: IN USE (session: 566744, process: 0).
ANR8488I LTO volume 009466 is mounted R/W in drive LTO4_0_1_1_11 (/dev/sl8500/by-bay/drv_34) -- owning server: TSM2, status: IN USE (session: 568532, process: 0).
ANR8488I LTO volume 009433 is mounted R/W in drive LTO4_0_3_1_11 (/dev/sl8500/by-bay/drv_02) -- owning server: TSM2, status: IN USE (session: 568601, process: 0).
ANR8488I LTO volume 002236 is mounted R/W in drive LTO4_0_1_1_7 (/dev/sl8500/by-bay/drv_35) -- owning server: TSM2, status: IN USE (session: 568560, process: 0).
ANR8487I Mount point in device class LTO4C_CLASS is waiting for the volume mount to complete -- owning server: TSM2, status: WAITING FOR VOLUME (session: 568439, process: 0).
ANR8488I LTO volume 009437 is mounted R/W in drive LTO4_0_2_1_0 (/dev/sl8500/by-bay/drv_32) -- owning server: TSM2, status: IN USE (session: 568009, process: 0).
ANR8489I LTO volume 003741 is mounted R/W in drive LTO4_0_0_1_4 (/dev/sl8500/by-bay/drv_63) -- owning server: TSM2, status: IDLE (session: 0, process: 0).
ANR8488I LTO volume 002141 is mounted R/W in drive LTO4_0_3_1_7 (/dev/sl8500/by-bay/drv_03) -- owning server: TSM2, status: IN USE (session: 567996, process: 0).
ANR8489I LTO volume 003126 is mounted R/W in drive LTO4_0_1_1_4 (/dev/sl8500/by-bay/drv_47) -- owning server: TSM2, status: IDLE (session: 0, process: 0).
ANR8488I LTO volume 009436 is mounted R/W in drive LTO4_0_3_1_8 (/dev/sl8500/by-bay/drv_14) -- owning server: TSM2, status: IN USE (session: 568507, process: 0).
ANR8488I LTO volume 009439 is mounted R/W in drive LTO4_0_3_1_0 (/dev/sl8500/by-bay/drv_16) -- owning server: TSM2, status: IN USE (session: 566990, process: 0).
ANR8489I LTO volume 002662 is mounted R/W in drive LTO4_0_3_1_12 (/dev/sl8500/by-bay/drv_13) -- owning server: TSM2, status: IDLE (session: 0, process: 0).
ANR8488I LTO volume 009465 is mounted R/W in drive LTO4_0_2_1_8 (/dev/sl8500/by-bay/drv_30) -- owning server: TSM2, status: IN USE (session: 567962, process: 0).
ANR8334I 17 matches found.


[TSM02][root@tsm02 admin]# dsmadmc -id=admin -pa=***** -tab q sess
IBM Tivoli Storage Manager
Command Line Administrative Interface - Version 7, Release 1, Level 1.0
(c) Copyright by IBM Corporation and other(s) 1990, 2014. All Rights Reserved.

Session established with server TSM2: Linux/x86_64
Server Version 7, Release 1, Level 5.0
Server date/time: 01/04/16 13:11:51 Last access: 01/04/16 13:08:23

ANS8000I Server command: 'q sess'.
(Columns: session number, comm. method, session state, wait time, bytes sent, bytes received, session type, platform, client name.)
4 Tcp/Ip IdleW 2 S 763,2 K 407,8 M Admin Linux/x86_64 TSM1
6 Tcp/Ip IdleW 4 S 759,9 K 1,4 G Admin Linux/x86_64 TSM3
14,212 Tcp/Ip IdleW 212,8 H 466,7 M 135,1 K Admin DSMAPI IBM-OC-TSM2
14,486 Tcp/Ip IdleW 3 S 726,5 K 356,9 M Admin Linux/x86_64 MASTER
266,697 Tcp/Ip Run 0 S 172,5 M 320,6 K Admin DSMAPI IBM-OC-TSM2
507,869 Tcp/Ip IdleW 3 S 20,3 M 3,7 M Admin DSMAPI IBM-OC-TSM2
535,020 Tcp/Ip IdleW 3 S 8,0 M 1,4 M Admin DSMAPI IBM-OC-TSM2
566,623 Tcp/Ip Run 0 S 450,4 K 477,5 K Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS1)
566,639 Tcp/Ip Run 0 S 1,4 K 458,6 M Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS1)
566,990 Tcp/Ip Run 0 S 20,5 K 14,6 G Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS1)
567,969 Tcp/Ip IdleW 4 S 69,3 K 45,3 K Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS3)
567,980 Tcp/Ip Run 0 S 5,6 K 18,7 G Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS3)
567,996 Tcp/Ip Run 0 S 6,8 K 15,3 G Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS3)
568,009 Tcp/Ip Run 0 S 6,4 K 15,4 G Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS3)
568,438 Tcp/Ip Run 0 S 286,8 K 142,4 K Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS1)
568,439 Tcp/Ip Run 0 S 2,6 K 3,3 G Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS1)
568,459 Tcp/Ip Run 0 S 2,6 K 4,5 G Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS1)
568,463 Tcp/Ip Run 0 S 3,5 K 13,6 G Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS1)
568,525 Tcp/Ip Run 0 S 119,5 K 115,0 K Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS3)
568,532 Tcp/Ip Run 0 S 4,8 K 5,3 G Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS3)
568,557 Tcp/Ip Run 0 S 1,5 K 1,5 G Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS3)
568,560 Tcp/Ip Run 0 S 4,3 K 4,1 G Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS3)
568,764 Tcp/Ip IdleW 3,8 M 171 190 Admin DSMAPI IBM-OC-TSM2
568,768 Tcp/Ip IdleW 3,7 M 171 190 Admin DSMAPI IBM-OC-TSM2
568,796 Tcp/Ip IdleW 3,3 M 171 190 Admin DSMAPI IBM-OC-TSM2
568,815 Tcp/Ip IdleW 2,7 M 171 190 Admin DSMAPI IBM-OC-TSM2
568,836 Tcp/Ip Run 0 S 92,0 K 57,2 K Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS1)
568,841 Tcp/Ip IdleW 0 S 1,3 K 1,5 G Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS1)
568,858 Tcp/Ip Run 0 S 1,2 K 1,1 G Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS1)
568,869 Tcp/Ip IdleW 1 S 32,0 K 14,0 K Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS3)
568,870 Tcp/Ip Run 0 S 1,2 K 923,9 M Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS3)
568,873 Tcp/Ip Run 0 S 42,7 K 20,9 K Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS3)
568,876 Tcp/Ip IdleW 1 S 1,2 K 1,011,3 M Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS3)
568,891 Tcp/Ip RecvW 0 S 1,1 K 358,7 M Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS3)
568,894 Tcp/Ip IdleW 1 S 1,1 K 140,8 M Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS1)
568,895 Tcp/Ip IdleW 0 S 975 457,9 M Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS3)
568,899 Tcp/Ip Run 0 S 1,0 K 216,4 M Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS3)
568,910 Tcp/Ip Run 0 S 1,007 719,7 M Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS3)
568,925 Tcp/Ip Run 0 S 29,5 K 15,6 K Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS1)
568,940 Tcp/Ip Run 0 S 875 119,2 M Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS1)
568,965 Tcp/Ip Run 0 S 779 58,8 M Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS1)
568,972 Tcp/Ip Run 0 S 779 24,5 M Node Linux x86-64 STG_PROJECTS_PROXY (STG_GPFS_PROJECTS1)
568,987 Tcp/Ip IdleW 3 S 171 190 Admin DSMAPI IBM-OC-TSM2
568,988 Tcp/Ip Run 0 S 163 200 Admin Linux x86-64 ADMIN

ANS8002I Highest return code was 0.
 
When the disk gets full and the clients start writing directly to the drives, I get 14 TiB/day. Here there is a problem: I have 4 threads per backup node but get only 1 Gbps per node (with 2 nodes, 2 Gbps in total to the TSM server), when I should be able to get up to 10 Gbps to the TSM server.

The real solution is to do the whole backup to disk. Once the disk pool gets full, backups are redirected to tape, but you also have automatic migration that kicks off because the HIGHMIG threshold is reached.

It might be as simple as running migration before that backup, or it may mean increasing the size of the disk pool. Or possibly a combination.
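
For example, drain the pool ahead of the backup window with an administrative schedule (pool name and start time are placeholders):

Code:
define schedule FLUSH_DISKPOOL type=administrative cmd="migrate stgpool YOUR_DISK_POOL lowmig=0 wait=no" active=yes starttime=06:00 period=1 perunits=days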

I inspected the dsmc processes with strace -p and I see that only one process on each node is writing files, while the others are waiting (futex syscall).

This means that only two tape drives are used at a time, one for each backup node, despite there being 16 drives mounted and 8 dsmc processes.

Resource utilization is 6 on each node, and I see 16 sessions in TSM.
For your tape mount issue, what's the MAXNUMMP for these nodes?
Did you check Q MOUNT during the backup to see if other processes had the drives reserved?
How many producer and consumer sessions did you have when going to disk? That number of sessions is not going to increase when the disk gets full and overflows to tape.
Resourceutil 6 should give you 2 producers and 2 consumers, is that what you see?

Any hints on where to look to understand why only two dsmc processes are writing (one on each node) while the others are in a waiting state?
MediaW? IdleW? RecvW?
 
I see in the output that it's IDLEW, which means that the TSM Server is waiting for the client. So either the client has no work to do, or it's gathering data to send, but at this point, it's not sending anything, it's idle. That's not a configuration issue, I don't know mmbackup, so I can't tell you why it may be idle.

If you use "q sess f=d" instead, you will see which one has a volume, which one is waiting for one in the "Media Access Status" column.
 
I am backing up a filesystem of 1 PiB; I cannot have 1 PiB more of disk (nor even 500 TiB) just for backup.

a) MAXNUMMP for Proxy node is: Maximum Mount Points Allowed: 16

b) For the proxy agent nodes (I guess it doesn't matter): STG_GPFS_PROJECTS1, STG_GPFS_PROJECTS3: Maximum Mount Points Allowed: 4

c) q mount: the only processes that have tape drives reserved are the ones from my backup, except for one migration process.

d) The same sessions were seen when going to disk; I don't have the consumer/producer numbers here.

e) Maybe this is the problem: with a resourceutilization of 6 I get a maximum of 4 sessions (2 consumers / 2 producers) on EACH node. That is what I see with "ps aux|grep dsmc".

One more thing I see, with a SHOW THREADS:
...
Thread 88, Parent 71: SdRefCountUpdateThread, Storage 0, AllocCnt 0 HighWaterAmt 0
tid=140737035020032, ptid=140737070278400, det=0, zomb=0, join=0, result=0, sess=0, procToken=0, sessToken=0
lwp=32316
Awaiting cond newQueue->notEmpty (0x0x7ffff03ce830), using mutex newQueue->mutex (0x0x1f06818), at queue.c(1218)

Thread 89, Parent 71: SdRefCountUpdateThread, Storage 0, AllocCnt 0 HighWaterAmt 0
tid=140737033967360, ptid=140737070278400, det=0, zomb=0, join=0, result=0, sess=0, procToken=0, sessToken=0
lwp=32317
Awaiting cond newQueue->notEmpty (0x0x7ffff03ce830), using mutex newQueue->mutex (0x0x1f06818), at queue.c(1218)

Thread 90, Parent 71: SdRefCountUpdateThread, Storage 0, AllocCnt 0 HighWaterAmt 0
tid=140737032914688, ptid=140737070278400, det=0, zomb=0, join=0, result=0, sess=0, procToken=0, sessToken=0
lwp=32318
Awaiting cond newQueue->notEmpty (0x0x7ffff03ce830), using mutex newQueue->mutex (0x0x1f06818), at queue.c(1218)
........

Maybe the threads are waiting for the disk to empty for some operation and don't continue automatically to tape? It's strange, because I have all the drives mounted.



I will take a look at MediaW, etc.
 
I see in the output that it's IDLEW, which means that the TSM Server is waiting for the client. So either the client has no work to do, or it's gathering data to send, but at this point, it's not sending anything, it's idle. That's not a configuration issue, I don't know mmbackup, so I can't tell you why it may be idle.

If you use "q sess f=d" instead, you will see which one has a volume, which one is waiting for one in the "Media Access Status" column.


Looking further at this confirms that some sessions are waiting for disk while others are writing to tape. The client is doing very little: the dsmc processes are waiting on futexes. The IdleW sessions that you mention are only a few.


mmbackup is a wrapper around dsmc selective:
dbackup1:/opt/tivoli/tsm/client/ba/bin # ps aux|grep dsmc
root 18766 1.8 0.0 290772 34476 ? S<l 13:22 0:17 /usr/bin/dsmc selective -filelist=/gpfs/projects/.mmbackupCfg/mmbackupChanged.ix.12474.857F602C.2465.gpfs_projects -servername=TSM2
root 22767 2.2 0.0 288896 36816 ? S<l 13:30 0:10 /usr/bin/dsmc selective -filelist=/gpfs/projects/.mmbackupCfg/mmbackupChanged.ix.12474.857F602C.2467.gpfs_projects -servername=TSM2
root 24820 2.0 0.0 294604 35540 ? S<l 13:36 0:02 /usr/bin/dsmc selective -filelist=/gpfs/projects/.mmbackupCfg/mmbackupChanged.ix.12474.857F602C.2468.gpfs_projects -servername=TSM2
root 25060 3.7 0.0 293500 34528 ? S<l 13:37 0:02 /usr/bin/dsmc selective -filelist=/gpfs/projects/.mmbackupCfg/mmbackupChanged.ix.12474.857F602C.2469.gpfs_projects -servername=TSM2




Fragment of q sess f=d:
..............
Sess Number: 569,727
Comm. Method: Tcp/Ip
Sess State: Run
Wait Time: 0 S
Bytes Sent: 5,2 K
Bytes Recvd: 4,0 G
Sess Type: Node
more... (<ENTER> to continue, 'C' to cancel)

Platform: Linux x86-64
Client Name: STG_PROJECTS_PROXY (STG_GPFS_PROJECTS1)
Media Access Status: Current output volumes: STGPOOL_SEQ_LTO4_MN_PROJECTS,009439,(47 Seconds)
User Name:
Date/Time First Data Sent: 01/04/16 13:22:41
Proxy By Storage Agent:
Actions: BkIns
Failover Mode: No

Sess Number: 569,734
Comm. Method: Tcp/Ip
Sess State: IdleW
Wait Time: 10 S
Bytes Sent: 156,8 K
Bytes Recvd: 77,1 K
Sess Type: Node
Platform: Linux x86-64
Client Name: STG_PROJECTS_PROXY (STG_GPFS_PROJECTS3)
Media Access Status:
User Name:
Date/Time First Data Sent:
Proxy By Storage Agent:
Actions: FSUpd
Failover Mode: No

Sess Number: 569,773
Comm. Method: Tcp/Ip
Sess State: Run
Wait Time: 0 S
Bytes Sent: 4,7 K
Bytes Recvd: 3,8 G
Sess Type: Node
Platform: Linux x86-64
Client Name: STG_PROJECTS_PROXY (STG_GPFS_PROJECTS3)
Media Access Status:
User Name:
Date/Time First Data Sent: 01/04/16 13:23:52
Proxy By Storage Agent:
Actions: BkIns
Failover Mode: No

Sess Number: 569,790
Comm. Method: Tcp/Ip
Sess State: Run
Wait Time: 0 S
Bytes Sent: 4,2 K
Bytes Recvd: 4,6 G
Sess Type: Node
Platform: Linux x86-64
Client Name: STG_PROJECTS_PROXY (STG_GPFS_PROJECTS1)
more... (<ENTER> to continue, 'C' to cancel)

Media Access Status:
User Name:
Date/Time First Data Sent: 01/04/16 13:24:00
Proxy By Storage Agent:
Actions: BkIns
Failover Mode: No

Sess Number: 569,791
Comm. Method: Tcp/Ip
Sess State: Run
Wait Time: 0 S
Bytes Sent: 831
Bytes Recvd: 191,8 M
Sess Type: Node
Platform: Linux x86-64
Client Name: STG_PROJECTS_PROXY (STG_GPFS_PROJECTS3)
Media Access Status: Current output volumes: STGPOOL_SEQ_LTO4_MN_PROJECTS,002926,(732 Seconds)
User Name:
Date/Time First Data Sent: 01/04/16 13:24:00
Proxy By Storage Agent:
Actions: BkIns
Failover Mode: No

Sess Number: 569,810
Comm. Method: Tcp/Ip
Sess State: Run
Wait Time: 0 S
Bytes Sent: 4,7 K
Bytes Recvd: 4,3 G
Sess Type: Node
Platform: Linux x86-64
Client Name: STG_PROJECTS_PROXY (STG_GPFS_PROJECTS3)
Media Access Status: Current output volumes: STGPOOL_SEQ_LTO4_MN_PROJECTS,009466,(45 Seconds)
User Name:
Date/Time First Data Sent: 01/04/16 13:24:12
Proxy By Storage Agent:
Actions: BkIns
Failover Mode: No
..............
 
I don't see any wait in Q SESS other than the few in IdleW, and that's only for a couple of seconds, so nothing alarming there on the IdleW.

What kind of throughput do you get writing normally for the other client backups, disk migration, backup stgpool?

Other than that, you might need to do some performance analysis: http://www.ibm.com/support/knowledg...m.itsm.perf.doc/r_instrumentation_server.html
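
The rough collection sequence, from a shell on the TSM server, looks something like this (assuming a 7.1 server where the administrative INSTRUMENTATION BEGIN/END commands are available; credentials and output path are placeholders):

Code:
# start collecting while the backup is running
dsmadmc -id=admin -pa=***** "instrumentation begin"
# let it gather data for a minute or two
sleep 60
# stop and write the report to a file for analysis
dsmadmc -id=admin -pa=***** "instrumentation end file=/tmp/server_inst.txt"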

I don't have the numbers right now, but I remember having seen good performance for both the disks and the tape drives.

I did a basic instrumentation run:

Code:
Instrumentation ended.
Instrumentation began 14:00:13.405 ended 14:01:05.629 elapsed 52.224
....

TOTAL SERVER SUMMARY
Operation  Count  Tottime  Avgtime  Maxtime InstTput RealTput  Total KB
----------------------------------------------------------------------------
Disk Read  21904  11.261  0.001  0.332 490922.1 105857.0  5528320
Disk Write  23820  17.245  0.001  1.153 344965.2 113911.7  5948968
Disk Commit  181  0.443  0.002  0.047
Tape Write  36188  55.945  0.002  3.266 165593.2 177390.8  9264128
Tape Commit  14  32.589  2.328  3.638
Tape Misc  14  0.036  0.003  0.002
Network Recv  1115229  500.236  0.000  36.097  19325.2 185107.9  9667147
Network Send  5610  0.071  0.000  0.000  5170.1  7.0  368
DB2 Fetch Prep  1985  0.384  0.000  0.111
DB2 MFtch Prep  547  0.213  0.000  0.046
DB2 Inser Prep  20  0.198  0.010  0.060
DB2 Delet Prep  4  0.007  0.002  0.003
DB2 Updat Prep  1232  0.234  0.000  0.033
DB2 Fetch Exec  7846  1.145  0.000  0.057
DB2 MFtch Exec  3331  1.022  0.000  0.045
DB2 Inser Exec  3145  9.665  0.003  1.425
DB2 Delet Exec  1447  0.770  0.001  0.040
DB2 Updat Exec  2987  0.902  0.000  0.452
DB2 Fetch  8158  0.038  0.000  0.000
DB2 MFetch  4440  0.033  0.000  0.000
DB2 CR Prep  5  0.012  0.003  0.003
DB2 CR Exec  312  0.033  0.000  0.000
DB2 Commit  5125  5.243  0.001  0.287
DB2 Reg Prep  779  0.225  0.000  0.061
DB2 Reg Exec  12415  3.844  0.000  0.086
DB2 Reg Fetch  6924  0.023  0.000  0.000
DB2 Connect  13  0.025  0.002  0.002
Tm Lock Wait  809  798.793  0.987  15.441
Acquire Latch  62  0.000  0.000  0.000
Acquire XLatch  427  0.000  0.000  0.000
Sleep  82  521.277  6.357  46.896
Thread Wait  286333 1763.244  0.006  42.267

Instrumentation output complete.

==============================================================================
DB2 STATISTICS DURING INSTRUMENTATION INTERVAL:

Note that items marked (*) will be zero unless DB2
monitor switches (SORT and LOCK) are enabled.
------------------------------------------------------------------------------
  Deadlocks detected:  0 -->  0.0/sec
  Number of lock escalations:  0 -->  0.0/sec
  Lock waits:  0 -->  0.0/sec
  Time waited on locks(*):  0.000 sec
  Locks held:  176 before, 173 after
 Intern Rollbacks Due To Dlock:  0 -->  0.0/sec
  Total sorts:  1668 -->  31.9/sec, 0.000 sec/sort
  Total sort time(*):  0 -->  0.0/sec
  Sort overflows:  0 -->  0.0/sec
  Direct reads from database:  994 -->  19.0/sec, 0.002 sec/read
  Direct read time:  1.516
  Direct writes to database:  0 -->  0.0/sec
  Direct write time:  0.000
  Number of Log Pages Written:  5641 -->  108.0/sec, 0.0013 sec latency
  Log Write Time:  7.398 sec
  Number of Log Writes:  3147 -->  60.3/sec
  Number of full log buffers:  0 -->  0.0/sec
  Bufferpool no victim buffers:  0 -->  0.0/sec
  Internal auto rebinds:  0 -->  0.0/sec
  Hash join overflows:  0 -->  0.0/sec
  Internal rows deleted:  0 -->  0.0/sec
  Internal rows inserted:  0 -->  0.0/sec
  Internal rows updated:  0 -->  0.0/sec
  Rows read:  8623825 -->  165130.2/sec
  Rows selected and returned:  17308 -->  331.4/sec
  Rows deleted:  2315 -->  44.3/sec
  Rows inserted:  13752 -->  263.3/sec
  Rows updated:  3496 -->  66.9/sec
  Select statments:  18109 -->  346.8/sec
  Upd/ins/del statements:  19303 -->  369.6/sec
  Number of commits:  5140 -->  98.4/sec
  Number of rollbacks:  42 -->  0.8/sec
 

Attachment: inst010416.txt (84 KB)
The attached file (shth.txt) is the full SHOW THREADS output. You can also see the migration process currently running:

ANS8000I Server command: 'q proc'.

Process Process Description Process Status
Number
-------- -------------------- -------------------------------------------------
2 Migration Disk Storage Pool STGPOOL_DSK_MN_PROJECTS, Moved
Files: 36768903, Moved Bytes: 74,519 GB,
Deduplicated Bytes: 0 bytes, Unreadable Files:
0, Unreadable Bytes: 0 bytes. Current Physical
File (bytes): 655 MB Current output volume(s):
006007.

ANS8002I Highest return code was 0.

And q stg (columns: pool name, device class, storage type, estimated capacity, pct util, pct migr, high mig %, low mig %, next storage pool):
ANS8000I Server command: 'q stg'.
STGPOOL_DSK_MN_APPS DISK DEVCLASS 2,048 G 0,0 0,0 90 60 STGPOOL_SEQ_LTO4_MN_APPS
STGPOOL_DSK_MN_ARCHIVE DISK DEVCLASS 4,096 G 33,5 33,5 90 60 STGPOOL_SEQ_LTO4_MN_ARCHIVE
STGPOOL_DSK_MN_PROJECTS DISK DEVCLASS 14,336 G 100,0 100,0 90 60 STGPOOL_SEQ_LTO4_MN_PROJECTS
STGPOOL_SEQ_LTO4_MN_APPS LTO4C_CLASS DEVCLASS 108,386,511,505 G 0,0 0,0 90 70
STGPOOL_SEQ_LTO4_MN_ARCHIVE LTO4C_CLASS DEVCLASS 84,491,230,829 G 0,0 0,0 90 70
STGPOOL_SEQ_LTO4_MN_ARCH_COPY LTO4C_CLASS DEVCLASS 83,632,932,293 G 0,0
STGPOOL_SEQ_LTO4_MN_PROJECTS LTO4C_CLASS DEVCLASS 111,616,746,469 G 0,0 0,0 95 70
 

Attachment: shth.txt (95.3 KB)
Sorry, but looking at instrumentation tracing is quite time consuming; that's not something you're likely to get much help with in a forum.

My recommendation: check http://www.ibm.com/support/knowledg...erf.doc/c_instrumentation_server_threads.html and scroll down to "Backups for sequential-access storage pools" to understand the various threads.

And check these too:
http://www.ibm.com/support/knowledg...rf.doc/r_instrumentation_client_examples.html
http://www.ibm.com/support/knowledg...ibm.itsm.perf.doc/t_ptg_bkup_rstore_diag.html
 
It seems to me you're getting way too deep and can no longer see the forest for the trees! Can we summarize the problem? Correct me if I'm wrong:
You have multiple sessions coming in.
Your disk space is less than a night's backup.
While backing up to disk, performance is acceptable; to tape, it is not.

Did you try something simple, like setting the migration threshold to 50% or less with MIGPROCESS nice and high?
I can't guarantee this, as your backups are coming from only 3 proxies, so all of the backups belong to a single node. I haven't had to deal much with migration lately, but you may only ever get 1 migration process running (since migration processes work per node). If that's the case, you may need to put your threshold WAY down.
Use a FILE device class for your pool to get the best sequential access.

And I just remembered something else. As long as a migration is running, the FILE volumes being migrated will NOT free up until the migration process completes. I don't know if there's a similar problem with DISK pools. So either go with DISK pools, or force the migration process to end every half hour or so (there are a couple of ways to do that). As long as you're backing up relatively small files, that should work.
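
One way to do the periodic kick (pool name is a placeholder; DURATION is in minutes, so each run completes and the volumes it has emptied can be released):

Code:
# run migration in bounded chunks instead of one long-running process
migrate stgpool YOUR_DISK_POOL lowmig=0 duration=25 wait=no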
 
My two cents:

You cannot get the throughput you are aiming for even with all the tuning that you do or may do.

Here is the reason why:

The cache or disk pool should really be big, approaching the size of the anticipated data to be backed up. When you say you need to back up 1 PiB a day, you really need a big disk pool to hold this amount of data and then migrate it over to the tape drives.

An example with some simple calculations:

1. Ingest is 20 TiB/hour (assumed from the collective input of the nodes)
2. Outbound to tape (assumed ideal) is 10 TiB/hour
3. The ratio is 2:1 (ingest to outbound), which also governs how fast the pool can be emptied
4. Assume the disk pool is 100 TiB and migration is set to 25% - meaning TSM will start to write to tape when the disk pool is a little over 25% full
5. With the rates from (1), (2) and (3), the ingest outruns the outbound traffic from the start, and the pool fills up in roughly 8 to 9 hours (a quick back-of-the-envelope check is sketched below)
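
The arithmetic behind that estimate, under exactly those assumptions (plain shell, nothing TSM-specific):

Code:
# 100 TiB pool, 20 TiB/h in, 10 TiB/h out once migration starts at the 25% trigger
ingest=20; drain=10; pool=100; trigger=25
t1=$(echo "scale=2; $trigger/$ingest" | bc)                  # hours before migration kicks in
t2=$(echo "scale=2; ($pool-$trigger)/($ingest-$drain)" | bc) # hours from there until the pool is full
echo "pool full after roughly $(echo "$t1+$t2" | bc) hours"  # prints ~8.75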

The sample above is banking on the 'ideal mix'. The truth of the matter, however, is that you will never get a sustained 10 TiB/hour to tape. Writing to tape depends on factors like file sizes and operational overheads such as CRC checks. The more small files, the worse it gets.

In your case, the ante is even higher. You want 10 Gbps, which translates to roughly 4 TiB/hour (close to 100 TiB/day) sustained. This means that you need to sustain writes to tape uninterrupted. I don't think this is possible with a physical system, i.e., physical tape.

In other words, you need a bigger disk pool to backup your data while the back end write to tape is chugging along.

A VTL environment may solve your problem - dump to VTL (replacing the disk pool), then migrate to tape.

Or, try Dan's suggestion of using a FILE device class, keep the volume sizes smallish (around 400 GB), and have a bigger disk pool.
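
Roughly, for the devclass=file route (directories, sizes and pool names are placeholders; the next pool would be your existing LTO pool from the q stg output):

Code:
define devclass FILEPOOL_CLASS devtype=file maxcapacity=400G mountlimit=32 directory=/tsmfile1,/tsmfile2,/tsmfile3
define stgpool FILEPOOL FILEPOOL_CLASS maxscratch=200 highmig=50 lowmig=0 migprocess=8 nextstgpool=STGPOOL_SEQ_LTO4_MN_PROJECTS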
 
Thank you for your suggestions.

@DanGiles: Your assumptions are right, but I think the migration threshold will not help. I already have the storage pool at 99% and it has been migrating since the second day. I also have MIGPROCESS set, but as you mention, I only see 1 process.

So, do you mean that I should change the DISK device class to a FILE device class? What's the difference? My performance with the disk storage pool is good until it fills up. The problem is writing to tape...


@moon-buddy: I need to back up 1 PiB, but not in one day :) . Right now I am still backing up at 14 TiB/day, which is very low performance. All data is going directly to tape since the disk storage pool is full and migration doesn't have time to empty it.


What I don't understand is why, if I have 16 drives allocated to my client and all of them are IN USE, I only get 1 Gbps. Each LTO-4 drive should give about 1 Gbps, so 1 x 16 = 16 Gbps in the best case. OK, allowing for non-sequential access, small files, CRC, etc., shouldn't I still get at least 6-8 Gbps? Yet I only get 14 TiB/day. I see moments where the throughput is 5-6 Gbps, but most of the time it is sustained at 1-2 Gbps.

The I/O wait percentage on the server and the clients is 0, so maybe the problem is in dsmc selective; I don't know.
 
I see that:

Code:
dbackup1:~ # ps aux|grep dsm
root  23032  1.6  0.0 273996 35004 ?  S<l  09:58  0:13 /usr/bin/dsmc selective -filelist=/gpfs/projects/.mmbackupCfg/mmbackupChanged.ix.12474.857F602C.5670.gpfs_projects -servername=TSM2
root  25134  1.5  0.0 293704 34408 ?  S<l  10:03  0:08 /usr/bin/dsmc selective -filelist=/gpfs/projects/.mmbackupCfg/mmbackupChanged.ix.12474.857F602C.5671.gpfs_projects -servername=TSM2
root  25247  3.2  0.0 359188 34400 ?  S<l  10:07  0:10 /usr/bin/dsmc selective -filelist=/gpfs/projects/.mmbackupCfg/mmbackupChanged.ix.12474.857F602C.5672.gpfs_projects -servername=TSM2
root  25521  7.9  0.0 293636 34532 ?  S<l  10:09  0:13 /usr/bin/dsmc selective -filelist=/gpfs/projects/.mmbackupCfg/mmbackupChanged.ix.12474.857F602C.5673.gpfs_projects -servername=TSM2
root  27176  0.0  0.0  5524  856 pts/3  S+  10:12  0:00 grep dsm



dbackup1:~ # strace -p 25247
Process 25247 attached - interrupt to quit
restart_syscall(<... resuming interrupted call ...>) = -1 ETIMEDOUT (Connection timed out)
futex(0x100cd30, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x10272a4, 0x189 /* FUTEX_??? */, 631, {1460448746, 179903000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
futex(0x100cd30, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x10272a4, 0x189 /* FUTEX_??? */, 633, {1460448747, 180007000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
futex(0x100cd30, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x10272a4, 0x189 /* FUTEX_??? */, 635, {1460448748, 180099000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
futex(0x100cd30, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x10272a4, 0x189 /* FUTEX_??? */, 637, {1460448749, 180190000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
futex(0x100cd30, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x10272a4, 0x189 /* FUTEX_??? */, 639, {1460448750, 180295000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
futex(0x100cd30, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x10272a4, 0x189 /* FUTEX_??? */, 641, {1460448751, 180401000}, ffffffff^C <unfinished ...>
Process 25247 detached



dbackup1:~ # strace -p 25521
Process 25521 attached - interrupt to quit
restart_syscall(<... resuming interrupted call ...>) = -1 ETIMEDOUT (Connection timed out)
futex(0x100cea0, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x10272a4, 0x189 /* FUTEX_??? */, 369, {1460448757, 612906000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
futex(0x100cea0, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x10272a4, 0x189 /* FUTEX_??? */, 371, {1460448758, 613067000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
futex(0x100cea0, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x10272a4, 0x189 /* FUTEX_??? */, 373, {1460448759, 613277000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
futex(0x100cea0, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x10272a4, 0x189 /* FUTEX_??? */, 375, {1460448760, 613436000}, ffffffff^C <unfinished ...>
Process 25521 detached



dbackup1:~ # strace -p 25134
Process 25134 attached - interrupt to quit
restart_syscall(<... resuming interrupted call ...>) = -1 ETIMEDOUT (Connection timed out)
futex(0x10272f0, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x100cf64, 0x189 /* FUTEX_??? */, 1145, {1460448765, 532763000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
futex(0x10272f0, FUTEX_WAKE_PRIVATE, 1) = 0
write(1, "Normal File-->  43,143,237"..., 297) = 297
futex(0x100cf64, 0x189 /* FUTEX_??? */, 1147, {1460448766, 533062000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
futex(0x10272f0, FUTEX_WAKE_PRIVATE, 1) = 0
write(1, "Normal File-->  8,084,094"..., 217) = 217
write(1, "Normal File-->  6,861"..., 211) = 211
write(1, "Directory-->  4,096"..., 196) = 196
write(1, "Directory-->  4,096"..., 180) = 180
write(1, "Normal File-->  14,008,682"..., 217) = 217
write(1, "Directory-->  4,096"..., 184) = 184
write(1, "Directory-->  4,096"..., 188) = 188
write(1, "Normal File-->  41,837,242"..., 217) = 217
futex(0x100cf64, 0x189 /* FUTEX_??? */, 1149, {1460448767, 533748000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
futex(0x10272f0, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x100cf64, 0x189 /* FUTEX_??? */, 1151, {1460448768, 533915000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
futex(0x10272f0, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x100cf64, 0x189 /* FUTEX_??? */, 1153, {1460448769, 534021000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
futex(0x10272f0, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x100cf64, 0x189 /* FUTEX_??? */, 1155, {1460448770, 534145000}, ffffffff^C <unfinished ...>
Process 25134 detached
dbackup1:~ # strace -p 23032
Process 23032 attached - interrupt to quit
select(0, NULL, NULL, NULL, {0, 671060}) = 0 (Timeout)
getuid()  = 0
munmap(0x7f4876aeb000, 958371)  = 0
munmap(0x7f4876bd5000, 958371)  = 0
close(3)  = 0
munmap(0x7f4874c67000, 1329784)  = 0
munmap(0x7f487052e000, 2555120)  = 0
munmap(0x7f4874dac000, 1386096)  = 0
write(1, "ANS1804E Selective Backup proces"..., 1039) = 1039
exit_group(4)  = ?
Process 23032 detached



dbackup1:~ # strace -p 23032
attach: ptrace(PTRACE_ATTACH, ...): No such process



dbackup1:~ # ps aux|grep dsm
root  25134  1.5  0.0 293704 34460 ?  S<l  10:03  0:08 /usr/bin/dsmc selective -filelist=/gpfs/projects/.mmbackupCfg/mmbackupChanged.ix.12474.857F602C.5671.gpfs_projects -servername=TSM2
root  25247  3.2  0.0 359188 34408 ?  S<l  10:07  0:11 /usr/bin/dsmc selective -filelist=/gpfs/projects/.mmbackupCfg/mmbackupChanged.ix.12474.857F602C.5672.gpfs_projects -servername=TSM2
root  25521  6.6  0.0 293636 34744 ?  S<l  10:09  0:14 /usr/bin/dsmc selective -filelist=/gpfs/projects/.mmbackupCfg/mmbackupChanged.ix.12474.857F602C.5673.gpfs_projects -servername=TSM2
root  27823  1.4  0.0 271804 32728 ?  S<l  10:12  0:00 /usr/bin/dsmc selective -filelist=/gpfs/projects/.mmbackupCfg/mmbackupChanged.ix.12474.857F602C.5674.gpfs_projects -servername=TSM2
root  27863  0.0  0.0  5524  852 pts/3  S+  10:13  0:00 grep dsm



dbackup1:~ # strace -p 25134
Process 25134 attached - interrupt to quit
restart_syscall(<... resuming interrupted call ...>) = -1 ETIMEDOUT (Connection timed out)
futex(0x10272f0, FUTEX_WAKE_PRIVATE, 1) = 0
futex(0x100cf64, 0x189 /* FUTEX_??? */, 1199, {1460448792, 537059000}, ffffffff) = -1 ETIMEDOUT (Connection timed out)
futex(0x10272f0, FUTEX_WAKE_PRIVATE, 1) = 0
write(1, "ANS1898I ***** Processed  1,5"..., 218) = 218
write(1, "Directory-->  4,096"..., 175) = 175
write(1, "Normal File-->  50,273"..., 151) = 151
write(1, "Directory-->  4,096"..., 172) = 172
write(1, "Directory-->  4,096"..., 176) = 176
write(1, "Directory-->  4,096"..., 176) = 176
write(1, "Directory-->  4,096"..., 180) = 180
write(1, "Directory-->  4,096"..., 140) = 140
write(1, "Normal File-->  8,282"..., 151) = 151
write(1, "Normal File-->  309"..., 159) = 159
write(1, "Normal File-->  384"..., 153) = 153
write(1, "Normal File-->  49,458"..., 151) = 151
write(1, "Normal File-->  7,996"..., 151) = 151
write(1, "Normal File-->  50,273"..., 179) = 179
write(1, "Normal File-->  8,380"..., 179) = 179
write(1, "Normal File-->  131,431"..., 181) = 181
write(1, "Directory-->  4,096"..., 183) = 183
write(1, "Directory-->  4,096"..., 183) = 183
write(1, "Directory-->  4,096"..., 187) = 187
write(1, "Normal File-->  816,004"..., 129) = 129
write(1, "Normal File-->  2,665"..., 218) = 218
 