Identify Duplicates process keeps running longer than the configured duration

Hello chaps.
I have configured TSM using tsmconfig.pl from the Blueprint; the deduplication process should run from 21:00 to 09:00, within the backup window:
Code:
q sched t=a
*  Schedule Name  Start Date/Time       Duration  Period  Day
-  -------------  --------------------  --------  ------  ---
   DBBACKUP       2014-04-25, 10:00:00  15 M      1 D     Any
   DEDUPLICATE    2014-04-25, 21:00:00  15 M      1 D     Any
   EXPIRE         2014-04-25, 07:00:00  15 M      1 D     Any
   RECLAIM        2014-04-25, 11:00:00  15 M      1 D     Any

Code:
q script deduplicate f=d

Name         Line Number  Command                                                   Last Update by (administrator)  Last Update Date/Time
-----------  -----------  --------------------------------------------------------  ------------------------------  ---------------------
DEDUPLICATE  Description  Run identify duplicate processes.                         ADMIN                           2014-04-25, 14:53:53
DEDUPLICATE  10           identify duplicates DEDUPPOOL numprocess=12 duration=720  ADMIN                           2014-05-30, 12:26:33
By 13:00, EXPIRE, DBBACKUP and RECLAIM had already completed, but all of the identify processes were still running. Does anyone know how to restrict the identify run time?

Please excuse my bad English, I hope you can understand what I want to say.
 
How many identify processes is your storage pool set up for? Those will run 24/7. If you want to control when they run, update the stgpool to have 0 identify processes, and keep your current schedule to start X number of processes as needed.
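As a minimal sketch of that approach (assuming the pool is DEDUPPOOL and reusing the 12 processes / 720 minutes from your existing script; run it as a macro or paste the commands into dsmadmc):

Code:
/* stop the pool from starting its own identify processes */
update stgpool DEDUPPOOL identifyprocess=0
/* let the scheduled command decide when and how long identify runs */
identify duplicates DEDUPPOOL numprocess=12 duration=720

With IDENTIFYPROCESS=0 in the pool definition, the only identify processes running should be the ones the schedule starts, and those should honour the DURATION value.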

NUMPRocess
Specifies the number of duplicate-identification processes to run after the command completes. You can specify 0 - 50 processes. The value that you specify for this parameter overrides the value that you specified in the storage pool definition or the most recent value that was specified when you last issued this command. If you specify zero, all duplicate-identification processes stop. This parameter is optional. If you do not specify a value, the Tivoli Storage Manager server starts or stops duplicate-identification processes so that the number of processes is the same as the number that is specified in the storage pool definition.

For example, suppose that you define a new storage pool and specify two duplicate-identification processes. Later, you issue the IDENTIFY DUPLICATES command to increase the number of processes to four. When you issue the IDENTIFY DUPLICATES command again without specifying a value for the NUMPROCESS parameter, the server stops two duplicate-identification processes.

If you specified 0 processes when you defined the storage pool definition and you issue IDENTIFY DUPLICATES without specifying a value for NUMPROCESS, any running duplicate-identification processes stop, and the server does not start any new processes.

Remember:
When you issue IDENTIFY DUPLICATES without specifying a value for NUMPROCESS, the DURATION parameter is not available. Duplicate-identification processes specified in the storage pool definition run indefinitely, or until you reissue the IDENTIFY DUPLICATES command, update the storage pool definition, or cancel a process.

When the server stops a duplicate-identification process, the process completes the current physical file and then stops. As a result, it might take several minutes to reach the number of duplicate-identification processes that you specified as a value for this parameter.
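So, if you ever need to shut the identify processes down before their duration is up, something like this should do it (DEDUPPOOL assumed from your output above):

Code:
/* NUMPROCESS=0 stops all duplicate-identification processes for the pool */
identify duplicates DEDUPPOOL numprocess=0

Each process still finishes the physical file it is working on before it ends, as noted above.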

More info:
http://www-01.ibm.com/support/knowl...ef.doc/r_cmd_duplicates_identify.html?lang=en
 
How many identify processes is your storage pool setup for?
All stgpool maintenance is already disabled.
Code:
                    Storage Pool Name: DEDUPPOOL
                    Storage Pool Type: Primary
                    Device Class Name: FILEDEV
                   Estimated Capacity: 48,353.94 G
                   Space Trigger Util: 93.55
                             Pct Util: 46.247
                             Pct Migr: 46.247
                          Pct Logical: 99.364
                         High Mig Pct: 90
                          Low Mig Pct: 70
                      Migration Delay: 0
                   Migration Continue: Yes
                  Migration Processes: 1
                Reclamation Processes: 10
                    Next Storage Pool: 
                 Reclaim Storage Pool: 
               Maximum Size Threshold: No Limit
                               Access: Read/Write
                          Description: Deduplicated disk storage
                    Overflow Location: 
                Cache Migrated Files?: 
                           Collocate?: Group
                Reclamation Threshold: 100
            Offsite Reclamation Limit: 
      Maximum Scratch Volumes Allowed: 968
       Number of Scratch Volumes Used: 479
        Delay Period for Volume Reuse: 0 Day(s)
               Migration in Progress?: No
                 Amount Migrated (MB): 0
     Elapsed Migration Time (seconds): 0
             Reclamation in Progress?: No
       Last Update by (administrator): ADMIN
                Last Update Date/Time: 2014-04-29, 23:35:25
             Storage Pool Data Format: Native
                 Copy Storage Pool(s): 
                  Active Data Pool(s): 
              Continue Copy on Error?: Yes
                             CRC Data: No
                     Reclamation Type: Threshold
          Overwrite Data when Deleted: 
                    Deduplicate Data?: Yes
 Processes For Identifying Duplicates: 0
            Duplicate Data Not Stored: 240,528 M ( 1%)
                       Auto-copy Mode: Client
Contains Data Deduplicated by Client?: No
 
One way to possibly get around this issue is to remove the Identify Duplicates from the storage pool control and set it up as an administrative schedule with a limited time period for running. Give it a try and let us know how that goes.
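As a rough sketch, that could look like the following (the schedule name IDENTIFY_DEDUP is just an example; the Blueprint's DEDUPLICATE schedule and script pair already do essentially this):

Code:
/* keep identify processes in the pool definition at zero */
update stgpool DEDUPPOOL identifyprocess=0
/* start identify at 21:00 every day, capped at 720 minutes of run time */
define schedule IDENTIFY_DEDUP type=administrative active=yes starttime=21:00 period=1 perunits=days cmd="identify duplicates DEDUPPOOL numprocess=12 duration=720"

Note that the schedule's own DURATION (the "15 M" in your q sched output) is only the startup window; the duration=720 on the IDENTIFY DUPLICATES command itself is what is supposed to cap the run time.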
 
It is already done by the tsmconfig.pl script:

identify duplicates DEDUPPOOL numprocess=12 duration=720
 
It looks like I have some issue with deduplication performance, because backup, migration, or deletion of the deduplicated storage pool takes ages to complete.
 
Did you manage to solve this one?
I am having the same issue.
 
Most TSM processes, when cancelled (either because the duration has passed or via the CANCEL PROCESS command), will normally finish processing the current object. If that object is really large, the process can run several minutes longer than the duration.

You can see this when you cancel a process: it often takes several minutes before the process actually ends. The same thing applies when the duration is reached.
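You can watch that happen from the admin command line; the process number below is of course just a placeholder:

Code:
/* list running processes and note the number of the identify process */
q process
/* ask the server to stop it; it still finishes the current physical file first */
cancel process 123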
 
Hi, marcland, I am familiar with this.
In my case it is running for more than 7 hours!
 
7 hours is long. Are you at the latest fixpack for your current version? There are a few APARs for dedup and identify. If not, that should be the first step; why try to troubleshoot a problem that may already be fixed?
 
I have a 7.1.1.100 TSM server (the latest level of 7.1.1).
Which first step do you mean?
 