Data retention and expiration

JensD

A few years ago I started this thread asking for advice on how to perform out-of-sync full backups without affecting MS SQL LSN (and the normal SQL backups taken with TDP for SQL).

I've now been running those monthly full backups for a little over a year; each month a new copy of the database is written to the same file as the previous month's copy.

Everything looked fine (and still does). I get regular backups that I have verified on a few occasions (fire drills and even once for real), and they are placed in the right management class.

However, I'm seeing that the expiration settings of the copygroup are not doing what I expect them to do.

The copygroup looks like this:
Code:
  Policy Domain Name: STANDARD
  Policy Set Name: STANDARD
  Mgmt Class Name: MONTHLY_FULL_SQL
  Copy Group Name: STANDARD
  Copy Group Type: Backup
  Versions Data Exists: No Limit
  Versions Data Deleted: No Limit
  Retain Extra Versions: 378
  Retain Only Version: 378
  Copy Mode: Modified
  Copy Serialization: Shared Static
  Copy Frequency: 0
  Copy Destination: FULLFILESQL_SEQ_BU
  Table of Contents (TOC) Destination:
  Last Update by (administrator): ADMIN
  Last Update Date/Time: 10/16/2014 10:02:02
  Managing profile:
  Changes Pending: No

The 378 days equals 54 weeks after which I want the oldest inactive version of the file to be expired and removed.

Looking at the available files (output from cmdline with -pick and -inactive) I get this:
Code:
TSM Scrollable PICK Window - Restore

  #  Backup Date/Time  File Size A/I  File
  ------------------------------------------------------------------------------------
  1. | 03/01/2015 02:41:36  254.67 GB  A  \\surveydb4\e$\FullSQLBackup\xact_HEAD-
  2. | 02/02/2015 01:20:30  247.04 GB  I  \\surveydb4\e$\FullSQLBackup\xact_HEAD-
  3. | 01/05/2015 01:24:26  238.49 GB  I  \\surveydb4\e$\FullSQLBackup\xact_HEAD-
  4. | 12/20/2014 01:16:43  236.99 GB  I  \\surveydb4\e$\FullSQLBackup\xact_HEAD-
  5. | 11/02/2014 01:08:42  217.50 GB  I  \\surveydb4\e$\FullSQLBackup\xact_HEAD-
  6. | 10/05/2014 01:16:42  210.00 GB  I  \\surveydb4\e$\FullSQLBackup\xact_HEAD-
  7. | 09/07/2014 02:18:04  204.45 GB  I  \\surveydb4\e$\FullSQLBackup\xact_HEAD-
  8. | 08/03/2014 01:11:57  199.18 GB  I  \\surveydb4\e$\FullSQLBackup\xact_HEAD-
  9. | 07/06/2014 02:11:15  197.81 GB  I  \\surveydb4\e$\FullSQLBackup\xact_HEAD-
 10. | 06/01/2014 02:02:07  194.66 GB  I  \\surveydb4\e$\FullSQLBackup\xact_HEAD-
 11. | 05/04/2014 02:06:04  193.69 GB  I  \\surveydb4\e$\FullSQLBackup\xact_HEAD-
 12. | 04/06/2014 01:22:13  207.45 GB  I  \\surveydb4\e$\FullSQLBackup\xact_HEAD-
 13. | 03/04/2014 01:09:06  229.87 GB  I  \\surveydb4\e$\FullSQLBackup\xact_HEAD-
 14. | 02/02/2014 01:40:11  225.50 GB  I  \\surveydb4\e$\FullSQLBackup\xact_HEAD-

Note that the full filename has been cut short - but it's the same filename for all items in the list.

The above output is from today March 11th 2015.

As I understand the settings in the copygroup, the 14th file should have expired and been removed a while back, since there are 402 days between February 2nd 2014 and today.

I'd also expect the 13th file to be removed 7 days from now - from March 4th 2014 to now we have 372 days.
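
The day counts above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic under the reading used in this post, where a version's age is counted from its backup date; the variable names are made up for illustration.

```python
from datetime import date

RETAIN_EXTRA_DAYS = 378      # Retain Extra Versions from the copygroup above
today = date(2015, 3, 11)    # date the pick list was taken

# Backup dates of the two oldest versions in the pick list (items 14 and 13)
backups = {14: date(2014, 2, 2), 13: date(2014, 3, 4)}

# Age in days, counted from the backup date (assumption of this post)
ages = {item: (today - taken).days for item, taken in backups.items()}

for item, age in ages.items():
    print(f"item {item}: {age} days old, {age - RETAIN_EXTRA_DAYS:+d} days relative to the limit")
```

With these dates item 14 comes out at 402 days and item 13 at 372 days.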

What setting in the copygroup should I change for this to happen, but still retain all inactive versions of an existing file within the last 378 days?

I'm not interested in what happens when a file in this copygroup has been deleted from the client node - we handle archival by other means than this copygroup and management class.
 
Your config looks good. Is the expiration running to completion daily?
 
Hi

Yes, expire inventory runs daily and completes - here's the end of latest one from earlier this morning:
Code:
ANR2753I (DAGLIGE_RUTINER):ANR2577I Schedule DAILY-RUNNING defined. ~
ANR2017I Administrator ADMIN issued command: EXPIRE INVENTORY WAIT=YES ~
ANR0984I Process 17235 for EXPIRE INVENTORY started in the FOREGROUND at 04:45:04 AM.~
ANR0811I Inventory client file expiration started as process 17235.~
ANR0812I Inventory file expiration process 17235 completed: processed 19 nodes, examined 308122 objects, deleting 308122 backup objects, 0 archive objects, 0 DB backup volumes, and 0 recovery plan files. 0 objects were retried and 0 errors were encountered.~
ANR2753I (DAGLIGE_RUTINER):ANR0167I Inventory file expiration~
ANR0987I Process 17235 for EXPIRE INVENTORY running in the FOREGROUND processed 308,122 items with a completion state of SUCCESS at 04:50:38 AM.~

A manual expire of the node where the data resides gives this:
Code:
ANR2017I Administrator ADMIN issued command: EXPIRE INVENTORY node=surveydb4_ba ~
ANR0984I Process 17266 for EXPIRE INVENTORY started in the BACKGROUND at 02:48:29 PM.~
ANR0811I Inventory client file expiration started as process 17266.~
ANR0167I Inventory file expiration process 17266 processed for 0 minutes.~
ANR0812I Inventory file expiration process 17266 completed: processed 1 nodes, examined 0 objects, deleting 0 backup objects, 0 archive objects, 0 DB backup volumes, and 0 recovery plan files. 0 objects were retried and 0 errors were encountered.~
ANR0985I Process 17266 for EXPIRE INVENTORY running in the BACKGROUND completed with completion state SUCCESS at 02:48:30 PM.~

However, when I look at what filespaces the daily expire looks at for the node something is off.

The node has these filespaces:
Code:
tsm: TSMSRV1.x.x.x.x>q filesp surveydb4_ba

Node Name  FSID  Filespace Name
--------------- --  -----------
SURVEYDB4_BA  1  SURVEYDB4\SystemState\NULL\SystemState\SystemState
SURVEYDB4_BA  2  \\surveydb4\e$
SURVEYDB4_BA  3  \\surveydb4\c$
SURVEYDB4_BA  4  \\surveydb4\d$

But when grepping (via $ tail -n 20000 events.log | grep SURVEYDB4_BA | grep filespace) for the filespaces processed during the last expiration (the same run that completed above), I can only see the c$ and SystemState filespaces mentioned:
Code:
ANR0165I Inventory file expiration started processing for node SURVEYDB4_BA, filespace \\surveydb4\c$, copygroup BACKUP and object type FILE.~
ANR0166I Inventory file expiration finished processing for node SURVEYDB4_BA, filespace \\surveydb4\c$, copygroup BACKUP and object type FILE with processing statistics: examined 24, deleted 24, retrying 0, and failed 0.~
ANR0165I Inventory file expiration started processing for node SURVEYDB4_BA, filespace \\surveydb4\c$, copygroup BACKUP and object type DIRECTORY.~
ANR0166I Inventory file expiration finished processing for node SURVEYDB4_BA, filespace \\surveydb4\c$, copygroup BACKUP and object type DIRECTORY with processing statistics: examined 4, deleted 4, retrying 0, and failed 0.~
ANR0165I Inventory file expiration started processing for node SURVEYDB4_BA, filespace SURVEYDB4\SystemState\NULL\System State\SystemState, copygroup BACKUP and object type GROUP BASE.~
ANR0166I Inventory file expiration finished processing for node SURVEYDB4_BA, filespace SURVEYDB4\SystemState\NULL\System State\SystemState, copygroup BACKUP and object type GROUP BASE with processing statistics: examined 84031, deleted 84031, retrying 0, and failed 0.~

If I extend the number of lines in the tail command I only see the previous day's expire inventory, which also only shows the same filespaces..
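
For anyone wanting to repeat this check across several nodes, here is a rough Python sketch that does the same job as the tail/grep pipeline: it totals the "examined" and "deleted" counts per node and filespace from ANR0166I lines. It assumes each message sits on a single line, as in the events.log excerpt above; the function name and sample list are made up for illustration.

```python
import re
from collections import defaultdict

# Matches the ANR0166I "finished processing" message as it appears above.
PATTERN = re.compile(
    r"ANR0166I Inventory file expiration finished processing for node "
    r"(?P<node>[^,\s]+), filespace (?P<fs>.+?), copygroup \S+ and object type "
    r".+? with processing statistics: examined (?P<examined>\d+), "
    r"deleted (?P<deleted>\d+)"
)

def summarize(lines):
    """Return {(node, filespace): (examined, deleted)} summed over all lines."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        match = PATTERN.search(line)
        if match:
            key = (match["node"], match["fs"])
            totals[key][0] += int(match["examined"])
            totals[key][1] += int(match["deleted"])
    return {key: tuple(counts) for key, counts in totals.items()}

# Two of the log lines quoted above, used as sample input
sample = [
    r"ANR0166I Inventory file expiration finished processing for node SURVEYDB4_BA, filespace \\surveydb4\c$, copygroup BACKUP and object type FILE with processing statistics: examined 24, deleted 24, retrying 0, and failed 0.~",
    r"ANR0166I Inventory file expiration finished processing for node SURVEYDB4_BA, filespace \\surveydb4\c$, copygroup BACKUP and object type DIRECTORY with processing statistics: examined 4, deleted 4, retrying 0, and failed 0.~",
]
summary = summarize(sample)
for (node, fs), (examined, deleted) in summary.items():
    print(node, fs, examined, deleted)
```

A filespace that never shows up in the result has no ANR0166I line in the input, which is the situation described here for d$ and e$.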
 
It's intriguing that it's not processing D$. Can you just try:
Code:
query actlog search='\\surveydb4\d$' begindate=-5
 
Yes, rather worrying now that I see it (the data I want expired resides on e$).

Here's the output requested - as well as the same search for e$:
Code:
tsm: TSMSRV1.x.x.x.x>query actlog search='\\surveydb4\d$' begindate=-5

Date/Time  Message
--------------------  ----------------------------------------------------------
03/11/2015 17:53:19  ANR2017I Administrator ADMIN issued command: QUERY ACTLOG
  search=\\surveydb4\d$ begindate=-5  (SESSION: 891116)

tsm: TSMSRV1.x.x.x.x>query actlog search='\\surveydb4\e$' begindate=-5

Date/Time  Message
--------------------  ----------------------------------------------------------
03/11/2015 17:53:33  ANR2017I Administrator ADMIN issued command: QUERY ACTLOG
  search=\\surveydb4\e$ begindate=-5  (SESSION: 891116)

I looked a bit further back, and found this restore of an inactive version of the file from e$ as part of a firedrill:
Code:
02/24/2015 14:08:41  ANR0504I Session 798110 for node SURVEYDB4_BA(Userid=),
  restored or retrieved Backup object: node SURVEYDB4_BA
  filespace \\surveydb4\e$, object
  \FULLSQLBACKUP\XACT_HEAD-COMPLETE-SQLAGENT.SQLBACKUP,
  version 13 of 14. (SESSION: 798110)

Looking at the client's log file I find this:
Code:
Executing scheduled command now.
03/11/2015 01:00:05 Node Name: SURVEYDB4_BA
03/11/2015 01:00:05 Session established with server TSMSRV1.x.x.x.x: Linux/x86_64
03/11/2015 01:00:05  Server Version 6, Release 3, Level 2.200
03/11/2015 01:00:05  Server date/time: 03/11/2015 01:00:13  Last access: 03/11/2015 00:23:11

03/11/2015 01:00:05 --- SCHEDULEREC OBJECT BEGIN DAILY_INCR_BACK 03/11/2015 01:00:00
03/11/2015 01:00:05 Incremental backup of volume 'SYSTEMSTATE'
...
03/11/2015 01:09:31
SystemState Backup finished successfully.

03/11/2015 01:09:31 Incremental backup of volume '\\surveydb4\c$'
03/11/2015 01:09:31 Incremental backup of volume '\\surveydb4\d$'
03/11/2015 01:09:31 Incremental backup of volume '\\surveydb4\e$'
03/11/2015 01:09:31 ANS1898I ***** Processed  84,000 files *****
03/11/2015 01:09:42 Successful incremental backup of '\\surveydb4\e$'
...
03/11/2015 01:10:27 Successful incremental backup of '\\surveydb4\c$'
03/11/2015 01:10:29 Successful incremental backup of '\\surveydb4\d$'

03/11/2015 01:10:31 --- SCHEDULEREC STATUS BEGIN
...
03/11/2015 01:10:31 Elapsed processing time:  00:10:25
...
03/11/2015 01:10:31 --- SCHEDULEREC STATUS END
03/11/2015 01:10:31 --- SCHEDULEREC OBJECT END DAILY_INCR_BACK 03/11/2015 01:00:00
03/11/2015 01:10:31 Scheduled event 'DAILY_INCR_BACK' completed successfully.
03/11/2015 01:10:31 Sending results for scheduled event 'DAILY_INCR_BACK'.
03/11/2015 01:10:31 Results sent to server for scheduled event 'DAILY_INCR_BACK'.

The options file has "DOMAIN ALL-LOCAL" in it and according to the log everything looks fine from that side..
 
Does no one have any idea what's going on here?

We have 5 Windows-based BA clients, ~15 other Unix-based nodes and 1 TDP for SQL based node.
4 of those 5 Windows-based nodes have drives other than the C: system drive - nodes DEVDB3_BA, SURVEYDB3_BA, SURVEYDB4_BA, SURVEYDB5_BA.
Only 1 of those 4 has logged expiration processing for drives other than C: - that's the DEVDB3_BA node.

The options files are almost exactly the same apart from a few local exclusions and includes, and the BA clients are almost the same version (see below).

If I query occupancy, all 4 clients have data from drives other than the system drive stored in TSM - here formatted a bit, with details for the E: drive across all nodes:
Code:
tsm: TSMSRV1.x.x.x>q occ <node> <FSID> nametype=fsid

Node          Type  Filespace    FSID  STGP                 # files    Physical OCC   Logical OCC
----------    ----  ----------   ----  -------------------  --------   -------------  -----------
DEVDB3_BA     Bkup  \\devdb3\e$     4  DISK01_SEQ_BU        396        0.26           0.26
DEVDB3_BA     Bkup  \\devdb3\e$     4  DISK02_SEQ_BU        2          0.00           0.00
DEVDB3_BA     Bkup  \\devdb3\e$     4  FULLFILESQL_SEQ_BU   215        0.14           0.13
DEVDB3_BA     Bkup  \\devdb3\e$     4  TAPE02_OFFSITE       613        0.40           0.40
SURVEYDB3_BA  Bkup  \\surveydb3\e$  7  DISK01_SEQ_BU        9          0.01           0.01
SURVEYDB3_BA  Bkup  \\surveydb3\e$  7  TAPE02_OFFSITE       9          0.01           0.01
SURVEYDB4_BA  Bkup  \\surveydb4\e$  2  DISK01_SEQ_BU        83         25,437.82      25,437.82
SURVEYDB4_BA  Bkup  \\surveydb4\e$  2  FULLFILESQL_SEQ_BU   56         3,274,796.2    3,274,796.2
SURVEYDB4_BA  Bkup  \\surveydb4\e$  2  TAPE02_OFFSITE       139        3,300,234.0    3,300,234.0
SURVEYDB5_BA  Bkup  \\surveydb5\e$  7  FULLFILESQL_SEQ_BU   13         0.01           0.01
SURVEYDB5_BA  Bkup  \\surveydb5\e$  7  TAPE02_OFFSITE       13         0.01           0.01

Versions:
Server: Linux/x86_64 - Version 6, Release 3, Level 2.200

BA clients:
DEVDB3_BA: Version 6, release 3, level 0.0
SURVEYDB3_BA: Version 6, release 3, level 0.0
SURVEYDB4_BA: Version 6, release 3, level 0.0
SURVEYDB5_BA: Version 7, release 1, level 1.3

I'm starting to think I need to get hold of IBM...
 
Yes, get hold of IBM - they have the tools to dig deeper (tools on the DB2 side) to determine what is going on. Who knows, this may be another bug that can be resolved with a later level of the TSM Server, probably 6.3.4 or 6.4.x, with the clients updated to 6.3.2 or 6.4.2.
 
Just as you wrote that I clicked "Create new service request".. :)

I'll report back what I find out - hopefully..
 
Oh dear..
It seems that at least I've had a slight misunderstanding of the Retain Extra Versions field..

The documentation clearly states:

Code:
Retain Extra Versions
  The number of days to retain a backup version after that
  version becomes inactive.

.. the key word here is AFTER.

I was given this explanation:
Code:
So in your case the backup from 02/02/2014 didn't become inactive until
03/04/2014. It will then expire 378 days later. I would therefore not
expect it to expire until the 17th of this month. (ie tomorrow - if my
maths is correct!)

.. and lo and behold this morning I could find this in the activity log:
Code:
03/17/2015 04:45:27  ANR0166I Inventory file expiration finished processing for
  node SURVEYDB4_BA, filespace \\surveydb4\e$, copygroup
  BACKUP and object type FILE with processing statistics:
  examined 1, deleted 1, retrying 0, and failed 0.
  (SESSION: 924593, PROCESS: 17340)

Doing a dsmc restore -pick -inactive no longer shows the version from 02/02/2014 - which is what I expected.

It seems I need to tweak the settings for the management class and remove 28 days from RETExtra - so older versions are expired a month earlier.
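
To sanity-check the new value, here is a small Python sketch of the rule as I now understand it: a version becomes inactive when the next backup of the same file arrives, and it can expire RETExtra days after that. The function name is made up, it works on whole dates only, and 350 is simply 378 minus 28; treat it as a rough model, not as how the server actually computes expiration.

```python
from datetime import date, timedelta

def expiry_dates(backup_dates, retain_extra_days):
    """Map each inactive version's backup date to its earliest expiry date.

    Assumption: a version goes inactive on the date of the next backup and
    is eligible to expire retain_extra_days after that. The newest version
    is still active and therefore gets no entry.
    """
    ordered = sorted(backup_dates)
    return {
        older: newer + timedelta(days=retain_extra_days)
        for older, newer in zip(ordered, ordered[1:])
    }

# The three oldest backups from the pick list
backups = [date(2014, 2, 2), date(2014, 3, 4), date(2014, 4, 6)]

current = expiry_dates(backups, 378)  # existing RETExtra
tweaked = expiry_dates(backups, 350)  # 378 - 28, the planned change

for taken in sorted(current):
    print(taken, "expires", current[taken], "now,", tweaked[taken], "with RETExtra=350")
```

With RETExtra at 378 the 02/02/2014 version comes out at 03/17/2015, matching the date IBM gave and the activity log entry above.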


However, this still leaves the question of no expiration process being run for certain filesystems - I'm still waiting for a reply to that.
 