I'm running a TSM 6.2 server. Daily backups go to a disk storage pool. In addition, we run an archive at the end of each month, retaining the archived files for 190 days. Currently the archive goes directly to LTO2 tapes, but I'm thinking of using disk for the archive as well. Much of the archived data is identical from one month to the next, so this seems like a good candidate for deduplication, which I haven't used until now.
While reading the TSM 6.2 documentation about deduplication, I found this:
By default, primary sequential-access storage pools that are set up for data deduplication must be backed up to a copy storage pool before they can be reclaimed and duplicate data can be removed. To minimize the potential of data loss, do not change the default setting.
To protect the data in primary storage pools, issue the BACKUP STGPOOL command to copy the data to copy storage pools. Ensure that the copy storage pools are not configured for data deduplication. During storage pool backup to a non-deduplicated storage pool, server-side and client side extents are reassembled into contiguous files.
Erm... what? In order to have a deduplicated storage pool, I need to have another storage pool with the same data in non-deduplicated form? Surely I must be misunderstanding something, because this seems to defeat any possible space savings from deduplication. Can someone please explain this?