[ADSM-L] DATA Corruption using Deduplication in TSM 6.3.1.1 WARNING

2012-05-29 14:51:41
Subject: [ADSM-L] DATA Corruption using Deduplication in TSM 6.3.1.1 WARNING
From: Ray Carlson <rlcarlson AT ANL DOT GOV>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Tue, 29 May 2012 13:46:09 -0500
Or as IBM called it, "Orphaned deduplicate references".  

We are running TSM 6.3.1.1 on a Windows 2008 server, and using the IDENTIFY 
DUPLICATES command to do deduplication on the server, not the client.

Interestingly, everything seemed to be mostly working.  We had a few volumes 
that could not be reclaimed or moved because the server said the deduplicated 
data had not been backed up to the copy pool, but that was just an annoyance.  

Then we discovered that we could not restore various servers.  The error we 
got was: 
"05/21/2012 20:52:45 ANR9999D_2547000324 bfRtrv(bfrtrv.c:1161) Thread<129>: 
Error 9999 obtaining deduplication information for object 254560532 in super 
bitfile 664355697 in pool 7 (SESSION: 8235, PROCESS: 375)".
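
For anyone who wants to check their own activity log for the same message, 
here is a minimal Python sketch that pulls the object, super bitfile and pool 
numbers out of log text.  It assumes the message is worded exactly like the 
one quoted above (other server levels may differ), and that you have the log 
saved as plain text, for example from QUERY ACTLOG output.  The names 
DEDUP_ERROR and find_dedup_errors are made up for this example and are not 
part of TSM.

```python
import re

# Pattern modeled on the single ANR9999D message quoted above.
# Assumption: other server levels may word the message differently.
DEDUP_ERROR = re.compile(
    r"Error\s+(?P<code>\d+)\s+obtaining\s+deduplication\s+information\s+"
    r"for\s+object\s+(?P<object_id>\d+)\s+in\s+super\s+bitfile\s+"
    r"(?P<super_bitfile>\d+)\s+in\s+pool\s+(?P<pool>\d+)"
)


def find_dedup_errors(log_text):
    """Return one dict of integers per matching error found in log_text."""
    # Collapse all whitespace so messages wrapped across lines still match.
    flat = " ".join(log_text.split())
    return [
        {name: int(value) for name, value in match.groupdict().items()}
        for match in DEDUP_ERROR.finditer(flat)
    ]


if __name__ == "__main__":
    sample = (
        "05/21/2012 20:52:45 ANR9999D_2547000324 bfRtrv(bfrtrv.c:1161) "
        "Thread<129>:\n"
        "Error 9999 obtaining deduplication information for object "
        "254560532 in super\n"
        "bitfile 664355697 in pool 7 (SESSION: 8235, PROCESS: 375)"
    )
    for hit in find_dedup_errors(sample):
        print(hit)
```

This only finds errors that have already been logged, so it will not tell you 
which data is still restorable.  It just makes it easier to see which objects 
and pools have shown up in failures so far.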

A Severity 1 trouble ticket was opened with IBM back on 5/21 and various 
information was gathered and provided to IBM.  So far IBM has not been able to 
identify the root cause or provide a fix.  They have transferred the ticket to 
the Development team.

So here I sit, not knowing which servers, if any, I could restore if needed.  
Unfortunately, most operations appear to be fine and report Success.  Only when 
I try to do a Generate Backupset, or do a Restore, do I discover that there is 
a problem and the job fails.  Also, the restore doesn't just skip the file or 
files that it can't restore and restore everything else; it simply stops and 
says it failed.

I'm wondering how many other people are in the same situation, but do not 
realize it.  

BEWARE Deduplication 

Ray Carlson