Re: [ADSM-L] Stupid question about TSM server-side dedup

Wanda,

when id dup finds duplicate chunks in the same storagepool, it will
raise the pct_reclaim
value for the volume it is working on.  If the pct_reclaim isn't going
up, that means there
are no duplicate chunks being found.  Id dup is still chunking the
backups up (watch you database grow!)
but all the chunks are unique.

Is it possible that the ndmp agent in the storage appliance is putting
in unique metadata with each file?
This would make every backup appear to be unique in chunk-speak.

I remember from the v6 beta that the standard v6 clients were enhanced
so that the metadata could
be better identified by id dup and skipped over so that it could just
work on the files and get
better dedup ratios.  If id dup doesn't know how to skip over the
metadata in an ndmp stream, and
the metadata is always changed, then you will get very low dedup ratios.

If you do a 'q pr' while the id dup is running, do the processes say
they are finding duplicates?

Bill Colwell
Draper Lab

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of
Prather, Wanda
Sent: Monday, November 21, 2011 11:41 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Stupid question about TSM server-side dedup

Have a customer would like to go all disk backups using TSM dedup.  This
would be a benefit to them in several respects, not the least in having
the ability to replicate to another TSM server using the features in
6.3.

The customer has a requirement to keep their NDMP dumps 6 months.  (I
know that's not desirable, but the backup group has no choice in the
matter right now, it's imposed by a higher level of management.)

The NDMP dumps come via TCP/IP into a regular TSM sequential filepool.
They should dedup like crazy, but client-side dedup is not an option (as
there is no client).

So here's the question.  NDMP backups come into the filepool and
identify duplicates is running.  But because of those long retention
times, all the volumes in the filepool are FULL, but 0% reclaimable, and
they will continue to be that way for 6 months, as no dumps will expire
until then.  Since the dedup occurs as part of reclaim, and the volumes
won't reclaim -how do we "prime the pump" and get this data to dedup?
Should we do a few MOVE DATAs to get the volumes partially empty?


Wanda Prather  |  Senior Technical Specialist  |
wprather AT icfi DOT com<mailto:wprather AT icfi DOT com>  |
www.icf.com<http://www.icf.com>
ICF International  | 401 E. Pratt St, Suite 2214, Baltimore, MD 21202 |
410.539.1135 (o)
Connect with us on social media<http://www.icfi.com/social>