Re: [ADSM-L] Deduplication with TSM.

Storage capacity alone is not enough information to size a system.
The number of client sessions is what determines system size.

If you have four clients sending 1 GB per night, you could run 8GB of RAM
and a 2GHz Core2 and be okay.

Realistically, 32GB per instance is good.  db2sysc will use about 20GB per
instance if it's available.
This handles a couple hundred clients, deduplication, etc.

As for processors, the minimum is one processor for every 2 DB directories,
plus one processor for TSM internals.
If you will have high I/O, then add one hardware thread for every IDENTIFY
DUPLICATES process.
If you will use client-side dedupe most of the time, then you only need
IDENTIFY DUPLICATES for data that is moved into the pool on the server side.
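
The RAM and processor rules of thumb above can be put into a rough
calculator. This is only a sketch: the function names are made up for
illustration, rounding an odd number of DB directories up is my assumption,
and it counts "processors" and "hardware threads" as the same unit, which
is a simplification.

```python
import math


def min_processors(db_directories: int, identify_processes: int = 0) -> int:
    """Rough minimum processor count from the rules of thumb above.

    Hypothetical helper, not part of any TSM tooling.
    """
    if db_directories < 1 or identify_processes < 0:
        raise ValueError("need at least one DB directory and no negative counts")
    # One processor for every 2 DB directories (rounding up is an assumption).
    db_cpus = math.ceil(db_directories / 2)
    # Plus one processor for TSM internals.
    tsm_internal = 1
    # Plus one hardware thread per IDENTIFY DUPLICATES process (high I/O case).
    return db_cpus + tsm_internal + identify_processes


def suggested_ram_gb(instances: int, per_instance_gb: int = 32) -> int:
    """32GB per instance, per the 'realistic' figure above."""
    return instances * per_instance_gb


if __name__ == "__main__":
    # Example: 4 DB directories, 2 IDENTIFY DUPLICATES processes, 1 instance.
    print(min_processors(4, 2), "processors,", suggested_ram_gb(1), "GB RAM")
```

Treat the result as a floor, not a target; session count and I/O load will
push it higher.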

Higher GHz matters for the identify processes, though branch prediction is
still important (POWER5 or POWER7 are better than POWER6)
Higher hardware thread counts matter for client session responsiveness
(POWER7 uses fewer cores than POWER6/POWER5)
Higher I/O backplane matters for amount of raw data coming in (Low end
POWER wins over low-end Intel)
Higher IOPS for the DB volumes are necessary to keep clients from slowing
down.  (hash compares)
Lower network latency matters for client performance (hash compares)

With friendly Regards,
Josh-Daniel S. Davis
OmniTech Industries




On Thu, Apr 26, 2012 at 8:43 AM, Francisco Molero <fmolero AT yahoo DOT com> 
wrote:

> Hi colleagues,
>
>
> I am going to implement a very big disk pool with dedup, around 100 TB: a TSM
> disk storage pool (neither VTLs nor DataDomain). Does anybody know what TSM
> server I need (RAM and CPU), or what ratio I can hope for? I am thinking
> about source dedup.
>
>
> Any experiences?
>
>
> Thanks..
>