  1. #1
    Member
    Join Date
    May 2006
    Posts
    199
    Thanks
    3
    Thanked 6 Times in 5 Posts

    TSM Deduplication Limitations?

    I have heard some comments from local TSM folks that TSM 6 dedup is only usable on pools up to 5-6 TB in size. While I didn't get a lot of details, it sounded like they ended up CPU-bound.

    I am curious whether anyone here has had positive (or negative) experiences with TSM deduplication and large storage pools. We are looking at the possibility of replacing deduplicating VTLs with large disk pools. If TSM deduplication is usable on a large scale, it would be far less expensive and less complicated than the VTL route.

    To give you an idea of how large "large" is, we have some hosts with occupancy numbers in TSM in the 80-100 TB range, and nearly 4 PB of total occupancy in our TSM backup environment. On average we move 60-80 TB a day of backups to TSM.

    The server platform we are considering is the Sun x4540: 12 cores, 32 GB RAM, and 48 1.5 TB drives behind a ZFS file system. With this platform each server adds more CPU, RAM, and capacity to the environment, so the hope is that this will help scale up the CPU/memory/storage bandwidth needed for deduplication.

    Thanks,
    -Rowl

  2. #2
    Senior Member Jeff_Jeske's Avatar
    Join Date
    Jul 2006
    Location
    Stevens Point, WI
    Posts
    485
    Thanks
    2
    Thanked 0 Times in 0 Posts

    I can't answer your question. But I'm surprised you would want to put all your eggs in one basket (one big TSM server).

    We are not as big as you and we run Windows. Our approach to moving away from the VTL was to buy more, less expensive servers and spread out the processing load, bandwidth, and disaster exposure.

    We have a couple of SATA-only TSM servers. They work great, but when compared to the VTL there are additional exposures to consider.

    The OS has access to all TSM data. A virus or some form of corruption could whack both storage pools. Adding many LUNs to a server makes management of those LUNs very sensitive. Your server and TSM admins had best pay very good attention to detail. You should also think about the maximum number of LUNs your server will be happy with.

    The VTLs were designed for a specific mission and they do it very well. When moving to self-managed disk you'll need to design your own RAID, spindle count, LUN size, etc. You might not see the gains you expect to see.

  3. #3
    Senior Member javajockey's Avatar
    Join Date
    Dec 2007
    Location
    Yorktown
    Posts
    265
    Thanks
    0
    Thanked 0 Times in 0 Posts

    "The OS has access to all TSM data. A virus or some form of corruption could whack both storage pools."

    That was one of the deciding factors for management to go with AIX in my environment.

    Seriously, is anyone using TSM for deduping in large environments? I'm stuck with the same dilemma.

  4. #4
    Member
    Join Date
    Mar 2008
    Posts
    32
    Thanks
    0
    Thanked 0 Times in 0 Posts

    That 6 TB figure is how much backup data a TSM server could effectively dedupe per day. TSM keeps track of what data has already been deduped in the storage pool, so it won't have to scan the entire storage pool every day. You just need to keep in mind that processing 6 TB of backup data could take up to 18 hours (depending on the number of CPUs, disk speed, etc.). If your system isn't fast enough, you will never dedupe all the data coming in, and you could fall behind and use up too much storage space.

    TSM dedupe is probably realistic if your TSM server backs up 2-4 TB per day. But you also don't get dedupe across storage pools or TSM servers, whereas you would with one of the various dedupe appliances out there.
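
    To make the "fall behind" point concrete, here is a rough sketch of how a dedupe backlog builds up when nightly ingest exceeds what the server can dedupe in a day. The function name and the simple day-by-day model are my own assumptions for illustration, not anything TSM provides, and the 6 TB/day capacity is just the figure quoted above.

```python
def dedupe_backlog_tb(daily_ingest_tb, dedupe_capacity_tb_per_day, days):
    """Return the backlog of not-yet-deduplicated data (TB) after each day.

    Hypothetical model: every day the nightly backups land first, then the
    server dedupes as much of the backlog as its daily capacity allows.
    """
    backlog = 0.0
    history = []
    for _ in range(days):
        backlog += daily_ingest_tb  # nightly backups land in the pool
        backlog -= min(backlog, dedupe_capacity_tb_per_day)  # dedupe catches up
        history.append(backlog)
    return history


if __name__ == "__main__":
    # Ingest below the assumed 6 TB/day capacity: no backlog builds up.
    print(dedupe_backlog_tb(4, 6, 5))
    # Ingest above it: the backlog grows by the difference every day.
    print(dedupe_backlog_tb(8, 6, 5))
```

    With ingest under the daily capacity the backlog stays at zero. Once ingest goes over, the shortfall piles up day after day and never clears unless the load drops.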

  5. #5
    Senior Member javajockey's Avatar
    Join Date
    Dec 2007
    Location
    Yorktown
    Posts
    265
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Thanks for clearing that up, Canuck. So I guess that you could probably postpone the dedup process until the weekend. My servers are usually just running reclamation during that off time.

  6. #6
    Member
    Join Date
    Mar 2008
    Posts
    32
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Leaving it to the weekend could cost you quite a lot of storage though. With TSM dedupe you need that 'extra landing zone' of storage for the nightly backup. TSM then runs its dedupe process on the volumes (throwing out redundant chunks), and after dedupe and expiration it needs to run reclamation to recover the space. You don't actually lower your 'used' storage until it has run completely through the dedupe and reclamation processes.
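
    As a back-of-the-envelope sketch of the landing zone cost, the snippet below compares running dedupe nightly with deferring it for a week. The helper names and the 50% reduction figure are assumptions picked purely for illustration; actual reduction depends entirely on your data.

```python
def landing_zone_tb(daily_ingest_tb, days_between_dedupe_runs):
    """Raw (pre-dedupe) data that accumulates between dedupe/reclamation runs."""
    return daily_ingest_tb * days_between_dedupe_runs


def reclaimable_tb(landing_tb, assumed_reduction):
    """Space freed once dedupe and reclamation finish.

    assumed_reduction is a hypothetical fraction (0 to 1), not a measured value.
    """
    if not 0 <= assumed_reduction <= 1:
        raise ValueError("assumed_reduction must be between 0 and 1")
    return landing_tb * assumed_reduction


if __name__ == "__main__":
    nightly = landing_zone_tb(4, 1)  # dedupe every night
    weekly = landing_zone_tb(4, 7)   # dedupe only on the weekend
    print(nightly, weekly)
    print(reclaimable_tb(weekly, 0.5))  # assumed 50% reduction
```

    At 4 TB/day, deferring to the weekend means holding a full week of raw backups on disk before any of that space can come back.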

  7. #7
    Member
    Join Date
    Feb 2007
    Location
    Singapore
    Posts
    66
    Thanks
    0
    Thanked 0 Times in 0 Posts

    I have used de-dup in my environment for 4-5 TB of data.
    I am still fighting to get it to work properly.

    De-dup itself works fine, but it causes expiration to hang. When this happens, all other processes and backups hang as well. It appears to be related to a TSM server resource conflict.

    I opened a PMR and they are working on a diagnostic server for us.

    So if you plan to use this in production, please try it in a test environment first.

  8. #8
    Member
    Join Date
    May 2006
    Posts
    199
    Thanks
    3
    Thanked 6 Times in 5 Posts

    Thanks for all the feedback on this topic. To clarify one point, I am not looking to create one big TSM server. This platform is far too small for that, even with deduplication. One of the ongoing struggles has been scaling up CPU, memory, capacity, and I/O. The x4540 would be considered a building block, and we would roll out as many as needed to meet our capacity and throughput needs.

    Using 4 TB/day as a starting point, this sounds promising. On a per-TSM-server basis that is about the average I am measuring. Tape still exists here, so it may well be used as an overflow pool for the data stored on these servers' internal disk.
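
    For anyone sizing something similar, here is a rough sketch of the building-block arithmetic using the numbers from this thread (60-80 TB/day total ingest, about 4 TB/day per server, 48 x 1.5 TB drives per x4540). The function names are made up for illustration, and the capacity figure is raw disk only: it ignores ZFS redundancy, spares, and whatever reduction dedupe actually delivers.

```python
import math


def servers_for_throughput(daily_ingest_tb, per_server_tb_per_day):
    """Number of servers needed so each stays within its daily dedupe budget."""
    return math.ceil(daily_ingest_tb / per_server_tb_per_day)


def raw_capacity_tb(drives, drive_size_tb):
    """Raw disk per server, before any RAID/ZFS overhead (an assumption)."""
    return drives * drive_size_tb


if __name__ == "__main__":
    low = servers_for_throughput(60, 4)
    high = servers_for_throughput(80, 4)
    per_server = raw_capacity_tb(48, 1.5)
    print(low, high, per_server)
```

    On throughput alone that works out to somewhere between 15 and 20 servers, each with 72 TB of raw disk before any redundancy overhead.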
