  1. #1
    Member
    Join Date
    Jun 2003
    Posts
    43
    Thanks
    0
    Thanked 0 Times in 0 Posts

    Data dedupe and offsite copies reclamation problem

    Hi

    This is more of a "heads up" than a request for a solution, as IBM has looked into my problem extensively and replied that "everything works as designed and expected." (See the whole reply below.)

    Background:
    We have disk-based file pools for most of our backups, and these pools are deduplicated.
    Copies of these are then made to LTO4 tape and sent offsite daily.

    The problem:
    Space reclamation of the offsite copy pool is VERY slow. (Tape-to-tape reclamations are quick.)
    In a 5-hour period only around 60-90 GB of data is reclaimed.
    The reclamation threshold used to be set to 70%, and reclamation would finish in the 5-hour window we had.
    Now reclamation only finishes in that window with the threshold set to 95%.
    This has of course wreaked havoc with the tape cycles.
    We had to almost double the number of tapes in the offsite copy pool in order to have enough vault retrieves coming back to be used for the next cycle of copies.


    Below is IBM's response after weeks of troubleshooting:
    The development team reviewed all the server performance traces we gathered in May, ran tests on their machines, and reviewed the offsite reclamation code.
    From the TSM development point of view, there is nothing else they can do, as no defect was found and everything works as designed and expected.

  2. #2
    Member
    Join Date
    Jun 2004
    Location
    Kansas
    Posts
    204
    Thanks
    0
    Thanked 0 Times in 0 Posts


    Out of curiosity: what type of storage are your disk-based file pools on? We have a NetApp device, and moving data from one location on the NetApp to another is very slow.

  3. #3
    Member
    Join Date
    Jun 2003
    Posts
    43
    Thanks
    0
    Thanked 0 Times in 0 Posts


    Quote (originally posted by fuzzballer):
    Out of curiosity: what type of storage are your disk-based file pools on? We have a NetApp device, and moving data from one location on the NetApp to another is very slow.

    Our dedupe pools are all on a DS3400 with SATA disks.
    The TSM DB is on a DS4700 with Fibre Channel disks.
    When you move data from a dedupe pool to a non-dedupe pool, TSM has to reassemble (rebuild) each file. This is where the bottleneck is. If you watch what TSM does when it copies the data, you'll see a lot of TSM DB activity while it determines where the pieces are, and then some activity on the disk pool when it copies the file.
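As a rough illustration of why this reassembly step leans so heavily on the database, here is a toy model in Python. It is not TSM's actual implementation: the tiny chunk size, the in-memory `chunk_index` dictionary standing in for the server database, and the function names are all assumptions made for the sketch. The only point is that rebuilding a file costs one index lookup per chunk before any data can be copied out.

```python
import hashlib

CHUNK_SIZE = 4  # hypothetical, tiny so the example is easy to follow


def store(data: bytes, chunk_index: dict) -> list:
    """Split data into chunks, keep each unique chunk once, return the recipe."""
    recipe = []
    for i in range(0, len(data), CHUNK_SIZE):
        chunk = data[i:i + CHUNK_SIZE]
        digest = hashlib.sha1(chunk).hexdigest()
        chunk_index.setdefault(digest, chunk)  # duplicate chunks are stored only once
        recipe.append(digest)
    return recipe


def reassemble(recipe: list, chunk_index: dict):
    """Rebuild the original bytes and count the index lookups this takes."""
    lookups = 0
    parts = []
    for digest in recipe:
        parts.append(chunk_index[digest])  # one lookup per chunk in the file
        lookups += 1
    return b"".join(parts), lookups


index = {}
original = b"AAAABBBBAAAACCCCAAAA"
recipe = store(original, index)
rebuilt, lookups = reassemble(recipe, index)
print(len(index), "unique chunks,", lookups, "lookups")  # 3 unique chunks, 5 lookups
```

In a real server the lookups would hit database pages on disk rather than a dictionary, which is presumably where the random-read load comes from.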

  4. #4
    Member
    Join Date
    Mar 2009
    Location
    Zagreb, Croatia
    Posts
    65
    Thanks
    0
    Thanked 8 Times in 3 Posts


    Same here.

    I'm getting similar numbers... yesterday's reclamation: 120GB reclaimed in 4 hours on the following hardware:

    TSM 6.2.3 on RHEL 5.6
    16 GB RAM
    DB2 on 300GB 15k SAS disks
    2TB 7.2k SATA disks for storage pools
    offsite copy storage pool on LTO-5 library with 2 drives
    active data storage pool as virtual volume on a remote server

    That's roughly 8.5 MB/s... not exactly what I expected from $50k worth of hardware.
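For reference, the throughput figures quoted in this thread can be checked with a few lines of Python. The only assumption is the unit convention (1 GB = 1024 MB); the helper name is made up for the example.

```python
def throughput_mb_per_s(gigabytes: float, hours: float) -> float:
    """Average throughput in MB/s, assuming 1 GB = 1024 MB."""
    return gigabytes * 1024 / (hours * 3600)


# Figures taken from the posts above.
cases = [
    ("120 GB in 4 hours", 120, 4),
    ("60 GB in 5 hours", 60, 5),
    ("90 GB in 5 hours", 90, 5),
]
for label, gb, hours in cases:
    print(f"{label}: {throughput_mb_per_s(gb, hours):.1f} MB/s")
# 120 GB in 4 hours: 8.5 MB/s
# 60 GB in 5 hours: 3.4 MB/s
# 90 GB in 5 hours: 5.1 MB/s
```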

    I've found that any data movement operation from deduplicated storage pools to non-deduplicated ones runs ridiculously slow. It hogs the database disks with 8k random reads. Generating backup sets and exporting data runs slowly as well.

    Maybe putting the DB on a fast mirrored SSD would make deduplication more usable. I wish I had anticipated this... now we're using about 2x more tapes than I thought we'd be using.

    Here's what I did to make offsite reclamation somewhat manageable: I set up a script that starts reclamation with threshold 99 and offsitereclaimlimit=1, waits until it finishes, then starts a new reclamation process with threshold 98, and so on. That way only one tape at a time is reclaimed. If you simply specify a threshold of 50 and several volumes satisfy that criterion, it may well happen that none of them gets fully reclaimed.
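A script along those lines might look like the sketch below. This is not the poster's actual script. The pool name `OFFSITE_COPY`, the lower threshold bound, and the admin credentials are placeholders, and the exact spelling of the `RECLAIM STGPOOL` parameters and `dsmadmc` options should be checked against the Administrator's Reference for your server level before relying on it.

```python
import subprocess


def reclaim_commands(pool: str, start: int = 99, stop: int = 90) -> list:
    """Build one RECLAIM STGPOOL command per threshold, from start down to stop."""
    return [
        f"RECLAIM STGPOOL {pool} THRESHOLD={t} OFFSITERECLAIMLIMIT=1 WAIT=YES"
        for t in range(start, stop - 1, -1)
    ]


def run_all(commands: list, admin_id: str, password: str) -> None:
    """Run the commands one after another through the dsmadmc admin client.

    WAIT=YES is assumed to make each call block until that pass has finished.
    """
    for cmd in commands:
        subprocess.run(
            ["dsmadmc", f"-id={admin_id}", f"-password={password}", cmd],
            check=False,  # keep going even if one pass reports a non-zero return code
        )


# Placeholder pool name; only the command strings are printed here.
commands = reclaim_commands("OFFSITE_COPY", start=99, stop=97)
for c in commands:
    print(c)
```

Calling `run_all(commands, "admin", "secret")` (placeholder credentials) would then work through the thresholds in sequence, one volume per pass.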

    Also, upgrade to 6.2.3 and use the new server options to disable DB2 reorgs while the reclamation process runs. This helped a bit too. I actually disabled reorgs altogether and do them manually while the server is stopped. Yeah, nuke the site from orbit... it's the only way to be sure.

    In conclusion, I wouldn't recommend mixing deduped and non-deduped storage pools such as tape unless you are working with small data sets (a few TB) or you have some REALLY fancy hardware (multiple SSDs for DB? RAM size greater than DB size? RAMSAN?).

    I'm still hoping that IBM forgot to put a proper index on some table (or put too many) and that they'll fix this in some future fixpack. Until then... I'm considering turning off deduplication and using tapes as a second-tier primary storage.

    P.S.
    Another disappointment for deduplication: it turns out that deduplication of virtual volumes works nowhere near as well as deduplication of files... my primary storage pool had about 60% space saved, while on the remote virtual volumes I only had about 10-15% space saved. I'm guessing this is because files have different offsets when stored in virtual volumes... dedup didn't pick up much of what could be deduplicated. I had to give up on a remote copy pool, and now I'm using an active-data pool with a grace period of 7 days... so now it's like a copy pool with different retention. Meh.

    P.P.S.
    "...everything works as designed and expected."
    Um... "In retrospect, our design sucks a bit. Sorry about that."?

  5. #5
    Senior Member rore
    Join Date
    Nov 2005
    Location
    Montreal, CA
    Posts
    637
    Thanks
    0
    Thanked 4 Times in 4 Posts


    Quote (originally posted by terlisimo):
    P.S.
    Another disappointment for deduplication: it turns out that deduplication of virtual volumes works nowhere near as well as deduplication of files... my primary storage pool had about 60% space saved, while on the remote virtual volumes I only had about 10-15% space saved. I'm guessing this is because files have different offsets when stored in virtual volumes... dedup didn't pick up much of what could be deduplicated. I had to give up on a remote copy pool, and now I'm using an active-data pool with a grace period of 7 days... so now it's like a copy pool with different retention. Meh.

    P.P.S.
    Um... "In retrospect, our design sucks a bit. Sorry about that."?

    I had the same issue. Here is IBM support's statement about it:

    For a normal backup, when the client files are the same, the same data is stored on the server, so we see a high dedup ratio. A virtual volume, however, is actually stored as an archive file on the target server, so that archive file contains all the data for a volume. A volume holds not only the client files but also a lot of supporting data, such as frame headers, BackInsNorm verbs, and that kind of stuff, which makes the volume data different each time, even when we are backing up the same client files. Furthermore, the client files are broken into pieces when they are stored, so the structure in the volume looks like |frame hdr|data blk|frame hdr|data blk ... So it is very likely that we won't get a high dedup ratio on virtual volume data on the target server.


    They will probably publish a technote.
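The effect described in that statement can be reproduced with a small toy model. This is only a sketch under simplifying assumptions: it uses fixed-size chunks (the real server's chunking is more sophisticated), and the chunk size, block size, and the header layout (volume id plus sequence number) are all invented for the example. It compares how much of a second copy deduplicates against a first copy when the data is stored plainly versus wrapped as |frame hdr|data blk| with headers that differ per volume.

```python
import hashlib
import random

CHUNK = 64   # hypothetical fixed chunk size in bytes
BLK = 100    # hypothetical data block size per frame


def chunks(stream: bytes) -> list:
    """Cut a byte stream into fixed-size chunks."""
    return [stream[i:i + CHUNK] for i in range(0, len(stream), CHUNK)]


def as_volume(data: bytes, volume_id: int) -> bytes:
    """Wrap data as |frame hdr|data blk|..., with headers that differ per volume."""
    out = bytearray()
    for seq, i in enumerate(range(0, len(data), BLK)):
        out += f"{volume_id:08d}{seq:08d}".encode()  # invented 16-byte frame header
        out += data[i:i + BLK]
    return bytes(out)


def saved_fraction(first: bytes, second: bytes) -> float:
    """Fraction of the second stream's chunks already present in the first."""
    known = {hashlib.sha1(c).digest() for c in chunks(first)}
    second_chunks = chunks(second)
    hits = sum(hashlib.sha1(c).digest() in known for c in second_chunks)
    return hits / len(second_chunks)


rng = random.Random(0)  # fixed seed so the run is repeatable
client_data = bytes(rng.getrandbits(8) for _ in range(6400))

plain = saved_fraction(client_data, client_data)
wrapped = saved_fraction(as_volume(client_data, 1), as_volume(client_data, 2))
print(f"plain files: {plain:.0%} saved, virtual volumes: {wrapped:.0%} saved")
```

In this model the identical plain copy deduplicates completely, while the framed copy loses every chunk that overlaps a changed header, even though the client data underneath is byte-for-byte the same.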
    Tivoli Request for Enhancement (RFE) site http://www.ibm.com/developerworks/rfe/?BRAND_ID=90

