-
08-22-2011, 09:54 AM #1 Member
Data dedupe and offsite copies reclamation problem
Hi
This is more of a "heads up" than a request for a solution, as IBM has looked into my problem extensively and replied that "everything works as designed and expected." (See the whole reply below.)
Background:
We have disk-based/file pools for most of our backups, and these are deduplicated.
Copies of these pools are then made to LTO4 tape and sent offsite daily.
The problem:
Space reclamation of the offsite copy pool is VERY slow. (Tape-to-tape reclamations are quick.)
In a 5-hour period, only around 60-90 GB of data is reclaimed.
The reclamation threshold used to be set to 70%, and reclamation would finish in the 5-hour window we had.
Now only a 95% reclamation threshold finishes in that window.
This has of course wreaked havoc with the tape cycles.
We had to almost double the number of tapes in the offsite copy pool in order to have enough vault retrieves coming back to be used for the next cycle of copies.
Below is IBM's response after weeks of troubleshooting:
The development team review all server performance traces we gathered in May. Run tests on their machines and review the offsite reclamation code.
From TSM development point of view, there is nothing else they can do as no defect was found and everything works as designed and expected.
-
08-22-2011, 11:24 AM #2 Member
Just curious: what type of storage are your disk-based/file pools on? We have a NetApp device, and moving data from one location on the NetApp to another is very slow.
-
08-22-2011, 11:48 AM #3 Member
Our dedupe pools are all on a DS3400 with SATA disks.
The TSM DB is on a DS4700 with fibre disks.
When you move data from a dedupe pool to a non-dedupe pool, TSM has to reassemble/rebuild the file. This is where the bottleneck is. If you look at what TSM does when you copy the data, you'll see a lot of TSM DB activity while it determines where the pieces are, and then some activity on the disk pool when it copies the file.
-
08-25-2011, 07:38 PM #4 Member
Same here.
I'm getting similar numbers... yesterday's reclamation: 120GB reclaimed in 4 hours on the following hardware:
TSM 6.2.3 on RHEL 5.6
16GB ram
DB2 on 300GB 15k SAS disks
2TB 7.2k SATA disks for storage pools
offsite copy storage pool on LTO-5 library with 2 drives
active data storage pool as virtual volume on a remote server
That's about 9 MB/s... not exactly what I expected from $50k worth of hardware.
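For anyone who wants to check the arithmetic, here is a quick Python sketch. It assumes binary gigabytes (1 GB = 1024 MB) and simply averages over the whole window; the function name is made up for illustration. The second line uses the 60-90 GB in 5 hours reported in the first post.

```python
def throughput_mb_per_s(gigabytes, hours, mb_per_gb=1024):
    """Average throughput in MB/s for `gigabytes` moved in `hours`.

    mb_per_gb=1024 assumes binary units; pass 1000 for decimal units.
    """
    return gigabytes * mb_per_gb / (hours * 3600)


# 120 GB reclaimed in 4 hours (this post)
print(round(throughput_mb_per_s(120, 4), 1))  # 8.5

# 60-90 GB reclaimed in 5 hours (first post)
print(round(throughput_mb_per_s(60, 5), 1), round(throughput_mb_per_s(90, 5), 1))  # 3.4 5.1
```

Either way it is a single-digit MB/s figure, far below what the tape drives themselves can stream.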
I've found that any data movement operation from deduplicated storage pools to non-deduplicated ones runs ridiculously slow. It hogs the database disks with 8k random reads. Generating backup sets and exporting data runs slowly as well.
Maybe putting the DB on a fast mirrored SSD would make deduplication more usable. I wish I had anticipated this... now we're using about 2x more tapes than I thought we'd be using.
Here's what I did to make offsite reclamation somewhat manageable: set up a script that starts reclamation with threshold 99 and offsitereclaimlimit=1, waits until it finishes, then starts a new reclaim process with threshold 98, and so on... that way only one tape at a time is reclaimed. If you simply specify a threshold of 50 and several volumes satisfy this criterion, it may well happen that none of them gets fully reclaimed.
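A rough sketch of that stepping logic in Python, in case it helps. It only builds the list of commands and does not run anything. The pool name OFFSITE_COPY, the function name, and the stopping threshold are made up for illustration, and wait=yes is my assumption for how to make each step block until it finishes; check the RECLAIM STGPOOL syntax for your server level before using it.

```python
def stepped_reclaim_commands(pool, start=99, stop=95, limit=1):
    """Build one RECLAIM STGPOOL command per threshold, from `start` down to `stop`.

    Each command reclaims at most `limit` offsite volumes, so the server works
    on one tape at a time instead of spreading the window across many.
    The 1-99 threshold range checked here is an assumption.
    """
    if not 1 <= stop <= start <= 99:
        raise ValueError("expected 1 <= stop <= start <= 99")
    return [
        f"reclaim stgpool {pool} threshold={t} offsitereclaimlimit={limit} wait=yes"
        for t in range(start, stop - 1, -1)
    ]


# Hypothetical pool name; print the commands in the order they would be issued.
for command in stepped_reclaim_commands("OFFSITE_COPY", start=99, stop=97):
    print(command)
```

The commands would then be fed one after another to the administrative client, each starting only after the previous one has returned.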
Also, upgrade to 6.2.3 and use the new server options to disable DB2 reorgs while the reclamation process runs. This helped a bit too. I actually disabled reorgs altogether and do them manually while the server is stopped. Yeah, nuke the site from orbit... it's the only way to be sure.
In conclusion, I wouldn't recommend mixing deduped and non-deduped storage pools such as tape unless you are working with small data sets (a few TB) or you have some REALLY fancy hardware (multiple SSDs for DB? RAM size greater than DB size? RAMSAN?).
I'm still hoping that IBM forgot to put a proper index on some table (or put too many) and that they'll fix this in some future fixpack. Until then... I'm considering turning off deduplication and using tapes as a second-tier primary storage.
P.S.
Another disappointment for deduplication: it turns out that deduplication of virtual volumes works nowhere near as well as deduplication of files... my primary storage pool had about 60% space saved, while on the remote virtual volumes I only had about 10-15% space saved. I'm guessing this is because files have different offsets when stored in virtual volumes, so dedup didn't pick up much of what could be deduplicated. I had to give up on a remote copy pool, and now I'm using an active data pool with a grace period of 7 days... so now it's like a copy pool with different retention. Meh.
P.P.S.
"...everything works as designed and expected." Um... "In retrospect, our design sucks a bit. Sorry about that."?
-
10-04-2011, 01:55 PM #5 Senior Member
Had the same issue. Here is IBM support's statement about this:
For normal backup, when the client files are the same, the same data will be stored on the server, so we
see a high dedup ratio. But a virtual volume is actually stored as an archive
file on the target server, so the archive file on the target server contains all the data for a volume. And
a volume contains not only the client files; it also has a lot of supporting data, like frame
headers, BackInsNorm verbs, and that kind of stuff, which makes the volume data different each time,
even when we are backing up the same client files. Furthermore, the client files are broken into pieces
when they are stored; the structure in the volume looks like |frame hdr|data blk|frame hdr|data blk
... So it's very likely that we wouldn't get a high dedup ratio on virtual volume data on the target
server.
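To get a feel for why interleaved frame headers hurt so much, here is a toy Python sketch. It is not how TSM chunks data: it uses fixed-size chunks for simplicity, and the chunk size, data block size, header layout, and volume tags are all made-up values, chosen so that every chunk contains at least one header.

```python
import hashlib
import random

CHUNK = 256       # hypothetical fixed dedup chunk size, in bytes
DATA_BLOCK = 100  # hypothetical payload bytes carried by each frame


def chunk_hashes(data):
    """Hash every fixed-size chunk of `data`."""
    return [hashlib.sha1(data[i:i + CHUNK]).hexdigest()
            for i in range(0, len(data), CHUNK)]


def duplicate_ratio(first, second):
    """Fraction of chunks in `second` that already exist in `first`."""
    known = set(chunk_hashes(first))
    hashes = chunk_hashes(second)
    return sum(h in known for h in hashes) / len(hashes)


def as_volume(payload, tag):
    """Wrap payload as |frame hdr|data blk|frame hdr|data blk|...

    The made-up 16-byte header is an 8-byte volume tag plus a frame counter.
    """
    out = bytearray()
    for seq, start in enumerate(range(0, len(payload), DATA_BLOCK)):
        out += tag + seq.to_bytes(8, "big")
        out += payload[start:start + DATA_BLOCK]
    return bytes(out)


rng = random.Random(0)
payload = bytes(rng.getrandbits(8) for _ in range(64 * 1024))

# Same client data stored twice as plain files: every chunk is a duplicate.
print("plain files:", duplicate_ratio(payload, payload))

# Same client data wrapped in two volumes whose headers differ.
vol_a = as_volume(payload, b"VOL-0001")
vol_b = as_volume(payload, b"VOL-0002")
print("virtual volumes:", duplicate_ratio(vol_a, vol_b))
```

The plain-file ratio is 1.0, while the volume ratio should come out at or near zero, because each chunk picks up header bytes that differ between the two volumes. With larger data blocks or content-defined chunking, some data-only chunks would still match, which fits the reduced but nonzero savings reported earlier in the thread.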
They will probably publish a technote.
Tivoli Request for Enhancement (RFE) site http://www.ibm.com/developerworks/rfe/?BRAND_ID=90
-
10-12-2011, 03:30 PM #6 Senior Member
Tivoli Request for Enhancement (RFE) site http://www.ibm.com/developerworks/rfe/?BRAND_ID=90