ADSM-L

Re: [ADSM-L] Deduplication questions, again

2016-03-22 11:28:22
Subject: Re: [ADSM-L] Deduplication questions, again
From: PAC Brion Arnaud <Arnaud.Brion AT PANALPINA DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Tue, 22 Mar 2016 16:27:32 +0100
Matthew,

Just a question: how do you know the size of the pre-dedup data?

Did you use the backup reports on each client to get that information, or 
build a query against the dedupstats table, or something else?

Here again, I cannot seem to find consistent information in the TSM output ...

Let's take an example:

q occ CH2RS901 /

Node Name  Type Filespace  FSID Storage      Number of    Physical     Logical
                Name            Pool Name        Files       Space       Space
                                                          Occupied    Occupied
                                                              (MB)        (MB)
---------- ---- ---------- ---- ---------- ----------- ----------- -----------
CH2RS901   Bkup /             4 CONT_STG         8,148           -      165.78

So, according to the "q occ" output, 165.78 MB of logical space is occupied.

But:

q dedupstats CONT_STG CH2RS901 / f=d

                         Date/Time: 03/22/16   16:03:30
                 Storage Pool Name: CONT_STG
                         Node Name: CH2RS901
                    Filespace Name: /
                              FSID: 4
                              Type: Bkup
         Total Data Protected (MB): 167
             Total Space Used (MB): 36
            Total Space Saved (MB): 131
           Total Saving Percentage: 78.34
             Deduplication Savings: 137,056,854
          Deduplication Percentage: 78.34
     Non-Deduplicated Extent Count: 8,161
Non-Deduplicated Extent Space Used: 7,903,461
               Unique Extent Count: 6
          Unique Extent Space Used: 100,858
               Shared Extent Count: 3,176
      Shared Extent Data Protected: 166,937,957
          Shared Extent Space Used: 29,783,486
               Compression Savings: 0
            Compression Percentage: 0.00
           Compressed Extent Count: 0
         Uncompressed Extent count: 11,343

If I trust this output, I have backed up 167 MB, and TSM deduped it down to 36 
MB ...
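
The figures in this "q dedupstats" output can be cross-checked with a little arithmetic. The sketch below is only a sanity check on the numbers quoted above. It assumes that the extent "Space Used" and "Deduplication Savings" fields are byte counts and that 1 MB = 1048576 bytes (as the "q occ" help states for its own values). The variable names are made up for illustration.

```python
# Values copied from the "q dedupstats CONT_STG CH2RS901 / f=d" output above.
# Assumption: the extent fields are byte counts and 1 MB = 1048576 bytes.
MIB = 1048576

non_dedup_used = 7_903_461   # Non-Deduplicated Extent Space Used
unique_used = 100_858        # Unique Extent Space Used
shared_used = 29_783_486     # Shared Extent Space Used
dedup_savings = 137_056_854  # Deduplication Savings

extent_counts = {"non_dedup": 8_161, "unique": 6, "shared": 3_176}

# Sum the three extent classes and convert to MB for comparison.
space_used_bytes = non_dedup_used + unique_used + shared_used
space_used_mb = space_used_bytes / MIB
space_saved_mb = dedup_savings / MIB
total_extents = sum(extent_counts.values())

print(f"space used : {space_used_mb:.2f} MB (reported: 36)")
print(f"space saved: {space_saved_mb:.2f} MB (reported: 131)")
print(f"extents    : {total_extents} (reported uncompressed count: 11,343)")
```

Under those assumptions the three extent classes add up to the reported "Total Space Used", and the extent counts add up to the "Uncompressed Extent count", so the dedupstats output looks internally consistent.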

Could anyone explain why TSM "q occ" reports a "logical space occupied" of 
165.78 MB? Shouldn't it be 36 MB?

The help for the "q occ" command states:

Logical Space Occupied (MB)
         The amount of space that is occupied by logical files in the
         file space. Logical space is the space that is actually used to
         store files, excluding empty space within aggregates. For this
         value, 1 MB = 1048576 bytes.
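
Using the 1 MB = 1048576 bytes convention from the help text, the two commands can be put side by side. This is just arithmetic on the values quoted in this message. It does not explain how TSM derives either figure.

```python
MIB = 1048576  # conversion stated in the "q occ" help text

occ_logical_mb = 165.78    # "q occ": Logical Space Occupied (MB)
dedup_protected_mb = 167   # "q dedupstats": Total Data Protected (MB)
dedup_used_mb = 36         # "q dedupstats": Total Space Used (MB)

occ_logical_bytes = occ_logical_mb * MIB

# Relative gap between the "q occ" figure and each dedupstats figure.
gap_vs_protected = abs(occ_logical_mb - dedup_protected_mb) / dedup_protected_mb
gap_vs_used = abs(occ_logical_mb - dedup_used_mb) / dedup_used_mb

print(f"q occ logical space: {occ_logical_bytes:,.0f} bytes")
print(f"gap vs. Total Data Protected: {gap_vs_protected:.1%}")
print(f"gap vs. Total Space Used:     {gap_vs_used:.1%}")
```

In this sample the "q occ" value sits within about 1% of "Total Data Protected" and far from "Total Space Used", which suggests (but does not prove) that it reflects the pre-dedup size.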

I'm lost here ...

Cheers.

Arnaud


From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
Matthew McGeary
Sent: Tuesday, March 22, 2016 2:23 PM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: Deduplication questions, again

Arnaud,

I too am seeing odd percentages where containerpools and dedup are concerned.  
I have a small remote server pair that protects ~23 TB of pre-dedup data, but 
my containerpools show an occupancy of ~10 TB, which should be a data reduction 
of over 50%.  However, a q stg on the containerpool only shows a data reduction 
ratio of 21%.  Of note, I use client-side dedup on all the client nodes at this 
particular site and I think that's mucking up the data reduction numbers on the 
containerpool.  The 21% figure seems to be the reduction AFTER client-side 
dedup, not the total data reduction.
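
A rough check of these figures, treating the ~23 TB and ~10 TB values as exact and assuming that the 21% reported by "q stg" really is the server-side reduction applied to data the clients had already deduplicated (the hypothesis above, not a documented fact). The helper name is hypothetical.

```python
def reduction(before, after):
    """Fraction of data removed when going from `before` to `after`."""
    return 1 - after / before

pre_dedup_tb = 23.0    # approximate pre-dedup data protected
occupancy_tb = 10.0    # approximate containerpool occupancy
stg_reduction = 0.21   # data reduction shown by "q stg"

total = reduction(pre_dedup_tb, occupancy_tb)

# If 21% is the reduction applied after client-side dedup, the server
# received occupancy / (1 - 0.21) TB from the clients.
received_tb = occupancy_tb / (1 - stg_reduction)
client_side = reduction(pre_dedup_tb, received_tb)

print(f"total reduction:         {total:.1%}")
print(f"data received by server: {received_tb:.2f} TB")
print(f"implied client-side cut: {client_side:.1%}")
```

With these round numbers, a client-side reduction of roughly 45% followed by a 21% server-side reduction would account for the overall reduction of about 56%.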

It's confusing.

On the plus side, I just put in the new 7.1.5 code at this site and the 
compression is working well and does not appear to add a noticeable number of 
CPU cycles during ingest.  Since the install date on the 18th, I've backed up 
around 1 TB pre-dedup and the compression savings are rated at ~400 GB, which 
is pretty impressive.  I'm going to do a test restore today and see how it 
performs but so far so good.
__________________________

Matthew McGeary
Technical Specialist - Infrastructure
PotashCorp
T: (306) 933-8921
www.potashcorp.com

From:        PAC Brion Arnaud <Arnaud.Brion AT PANALPINA DOT COM>
To:        ADSM-L AT VM.MARIST DOT EDU
Date:        03/22/2016 03:52 AM
Subject:        [ADSM-L] Deduplication questions, again
Sent by:        "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>

________________________________



Hi All,

Another question regarding TSM container-based deduplicated pools ...

Are you seeing the same behavior as this? Using "q stg f=d" against a 
deduplicated container-based storage pool, I get the following output:


q stg f=d

                   Storage Pool Name: CONT_STG
                   Storage Pool Type: Primary
                   Device Class Name:
                        Storage Type: DIRECTORY
                          Cloud Type:
                           Cloud URL:
                      Cloud Identity:
                      Cloud Location:
                  Estimated Capacity: 5,087 G
                  Space Trigger Util:
                            Pct Util: 55.8
                            Pct Migr:
                         Pct Logical: 100.0
                        High Mig Pct:

(a few lines skipped ...)

                          Compressed: No
               Deduplication Savings: 0 (0%)
                 Compression Savings: 0 (0%)
                   Total Space Saved: 0 (0%)
                      Auto-copy Mode:
Contains Data Deduplicated by Client?:
        Maximum Simultaneous Writers: No Limit
             Protection Storage Pool: CONT_STG
             Date of Last Protection: 03/22/16   05:00:27
        Deduplicate Requires Backup?:
                           Encrypted:
                  Space Utilized(MB):

Note the "Deduplication Savings" output (0%).

However, using "q dedupstats" on the same stgpool, I get the following output 
(just a snippet of it):

               Date/Time: 03/17/16   16:31:24
       Storage Pool Name: CONT_STG
               Node Name: CH1RS901
          Filespace Name: /
                    FSID: 3
                    Type: Bkup
 Total Saving Percentage: 78.11
Total Data Protected (MB): 170

               Date/Time: 03/17/16   16:31:24
       Storage Pool Name: CONT_STG
               Node Name: CH1RS901
          Filespace Name: /usr
                    FSID: 4
                    Type: Bkup
 Total Saving Percentage: 62.25
Total Data Protected (MB): 2,260

Why do I see dedup savings in one output but not in the other?
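
For comparison with the 0% shown by "q stg", the two filespace entries above can be combined into one figure. This sketch assumes that "Total Saving Percentage" is measured against "Total Data Protected" for each filespace. It covers only the two entries quoted here, not the whole pool.

```python
# (filespace, Total Data Protected in MB, Total Saving Percentage)
filespaces = [
    ("/", 170, 78.11),
    ("/usr", 2_260, 62.25),
]

# Weight each filespace's saving percentage by the data it protects.
protected_mb = sum(mb for _, mb, _ in filespaces)
saved_mb = sum(mb * pct / 100 for _, mb, pct in filespaces)
combined_pct = 100 * saved_mb / protected_mb

print(f"protected:       {protected_mb} MB")
print(f"saved:           {saved_mb:.0f} MB")
print(f"combined saving: {combined_pct:.2f}%")
```

Under that assumption these two filespaces alone show savings of about 63%, so the 0% in "q stg f=d" is not what the per-filespace statistics would lead one to expect.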

Thanks for any enlightenment!

Cheers.

Arnaud
