ADSM-L

Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

2011-09-28 14:48:57
Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool
From: "Colwell, William F." <bcolwell AT DRAPER DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Wed, 28 Sep 2011 14:43:49 -0400
Hi Daniel,

 

I remember hearing about a 6 TB limit for dedup in a webinar or conference call,
but as I recall it was a daily throughput limit.  The same section of the
redbook that you quote also has this paragraph:

 

Experienced administrators already know that Tivoli Storage Manager database expiration
was one of the more processor-intensive activities on a Tivoli Storage Manager Server.
Expiration is still processor intensive, albeit less so in Tivoli Storage Manager V6.1, but this is
now second to deduplication in terms of consumption of processor cycles. Calculating the
MD5 hash for each object and the SHA1 hash for each chunk is a processor intensive activity.

 

I can say this is absolutely correct; my processor is frequently running at or near 100%.
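
To make the cost in that paragraph concrete, here is a minimal Python sketch of the two hashing passes it describes: one MD5 over the whole object and one SHA-1 per chunk. It is an illustration only, not how the TSM server is implemented. The fixed chunk size, the names fingerprint and unique_chunk_count, and the in-memory byte strings are all assumptions made for the example.

```python
import hashlib

# Hypothetical fixed chunk size, chosen only for illustration.
# It is an assumption, not the chunking the TSM server actually uses.
CHUNK_SIZE = 256 * 1024


def fingerprint(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Return (MD5 of the whole object, list of SHA-1 digests, one per chunk)."""
    object_md5 = hashlib.md5(data).hexdigest()
    chunk_sha1 = [
        hashlib.sha1(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]
    return object_md5, chunk_sha1


def unique_chunk_count(objects, chunk_size: int = CHUNK_SIZE) -> int:
    """Count distinct chunk digests across objects (what dedup would keep)."""
    seen = set()
    for data in objects:
        _, chunks = fingerprint(data, chunk_size)
        seen.update(chunks)
    return len(seen)
```

Every byte is hashed twice in this sketch, once for the object digest and once for its chunk digest, which fits with deduplication being processor bound.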

 

I have gone way beyond 6 TB of storage for dedup storage pools, as this SQL shows
for the 2 instances on my server:

 

select cast(stgpool_name as char(12)) as "Stgpool", -
       cast(sum(num_files)     / 1024 / 1024 as decimal(4,1)) as "Mil Files", -
       cast(sum(physical_mb)   / 1024 / 1024 as decimal(4,1)) as "Physical_TB", -
       cast(sum(logical_mb)    / 1024 / 1024 as decimal(4,1)) as "Logical_TB", -
       cast(sum(reporting_mb)  / 1024 / 1024 as decimal(4,1)) as "Reporting_TB" -
from occupancy -
  where stgpool_name in (select stgpool_name from stgpools where deduplicate = 'YES') -
   group by stgpool_name

 

 

Stgpool            Mil Files      Physical_TB      Logical_TB      Reporting_TB
-------------     ----------     ------------     -----------     -------------
BKP_2                  368.0              0.0            30.0              95.8
BKP_2X                 341.0              0.0            23.9              58.6

Stgpool            Mil Files      Physical_TB      Logical_TB      Reporting_TB
-------------     ----------     ------------     -----------     -------------
BKP_2                  224.0              0.0            35.7              74.1
BKP_FS_2                49.0              0.0            21.0              45.5
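
As a quick check of those figures against the 6 TB rule of thumb, the arithmetic can be sketched in Python. The numbers are copied from the two result sets above. The labels "instance 1" and "instance 2" are hypothetical, since the output does not name the instances. Treating Reporting_TB as the size before deduplication and Logical_TB as what is stored afterwards is an assumption about the occupancy columns, so the ratio is only a rough indicator.

```python
RULE_OF_THUMB_TB = 6.0  # the redbook's rule-of-thumb figure

# (instance label, pool, Logical_TB, Reporting_TB), copied from the output above.
# The instance labels are hypothetical; the output does not name the instances.
ROWS = [
    ("instance 1", "BKP_2", 30.0, 95.8),
    ("instance 1", "BKP_2X", 23.9, 58.6),
    ("instance 2", "BKP_2", 35.7, 74.1),
    ("instance 2", "BKP_FS_2", 21.0, 45.5),
]


def summarize(rows):
    """Return total logical TB and the reporting/logical ratio per pool."""
    total_logical = sum(logical for _, _, logical, _ in rows)
    ratios = {
        (inst, pool): reporting / logical
        for inst, pool, logical, reporting in rows
    }
    return total_logical, ratios


total, ratios = summarize(ROWS)
print(f"total logical: {total:.1f} TB, "
      f"{total / RULE_OF_THUMB_TB:.1f}x the rule of thumb")
for (inst, pool), ratio in ratios.items():
    print(f"{inst} {pool}: reporting/logical = {ratio:.2f}")
```

The four pools add up to about 110.6 TB of logical occupancy, far above the 6 TB figure.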

 

 

Also, I am not using any random-access disk pool; all the disk storage is
scratch-allocated FILE device class volumes.  There is also a tape library
(LTO-5) for files larger than 1 GB, which are excluded from deduplication.

 

 

Regards,

 

Bill Colwell

Draper Lab

 

 

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of Daniel Sparrman
Sent: Wednesday, September 28, 2011 3:49 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

 

To be honest, it doesn't really say. The information is from the Tivoli Storage
Manager Technical Guide:

 

Note: In terms of sizing Tivoli Storage Manager V6.1 deduplication, we currently
recommend using Tivoli Storage Manager to deduplicate up to 6 TB total of storage pool
space for the deduplicated pools. This is a rule of thumb only and exists solely to give an
indication of where to start investigating VTL or filer deduplication. The reason that a
particular figure is mentioned is for guidance in typical scenarios on commodity hardware.
If more than 6 TB of real disk space is to be deduplicated, you can either use Tivoli Storage
Manager or a hardware deduplication device. The 6 TB is in addition to whatever disk is
required by non-deduplicated storage pools. This rule of thumb will change as processor
and disk technologies advance, because the recommendation is not an architectural,
support, or testing limit.

 

http://www.redbooks.ibm.com/redbooks/pdfs/sg247718.pdf

 

I'm guessing it's server-side, since client-side shouldn't use any resources at
the server. I'm also guessing you could do 8 TB or 10, but not 60 TB.

 

Best Regards

 

Daniel Sparrman

 

 

 

Daniel Sparrman

Exist i Stockholm AB

Switchboard: 08-754 98 00

Fax: 08-754 97 30

daniel.sparrman AT exist DOT se

http://www.existgruppen.se

Posthusgatan 1 761 30 NORRTÄLJE

 

 

 

-----"ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU> wrote: -----

To: ADSM-L AT VM.MARIST DOT EDU
From: Hans Christian Riksheim <bullhcr AT GMAIL DOT COM>
Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
Date: 09/28/2011 09:56
Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

 

Does this 6 TB supported limit for a deduplicated FILEPOOL also
apply when one does client-side deduplication only?

 

Just wondering since I have just set up a 30 TB FILEPOOL for this purpose.

 

Regards

 

Hans Chr.

 

On Tue, Sep 27, 2011 at 8:44 PM, Daniel Sparrman

<daniel.sparrman AT exist DOT se> wrote:

> Just to put an end to this discussion, we're kinda running out of limits here:

> 

> a) No VTL solution, whether DD, Sepaton, or anyone else, is a 
> replacement for random diskpools. It doesn't matter if you can configure 50 
> drives, 500 drives or 5000 drives: the way TSM works, you're gonna make the 
> system go bad, since the system is designed around having random pools in 
> front and sequential pools in the back.  A sequential device is not gonna 
> replace that, regardless of whether it is a sequential file pool or a VTL 
> (or, for that matter, a tape library).

> 

> b) VTLs were invented because most backup software (I've only worked with 
> TSM, Legato & Veritas aka Symantec) is used to working with sequential 
> devices. That hasn't changed, and won't change in the near future. VTLs (and 
> the file device option) are just a replacement. Performance-wise, VTLs are 
> gonna win all the time compared to a file device; the question you need to 
> ask yourself is whether you need the VTL or can get by with file devices. 
> According to the TSM manual (I don't have the link, but if you want I'll find 
> it) the maximum supported file device pool for deduplication is 6 TB, so if 
> you're thinking of replacing a VTL with a seq. file pool, keep that in mind. 
> The limit exists because the resources available to TSM for file 
> deduplication are limited, or as the manual says, "until new technologies are 
> available".

> 

> The discussion here, where people are actually planning on having just a 
> sequential pool (since no one is mentioning a random pool in front), is 
> plain scary. No sequential device is gonna have the time of its life with a 
> fileserver serving 50K blocks at a time.

> 

> So my last 50 cents worth is:

> 

> a) Have a random pool in front

> 

> b) Depending on the size of your environment, you're either gonna go with a 
> filepool and use de-dup (the limit is 6 TB for each pool, and you might not 
> want to de-dup everything), or you're gonna go with a full-scale VTL. The 
> choice here is size vs. cost.

> 

> I've seen a lot of posts here lately about the disadvantages of VTLs. 
> Well, I haven't seen one so far with mine. I have a colleague who bought a 
> XXXX VTL and found out he needed another VTL just to do the de-dup, since a 
> single VTL wasn't a supported configuration for de-dup. I have another 
> colleague who bought a very cheap VTL solution (from a name mentioned a lot 
> around here) and ended up with identical hashes for different data, leaving 
> him with unrestorable data.

> 

> Comparing eggs to apples just isn't fair.  Different manufacturers of VTLs do 
> different things, meaning both performance and availability are completely 
> different.

> 

> Just to sum up, we've had both 3584s and (back in the day) 3575s, and I've 
> never been happier with our VTL (and yes, we do restore tests).

> 

> Best Regards

> 

> Daniel

> 

> 

> 


> 

> 

> 

> -----"ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU> wrote: -----

> 

> To: ADSM-L AT VM.MARIST DOT EDU
> From: Rick Adamson <RickAdamson AT WINN-DIXIE DOT COM>
> Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
> Date: 09/27/2011 18:02
> Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

> 

> Interesting. Every VTL-based solution, including Data Domain, that I looked 
> at had limits on the number of drives that could be emulated, which were 
> nowhere near a hundred, let alone a thousand. Perhaps it's time to revisit 
> this.

> 

> The license is a Data Domain fee, and a hefty one at that.

> 

> The bigger question I have is: since file-based storage is native to TSM, 
> why exactly is using file-based storage not supported?

> 

> ~Rick

> 

> 

> -----Original Message-----

> From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf 
> Of Daniel Sparrman

> Sent: Tuesday, September 27, 2011 10:30 AM

> To: ADSM-L AT VM.MARIST DOT EDU

> Subject: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

> 

> Not really sure where the general idea that a VTL will limit the number of 
> available mount points comes from.

> 

> I'm not familiar with Data Domain, but generally speaking, the number of 
> virtual tape drives that can be configured within a VTL is usually in the 
> thousands. Not sure why you'd want that many though; I always prefer having 
> a small diskpool in front of whatever sequential pool I have, and letting 
> the bigger files bypass the diskpool and go straight to the seq. pool.

> 

> As for LAN-free, the only available option I know of is SANergy. And 
> going down that road (concerning both price & complexity) will probably make 
> the VTL look cheap.

> 

> Not sure what kind of licensing you're talking about concerning VTL, but I 
> assume it's a Data Domain license and not a TSM license?

> 

> Best Regards

> 

> Daniel Sparrman

> 

> 

> 


> 

> 

> 

> -----"ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU> wrote: -----

> 

> To: ADSM-L AT VM.MARIST DOT EDU
> From: Rick Adamson <RickAdamson AT WINN-DIXIE DOT COM>
> Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
> Date: 09/27/2011 16:52
> Subject: Re: [ADSM-L] vtl versus file systems for pirmary pool

> 

> A couple of things that I did not see mentioned here, which I experienced:
> for Data Domain the VTL is an additional license and it does
> limit the available mount points (or emulated drives), where a TSM file
> based pool does not. Like Wanda stated earlier, it depends on what you can
> afford!

> 

> I myself have grown fond of using the file based approach, easy to

> manage, easy to configure, and never worry about an available tape drive

> (virtual or otherwise). The LAN-free issue is something to consider, but
> from what I have heard lately it can still be accomplished using

> the file based storage. If anyone has any info on it I would appreciate

> it.

> 

> ~Rick

> Jax, Fl.

> 

> -----Original Message-----

> From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf 
> Of

> Tim Brown

> Sent: Monday, September 26, 2011 4:05 PM

> To: ADSM-L AT VM.MARIST DOT EDU

> Subject: [ADSM-L] vtl versus file systems for pirmary pool

> 

> What advantage does VTL emulation on a disk primary storage pool have
> as compared to a disk storage pool that is non-VTL?

> 

> It appears to me that a non-VTL system would not require the daily
> reclamation process
> and would also allow more client backups to occur simultaneously.

> 

> 

> 

> Thanks,

> 

> 

> 

> Tim Brown

> Systems Specialist - Project Leader

> Central Hudson Gas & Electric

> 284 South Ave

> Poughkeepsie, NY 12601

> Email: tbrown AT cenhud DOT com

> Phone: 845-486-5643

> Fax: 845-486-5921

> Cell: 845-235-4255

> 

> 

> 

> 

