ADSM-L

Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for pirmary pool

2011-09-29 10:36:16
From: Rick Adamson <RickAdamson AT WINN-DIXIE DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Thu, 29 Sep 2011 10:27:57 -0400
Richard, excellent comments!

I will add that, to TSM, the DD is just storage; TSM has no idea about the
deduplication, compression, etc. that the DD performs, which makes it
challenging to determine the actual storage utilization from an individual
client and/or file space perspective.

Secondly, aside from the preformatted daily system report (autosupport), which 
is not customizable, getting reporting from the DD can be a little challenging 
to say the least.


~Rick


-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
Richard Rhodes
Sent: Thursday, September 29, 2011 9:18 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl 
versus file systems for pirmary pool

> So the data is both deduplicated and compressed before you
> send it offsite?

Yes, that is how the DD handles replication.

A DD is an inline dedup system.  When data comes into the DD
it is deduped, what is left is compressed, and then it
is written to disk.  Only the new unique data is replicated.
(Yes, metadata and the new unique dedup hashes must be sent
somehow as well.)  In general, the replication data stream
reflects the dedup/compression ratio.
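
As a rough illustration of that flow (dedup first, compress what is
left, store it, and ship only the new unique pieces to the replica),
here is a minimal Python sketch. It is a toy model, not how a DD works
internally: the fixed 4 KB chunk size, the SHA-256 hashing, the zlib
compression and the InlineDedupStore name are all assumptions made for
the example.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # assumed fixed chunk size, purely illustrative


class InlineDedupStore:
    """Toy inline dedup store: dedup first, compress what is left, then store."""

    def __init__(self):
        self.chunks = {}        # chunk hash -> compressed chunk bytes
        self.logical_bytes = 0  # bytes the clients sent
        self.stored_bytes = 0   # bytes written after dedup + compression

    def write(self, data):
        """Ingest data and return the (hash, compressed chunk) pairs that are new.

        Only the returned pairs would have to be sent to a replica.
        """
        new_chunks = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            self.logical_bytes += len(chunk)
            digest = hashlib.sha256(chunk).hexdigest()
            if digest in self.chunks:
                continue  # duplicate: nothing new to store or replicate
            compressed = zlib.compress(chunk)
            self.chunks[digest] = compressed
            self.stored_bytes += len(compressed)
            new_chunks.append((digest, compressed))
        return new_chunks

    def ratio(self):
        """Logical bytes received divided by bytes actually stored."""
        return self.logical_bytes / self.stored_bytes if self.stored_bytes else 0.0


store = InlineDedupStore()
backup = b"A" * 8192 + b"B" * 4096
first_night = store.write(backup)   # unique chunks are stored and replicated
second_night = store.write(backup)  # identical data: nothing new to replicate
```

In this model the replication traffic is exactly the list returned by
write(), which is why the replicated stream tracks the
dedup/compression ratio.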

> 
> Does the DD do the dedup within the same box, or require a separate 
> box for dedup?

A DD is nothing more than a powerful PC server with
lots of memory, SATA disks, and a Linux OS.  The secret sauce is the
code to handle dedup, compression, replication, NFS, CIFS, VTL,
the log-structured filesystem, snapshots, etc.
 
> You're also running with the same risk as the previous poster, 
> you're relying entirely on the fact that your DD setup won't break. 

There is a security in tape's separate pieces/parts.
A drive can fail but the rest keep
running.  A cartridge can get chewed up but it's only one
cartridge.  (We have 2 DDs, but also still have two large
3584 libraries.)

If a DD were to have a complete meltdown, all backups on it are gone.

This is true and something you have to come to grips with
if moving to any disk-based backup system.  As has been
mentioned, it's a question of risk and cost.  You could
have dual onsite DDs with one for the primary pool and a second
for a TSM copy pool, but that doubles your cost.  I will say
that from what I see of our DDs, DD put a lot of time/effort
into making the box highly reliable.

Now, we implemented ours with a front-end disk pool.  The
main reason is that we still wanted backups to not rely
directly on the availability of the DD.  If the DD is down
for some reason (code upgrade, processor broke, etc.) then
backups still run.

> Is this how the DD is sold? (Buy 2 DD's, replicate between them and 
> you're safe) ? 

You can run two DDs and use their replication.  You can also use one
as just a primary pool with a normal copy pool on tape.
A DD (or any dedup system) doesn't change TSM, but it makes
you think hard about how you configure and run TSM.
 
> If DD claims they have "data invulnerability" I'd really like to see 
> how they hit 100% protection, since it would be the first system in 
> the world to actually have managed to secure that last 0.0001% risk 
> ;) RAID usually was "secure" until someone made an error, put in a 
> blank disk and forgot to rebuild :)

Agreed.  Ask the vendors for their stats on data loss events!
Don't believe what they say, but ask anyway.

I have to say I am impressed with our DDs (ouch, that hurt! It
also shows that EMC didn't design it.).
It runs its own log-based filesystem (new data is always appended
at the end, not updated in place), which requires periodic (weekly)
compactions.  It has snapshots and built-in checksums, and it runs on
RAID 6.  Remember that since it does inline dedup/compression, it
doesn't get as high an I/O load on the actual spindles as a straight
filesystem would.  They truly did design it to make sure your data is
safe.  Of course... all it takes is a firmware bug to destroy everything!
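
To make the append-only idea concrete, here is a minimal Python sketch
of a log-based store with per-record checksums and a compaction pass.
It is only a toy under stated assumptions: the AppendOnlyLog class, the
in-memory list standing in for disk, and the CRC32 checksum are all
invented for the example and say nothing about the DD's real on-disk
format.

```python
import zlib


class AppendOnlyLog:
    """Toy log-based store: records are only appended, never updated in place."""

    def __init__(self):
        self.log = []    # (key, data, crc) records in write order
        self.index = {}  # key -> position of the newest record for that key

    def put(self, key, data):
        # An overwrite is just a new record at the end of the log;
        # the old record becomes dead space until compaction.
        self.log.append((key, data, zlib.crc32(data)))
        self.index[key] = len(self.log) - 1

    def get(self, key):
        _, data, crc = self.log[self.index[key]]
        if zlib.crc32(data) != crc:  # verify the stored checksum on every read
            raise IOError("checksum mismatch for %r" % (key,))
        return data

    def compact(self):
        """Drop superseded records and return how many were reclaimed."""
        live = [self.log[pos] for pos in sorted(self.index.values())]
        reclaimed = len(self.log) - len(live)
        self.log = live
        self.index = {record[0]: i for i, record in enumerate(live)}
        return reclaimed


fs = AppendOnlyLog()
fs.put("vol001", b"first version")
fs.put("vol002", b"other data")
fs.put("vol001", b"second version")  # appended; the old record is now dead
reclaimed = fs.compact()             # the periodic cleanup pass
```

The dead records left behind by overwrites and deletes are the reason
such a filesystem needs a periodic compaction run.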

What we decided is that a major data loss event on the DD will trigger a
disaster situation for the TSM system.

Rick

> 
> Best Regards
> 
> Daniel
> 
> 
> 
> Daniel Sparrman
> Exist i Stockholm AB
> Växel: 08-754 98 00
> Fax: 08-754 97 30
> daniel.sparrman AT exist DOT se
> http://www.existgruppen.se
> Posthusgatan 1 761 30 NORRTÄLJE
> 
> 
> 
> -----"ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU> wrote: -----
> 
> 
> To: ADSM-L AT VM.MARIST DOT EDU
> From: Shawn Drew <shawn.drew AT AMERICAS.BNPPARIBAS DOT COM>
> Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
> Date: 09/28/2011 22:26
> Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus 
> file systems for pirmary pool
> 
> We average between 15-20TB/day at our main site, and that goes directly
> to a single DD890 (no random pool): single pool, file devclass, NFS
> mounted on 2x10Gb crossover connections. It replicates over a 1Gb WAN
> link to another DD890. (I spent all the money on the DD boxes; I didn't
> have enough left over for 10Gb switches!)
> 
> That other DD890 backs up another 7-10TB/day, replicating to the main
> site (bi-directional replication).
> 
> All with file devclasses, and there is not more than a one-hour lag in
> replication by the time I show up in the morning. TSM doesn't have to
> do replication or backup stgpools anymore, so I can actually afford to
> do full db backups every day now. (I was doing an incremental scheme
> before.)
> 
> IBM has a similar "recommended" configuration with their ProtecTIER
> solution, so they do support a single-pool, backend-replication
> solution. Data Domain also claims "data invulnerability", which should
> catch any data corruption issue as soon as the data is written, and not
> later, when you try to restore.
> 
> 
> Regards, 
> Shawn
> ________________________________________________
> Shawn Drew
> 
> From: daniel.sparrman AT EXIST DOT SE
> Sent by: ADSM-L AT VM.MARIST DOT EDU
> Date: 09/28/2011 02:13 AM
> Please respond to: ADSM-L AT VM.MARIST DOT EDU
> To: ADSM-L
> Subject: [ADSM-L] Ang: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file
> systems for pirmary pool
> 
> How many TB of data is common in this configuration? In a large
> environment, where databases are 5-10TB each and you have a demand to
> back up 5-10-15-20TB of data each night, this would require you to have
> 10Gb/s for every host, something that would also cost a pretty penny.
> Especially since the DD needs to be configured to have the throughput
> to write all those TB within a limited amount of time.
> 
> Does the DD do de-dup within the same box (meaning, can I have 1 box
> that handles normal storage and does de-dup) or do I need a 2nd box?
> 
> And the same issue also arises with the filepool: you're moving a lot of
> data around completely unnecessarily every day when you do reclaim.
> 
> If I'm right, it also sounds like (in your description from the previous
> mails) you're not only using the DD for TSM storage. That sounds like
> putting all the eggs in the same basket.
> 
> Best Regards
> 
> Daniel
> 
> 
> 
> 
> -----"ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU> wrote: -----
> 
> 
> To: ADSM-L AT VM.MARIST DOT EDU
> From: "Allen S. Rout" <asr AT UFL DOT EDU>
> Sent by: "ADSM: Dist Stor Manager" <ADSM-L AT VM.MARIST DOT EDU>
> Date: 09/27/2011 18:55
> Subject: Re: [ADSM-L] Ang: Re: [ADSM-L] vtl versus file systems for
> pirmary pool
> 
> On 09/27/2011 12:02 PM, Rick Adamson wrote:
> 
> 
> > The bigger question I have is since the file based storage is
>  > native to TSM why exactly is using a file based storage
>  > not supported?
> 
> Not supported by what?
> 
> If you've got a DD, then the simplest way to connect it to TSM is via
> files.  Some backup apps require something that looks like a library, in
> which case you'd be buying the VTL license.
> 
> FWIW, if you're already in DD space, you're paying a pretty penny.  The
> VTL license isn't chicken feed, I agree, but it's not a major component
> of the total cost.
> 
> 
> - Allen S. Rout
> 
> 

