Re: [ADSM-L] SV: Seeking thoughts/experiences on backing up large amounts (say 50 Petabytes) of data

Subject: Re: [ADSM-L] SV: Seeking thoughts/experiences on backing up large amounts (say 50 Petabytes) of data
From: "Kauffman, Tom" <KauffmanT AT NIBCO DOT COM>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Fri, 25 Jan 2008 11:20:40 -0500
I *may* have screwed up the math here -- but IF you can drive some number of
LTO4 drives at maximum compression and the full rated write speed of 350
MB/second across the 800 GB per tape, I get 1,322 tape-drive days, or 3.6
tape-drive years, for the first pass -- not counting mount/dismount time.
That's 1,322 tape drives to do the backup in one day. Boy, I hope my math is
off! And I'm glad I don't have to do the recovery plan for this!
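
For anyone who wants to redo the arithmetic, here is a minimal Python
sketch -- assuming decimal units, the 350 MB/second figure above, and zero
mount/dismount overhead -- which actually lands a bit higher, around 1,650
drive-days:

    # Back-of-envelope throughput math; the rate is the optimistic
    # compressed figure quoted above, not a measured number.
    DATA_BYTES = 50e15             # 50 PB, decimal units
    RATE_BPS = 350e6               # 350 MB/second per drive, sustained
    drive_days = DATA_BYTES / RATE_BPS / 86400
    print(f"{drive_days:,.0f} drive-days")           # ~1,653
    print(f"{drive_days / 365:,.1f} drive-years")    # ~4.5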

Tom Kauffman
NIBCO, Inc

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of 
Wanda Prather
Sent: Friday, January 25, 2008 10:50 AM
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Re: SV: Seeking thoughts/experiences on backing up large amounts (say 
50 Petabytes) of data

Ooooh, what an INTERESTING problem to have!
I'm not aware of any VTL that even approaches being able to handle this
amount of data.

It's a pretty daunting problem.  If you take LTO4 (at 800 GB per cart) and
assume that with compression and some unreclaimed space you'll average about
1 TB of valid data per cart, you're talking 50,000 cartridges for the
primary copy alone.  Whoo.
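
As a sanity check on that count, a quick Python sketch (the 1 TB of valid
data per cart is the assumption above; doubling for a copy pool is just the
same number again):

    # Cartridge estimate at an assumed 1 TB of valid data per cart.
    data_tb = 50_000                  # 50 PB expressed in TB
    primary = data_tb / 1.0           # carts for the primary pool
    print(f"{primary:,.0f} primary carts")               # 50,000
    print(f"{2 * primary:,.0f} including a copy pool")   # 100,000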

I suggest you talk to your regional TSM specialist (you can find him/her
through your sales rep) about the changes coming in TSM 6.1.

The plans currently include DB2 as the TSM database and deduplication at
the source for sequential disk pools.

That will get you past the current TSM DB limits.  And if the data is likely
to dedup well, you might look at using vast quantities of SATA disk instead
of tape for the primary pool (there is no dedup for tape).  Then you just
have the question of where to put the copy pool....
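
To get a feel for what dedup would buy, a hedged sketch -- the ratios are
illustrative guesses, since the real number depends entirely on the data:

    # Raw disk needed for the primary pool at assumed dedup ratios.
    data_pb = 50.0
    for ratio in (1, 2, 5, 10, 20):      # 1:1 means no dedup at all
        print(f"{ratio:>2}:1 dedup -> {data_pb / ratio:5.1f} PB of disk")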

Please post your experiences back to the list -- very interesting!

On 1/24/08, Daniel Sparrman <daniel.sparrman AT exist DOT se> wrote:
>
> Hi Bob
>
> Considering the amount of data, I would presume it's not just a few files
> being backed up but a huge number of them, since most systems out there
> can't hold a few files totalling 50 PB of data.
>
> You will also have to consider what impact storing 50 PB of data will have
> on the TSM database. We recently hit the maximum database size at a
> customer location -- around 524 GB (I don't remember the exact figure, but
> you'll notice when TSM says you've reached the maximum size ;))
>
> Considering this, you'll probably be using more than one instance, thus
> spreading both the data and the database load across multiple servers.
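>
> A rough way to size that, with loudly flagged assumptions -- the average
> object size is a guess, the bytes-of-database-per-object is a generic
> planning figure, and the ceiling is rounded to 500 GB:
>
>     # Metadata sizing sketch: every number below is an assumption.
>     data_bytes = 50e15
>     avg_object_bytes = 1e6          # assume 1 MB average file size
>     db_bytes_per_object = 500       # assumed planning figure
>     objects = data_bytes / avg_object_bytes        # 5e10 objects
>     db_tb = objects * db_bytes_per_object / 1e12   # 25 TB of database
>     print(f"{objects:.0e} objects, {db_tb:.0f} TB of DB, "
>           f"~{db_tb * 1e12 / 500e9:.0f} instances at ~500 GB each")
>
> Bigger average files shrink that fast; lots of small files blow it up.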
>
> Also, cartridge handling will be a problem: storing petabytes of data will
> require a lot of cartridges, which creates a need for fast mounts/dismounts
> and a fairly large number of tape drives -- both for backing up and
> restoring data and for keeping TSM's internal housekeeping running
> smoothly.
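>
> To see why mount speed matters at this scale, a small sketch (cartridge
> capacity and mount time are assumptions):
>
>     # Mount/dismount load for a single full pass over the data.
>     cart_bytes = 1.6e12        # assume 1.6 TB/cart at 2:1 compression
>     mount_s = 60               # assume ~1 minute per mount/dismount
>     carts = 50e15 / cart_bytes
>     days = carts * mount_s / 86400
>     print(f"{carts:,.0f} mounts, {days:.0f} days of cumulative "
>           f"robot/drive time on mounts alone")
>     # -> ~31,250 mounts, ~22 days; reclamation and restores add more.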
>
> Another way of going at it is to look at enterprise VTL technology, which
> eliminates the cartridge handling and the need for a large number of tape
> drives. As far as I know, though, there is no VTL that can store that
> amount of data in a single system without deduplication.
>
> That kinda brings up another question: what are the customer's
> expectations/regulations when it comes to disaster recovery?
>
> It would be easier to point out the specific issues involved if some of
> the questions you already came up with had been answered ;)
>
> Best Regards
>
> Daniel Sparrman
>
> -----Original Message-----
> From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On
> Behalf Of Bob Talda
> Sent: 23 January 2008 21:16
> To: ADSM-L AT VM.MARIST DOT EDU
> Subject: Seeking thoughts/experiences on backing up large amounts (say 50
> Petabytes) of data
>
> Folks:
>   Our group has been approached by a customer who asked if we could back
> up or archive 50 petabytes of data.  And yes, they are serious.
>
>   We've begun building questions for the customer, but as this is roughly
> 1,000 times the amount of data we currently back up, we are on unfamiliar
> turf here.
>
> At a high level, here are some of the questions we are asking:
> 1) Is the 50 petabytes an initial or an envisioned data size?  If
> envisioned, how big is the initial data load and how fast will it grow?
> 2) What makes up the data: databases, video/audio files, other?
> (subtext: how many objects are involved?  What are the
> opportunities to compress/deduplicate?)
> 3) How is the data distributed -- over a number of systems or from a
> supercluster?
> 4) Is the data static, changing slowly, or changing rapidly? (subtext: is
> it a backup or an archive scenario?)
> 5) What are the security requirements?
> 6) What are the restore (aka RTO) requirements?
>
>   We are planning on approaching vendors to get some sense of the probable
> data center requirements (cooling, power, footprint).
>
>   If anyone in the community has experience with managing petabytes of
> backup data, we'd appreciate any feedback we could incorporate.
>
>   Thanks in advance!
>