Subject: [ADSM-L] SV: Seeking thoughts/experiences on backing up large amounts (say 50 Petabytes) of data
From: Daniel Sparrman <daniel.sparrman AT EXIST DOT SE>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Thu, 24 Jan 2008 16:50:42 +0100

Hi Bob,

Considering the amount of data, I would presume it's not just a few files being 
backed up but a huge number of them, since most systems out there can't hold a 
handful of files adding up to 50 PB of data.

You will also have to consider what impact storing 50 PB of data will have on 
the TSM database. We recently hit the maximum database size at a customer site, 
which is around 524 GB (I don't remember the exact figure, but you'll notice 
when TSM tells you you've reached the maximum size ;))

Considering this, you'll probably end up using more than one instance, thus 
spreading both the data and the database load across multiple servers.
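
Just to illustrate the scale, here's a minimal back-of-envelope sketch in 
Python; the ~600 bytes of database space per stored object and the 524 GB 
per-instance ceiling are assumptions for illustration, not measured values:

  # Rough estimate of TSM database size and number of server instances
  # needed for a given object count. All figures are assumptions.
  BYTES_PER_OBJECT = 600             # assumed DB space per stored file version
  DB_MAX_BYTES = 524 * 10**9         # assumed practical DB ceiling per instance

  def instances_needed(total_objects):
      db_bytes = total_objects * BYTES_PER_OBJECT
      instances = -(-db_bytes // DB_MAX_BYTES)    # ceiling division
      return db_bytes, instances

  # Example: 50 PB spread over files averaging 10 MB each
  objects = (50 * 10**15) // (10 * 10**6)         # ~5 billion objects
  db_bytes, n = instances_needed(objects)
  print(f"{objects:,} objects -> ~{db_bytes / 10**12:.1f} TB of DB, {n} instances")

With those assumptions you end up with several terabytes of database spread over 
half a dozen instances, and it's the object count, not the raw petabytes, that 
drives it.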

Also, cartridge handling will be a problem: storing petabytes of data will 
require a lot of cartridges, which in turn creates a need for fast mounts / 
dismounts and a fairly large number of tape drives, both for backing up and 
restoring data and for keeping the internal TSM housekeeping running smoothly.
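
To put rough numbers on that, a quick sketch assuming LTO-4-class media (about 
800 GB native per cartridge and ~120 MB/s per drive); the one-week window is 
purely an assumption:

  # Rough cartridge and drive count for a given amount of backup data.
  # Capacity/throughput figures approximate LTO-4 native; treat as assumptions.
  PB = 10**15

  CART_CAPACITY = 800 * 10**9        # bytes per cartridge, native
  DRIVE_RATE = 120 * 10**6           # bytes per second per drive, native

  def tape_estimate(total_bytes, window_hours):
      cartridges = -(-total_bytes // CART_CAPACITY)          # ceiling division
      bytes_per_drive = DRIVE_RATE * window_hours * 3600
      drives = -(-total_bytes // bytes_per_drive)
      return cartridges, drives

  carts, drives = tape_estimate(50 * PB, window_hours=7 * 24)
  print(f"~{carts:,} cartridges, ~{drives} drives streaming flat out for a week")

That works out to something like 60,000+ cartridges and hundreds of drives 
before you even add copy pools, reclamation and restores, which is why mount / 
dismount speed and drive count matter so much.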

Another way of approaching it is to look at enterprise VTL technology, which 
eliminates the need for cartridge handling and for a large number of tape 
drives. As far as I know, though, there is no VTL that can store that amount of 
data in a single system without deduplication.

That kind of brings up another question: what are the customer's expectations / 
regulations when it comes to disaster recovery?

It would be easier to point out the specific issues involved if some of the 
questions you already came up with had been answered ;)

Best Regards

Daniel Sparrman

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of Bob Talda
Sent: 23 January 2008 21:16
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Seeking thoughts/experiences on backing up large amounts (say 50 
Petabytes) of data

Folks:
   Our group has been approached by a customer who asked if we could back 
up/archive 50 petabytes of data.  And yes, they are serious.

   We've begun building questions for the customer, but as this is roughly 1000 
times the amount of data we currently back up, we are on unfamiliar turf here.

  At a high level, here are some of the questions we are asking:
1) Is the 50 Petabytes an initial, or envisioned data size?  If envisioned, how 
big is the initial data load and how fast will it grow?
2) What makes up the data: databases, video/audio files, other?   (subtext: how 
many objects are involved?  What are the
opportunities to compress/deduplicate?)
3) How is the data distributed: over a number of systems, or from a 
supercluster?
4) Is the data static, or changing slowly or changing rapidly? (subtext: is it 
a backup or archive scenario)
5) What are the security requirements?
6) What are the restore (aka RTO) requirements?

   We are planning on approaching vendors to get some sense of the probable 
data center requirements (cooling, power, footprint).

   If anyone in the community has experience with managing petabytes of backup 
data, we'd appreciate any feedback we could incorporate.

   Thanks in advance!