Subject: [ADSM-L] Seeking thoughts/experiences on backing up large amounts (say 50 Petabytes) of data
From: Bob Talda <rpt4 AT CORNELL DOT EDU>
To: ADSM-L AT VM.MARIST DOT EDU
Date: Wed, 23 Jan 2008 15:16:10 -0500
Folks:
  Our group has been approached by a customer asking whether we could
back up or archive 50 petabytes of data.  And yes, they are serious.

  We've begun building a list of questions for the customer, but as this is
roughly 1000 times the amount of data we currently back up, we are on
unfamiliar turf here.

 At a high level, here are some of the questions we are asking:
1) Is 50 petabytes the initial or the envisioned data size?  If envisioned,
how big is the initial data load and how fast will it grow?
2) What makes up the data: databases, video/audio files, other?  (subtext:
how many objects are involved, and what are the opportunities to
compress/deduplicate?)
3) How is the data distributed - spread over a number of systems, or coming
from a supercluster?
4) Is the data static, changing slowly, or changing rapidly?  (subtext: is
this a backup or an archive scenario?)
5) What are the security requirements?
6) What are the restore (aka RTO) requirements?  (A rough throughput sketch
follows below.)
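
  To give the RTO question some scale, here is a quick back-of-envelope
sketch in Python.  The 30-day window and the 800 GB native cartridge
capacity (LTO-4 class) are assumed figures for illustration, not anything
the customer has specified:

    # Back-of-envelope scale check; the 30-day window and the 800 GB
    # native cartridge capacity (LTO-4 class) are assumed figures.
    PB = 10**15  # bytes (decimal petabyte, as storage vendors count)

    data_bytes = 50 * PB
    window_seconds = 30 * 86400  # assumed 30-day full backup/restore window

    rate_gb_s = data_bytes / window_seconds / 10**9
    print(f"Sustained rate for 50 PB in 30 days: {rate_gb_s:.1f} GB/s")
    # -> roughly 19.3 GB/s, around the clock

    cartridges = data_bytes // (800 * 10**9)  # 800 GB native per cartridge
    print(f"Cartridges at 800 GB native: {cartridges:,}")
    # -> 62,500 cartridges before any compression or deduplication

  Even under those optimistic assumptions, that is tens of thousands of
cartridges and sustained throughput far beyond a single drive, which is
part of why we want vendor input on the data center side.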

  We are planning to approach vendors to get some sense of the probable
data center requirements (cooling, power, footprint).

  If anyone in the community has experience with managing petabytes of
backup data, we'd appreciate any feedback we could incorporate.

  Thanks in advance!