Hi Bob
Considering the amount of data, I would presume it's not just a few files being
backed up but a huge number of them, since few systems out there hold only a
handful of files totalling 50PB of data.
You will also have to consider the impact that storing 50PB of data will have
on the TSM database. We recently hit the maximum database size at a customer
site, which is around 524 GB (I don't remember the exact figure, but you'll
notice when TSM tells you you've reached the maximum size ;))
Considering this, you'll probably be using more than one instance, spreading
both the data and the database load across multiple servers.
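To get a feel for the numbers, here's a rough back-of-envelope sketch in
Python. Every figure in it is an assumption chosen for illustration (object
count, database bytes per stored object, retained versions); plug in real
numbers once the customer answers the questions below:

# Rough estimate: how many TSM instances would 50PB need, looking at
# database capacity alone? All figures below are assumptions for
# illustration, not measured values.

TOTAL_OBJECTS = 500_000_000       # assumed: 50PB at an average ~100MB per file
VERSIONS_PER_OBJECT = 3           # assumed number of retained versions
BYTES_PER_DB_ENTRY = 600          # rough DB cost per stored object version
DB_MAX_BYTES = 530 * 10**9        # approximate TSM DB ceiling (the limit hit above)
DB_SAFE_FILL = 0.8                # leave headroom; don't plan to run the DB full

db_bytes_needed = TOTAL_OBJECTS * VERSIONS_PER_OBJECT * BYTES_PER_DB_ENTRY
instances = -(-db_bytes_needed // int(DB_MAX_BYTES * DB_SAFE_FILL))  # ceiling division

print("Estimated DB space: %.1f TB" % (db_bytes_needed / 10**12))
print("Minimum TSM instances (DB-bound): %d" % instances)

With those assumptions you already need three instances before even looking at
data throughput, and a smaller average file size pushes the count up fast.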
Also, cartridge handling will be a problem, since storing petabytes of data
will require a lot of cartridges, creating a need for fast mounts / dismounts
and a pretty large number of tape drives, both for backing up / restoring data
and for keeping the internal TSM housekeeping (expiration, migration,
reclamation) running smoothly.
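To put some rough numbers on that (assuming LTO-4-class media at 800GB native,
about 120MB/s per drive, and a 30-day window for the initial full backup, all
of which are placeholders, not a recommendation):

# Rough cartridge and drive counts for 50PB on tape. Capacity and
# throughput are assumed LTO-4 native figures; adjust for the actual
# drive generation and achievable compression.

TOTAL_BYTES = 50 * 10**15         # 50PB
CART_CAPACITY = 800 * 10**9       # assumed LTO-4 native capacity
DRIVE_SPEED = 120 * 10**6         # assumed LTO-4 native throughput, bytes/s
BACKUP_WINDOW_S = 30 * 24 * 3600  # assumed 30 days for the initial full

cartridges = -(-TOTAL_BYTES // CART_CAPACITY)                # ceiling division
drives = -(-TOTAL_BYTES // (DRIVE_SPEED * BACKUP_WINDOW_S))

print("Cartridges (native, no compression): %d" % cartridges)   # 62,500
print("Drives streaming non-stop for 30 days: %d" % drives)     # ~161

That's 62,500 cartridges and 160-odd drives running flat out for a month, and
that's before reclamation, copy storage pools and restores compete for the
same drives.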
Another way of going about it is to look at enterprise VTL technology,
eliminating the need for cartridge handling and for a large number of tape
drives. As far as I know, though, there is no VTL that can store that amount
of data in a single system without deduplication.
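Deduplication changes that picture a lot, but only if the data dedupes well.
A quick sketch of how much raw disk a VTL would need at a few purely
hypothetical ratios:

# Raw VTL disk needed for 50PB at various hypothetical dedup ratios.
# Real ratios depend entirely on the data mix, so treat these as
# illustrations only.

TOTAL_BYTES = 50 * 10**15
for ratio in (1, 5, 10, 20):
    raw_pb = TOTAL_BYTES / ratio / 10**15
    print("%2d:1 dedup -> %.1f PB of disk" % (ratio, raw_pb))

Even at a generous 20:1 you're still looking at 2.5PB of disk, which is why
the question of what the data consists of matters so much.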
That kinda brings up another question: what are the customer's expectations /
regulatory requirements when it comes to disaster recovery?
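Restore time is usually where designs like this fall over, so it's worth
running the arithmetic early. A sketch with a placeholder aggregate restore
throughput (the 10GB/s figure is a pure assumption; substitute measured
numbers from the real environment):

# RTO sanity check: time to restore various slices of 50PB at an
# assumed aggregate restore throughput. The 10GB/s is a placeholder.

AGGREGATE_RESTORE_BPS = 10 * 10**9      # assumed 10GB/s across all drives

for restore_pb in (0.05, 1.0, 50.0):    # 50TB, 1PB, the full 50PB
    seconds = restore_pb * 10**15 / AGGREGATE_RESTORE_BPS
    print("%5.2f PB -> %.1f days" % (restore_pb, seconds / 86400))

Even at 10GB/s aggregate, a full 50PB restore takes nearly two months, so the
DR expectations need to be pinned down before the architecture is chosen.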
It would be easier to actually point out the issues involved if some of the
questions you already came up with had been answered ;)
Best Regards
Daniel Sparrman
-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:ADSM-L AT VM.MARIST DOT EDU] On Behalf Of Bob Talda
Sent: January 23, 2008 21:16
To: ADSM-L AT VM.MARIST DOT EDU
Subject: Seeking thoughts/experiences on backing up large amounts (say 50
Petabytes) of data
Folks:
Our group has been approached by a customer who asked if we could
backup/archive 50 petabytes of data. And yes, they are serious.
We've begun building questions for the customer, but as this is roughly 1000
times the amount of data we currently back up, we are on unfamiliar turf here.
At a high level, here are some of the questions we are asking:
1) Is the 50 Petabytes an initial, or envisioned data size? If envisioned, how
big is the initial data load and how fast will it grow?
2) What makes up the data: databases, video/audio files, other? (subtext: how
many objects are involved? What are the
opportunities to compress/deduplicate?)
3) How is the data distributed - over a number of systems or from a
supercluster?
4) Is the data static, or changing slowly or changing rapidly? (subtext: is it
a backup or archive scenario)
5) What are the security requirements?
6) What are the restore (aka RTO) requirements?
We are planning on approaching vendors to get some sense of the probable
data center requirements (cooling, power, footprint).
If anyone in the community has experience with managing petabytes of backup
data, we'd appreciate any feedback we could incorporate.
Thanks in advance!