This is something I considered carefully when designing our current backup
system. Veritas puts out a book called "The Resilient Enterprise" which
of course does a wonderful job of promoting their products, but also
takes a look at case studies where these issues were either ignored or
anticipated. The first few chapters cover the New York Board of Trade
and their offices in the WTC. They took the time to build a backup site
and when the towers crushed their datacenter and trading floor, they
were thankful they had prepared for the worst, and realized they should
have made it bigger. Another case study describes a company that bought
a 40-drive tape library to improve its restore times. I asked our Veritas rep for
more information about this company, but he never got back to me. I'd
like to know some information about the environment that could justify
an additional 40 tape drives. That's expensive!
We can successfully do high-speed backups on 4 drives, and I tried to
buy a configuration that could support us for at least 1 or 2 years
without having to buy more hardware. I wanted enough drive speed to
back up our whole datacenter in 4 hours, and then I added an extra
drive. That is how I came to 8 LTO drives, two media servers, and
four gigabit lines. We once did a restore of over a million files, and
because it was so many small files being written to an NT4 box, it took 13
hours. The more mission-critical systems in our environment have
fewer files, but those files are huge. Oracle DBF files are big, and stream
nicely both on and off tape. Our mission-critical requirements could
easily be met even if we had to recover a Solaris/Oracle database from a
multiplexed tape. If we had to recover the user data from our NAS
server, it would take all week. We do our best to buy and build fault
tolerant hardware for those systems.
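The drive-count sizing above can be sketched as a quick back-of-the-envelope calculation. The datacenter size and drive throughput below are hypothetical placeholders I picked for illustration, not our actual figures:

```python
import math

# Back-of-the-envelope drive sizing for a target full-backup window.
# The inputs below are hypothetical placeholders, not our real figures.
datacenter_gb = 1400        # assumed total data to protect (GB)
window_hours = 4            # target window for a full backup
drive_mb_per_sec = 15       # LTO native streaming rate (MB/sec)

# MB one drive can move in the window, assuming it streams the whole time
mb_per_drive = drive_mb_per_sec * window_hours * 3600
drives = math.ceil(datacenter_gb * 1024 / mb_per_drive)
print(drives + 1)           # plus one spare drive, per the approach above
```

With these made-up inputs the math happens to land on 8 drives, but the point is the method: size the drive count for the window, then add a spare.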
I routinely test our restores to validate our process and prove the
reliability of our tapes. I have worked hard to set restore
expectations to at least twice the backup times, and then I pad
my numbers a bit to give myself time to set up and debug. Our business would
suffer if we were down for a day while we recovered a database or
restored a filesystem. We occasionally look at the backup load and
backup window to see if we are meeting the needs of the users. It's
easy to get caught up in reducing the backup window, without considering
the restore times. Restore speeds are never mentioned in the sales
brochures for any backup product I've seen.
I think if our data grows beyond our ability to effectively use 8
drives, we'll buy another media server and 4 more drives, so restore
times don't become a problem.
Since LTO drives read so fast, I've found that multiplexing=8 gives me an
average restore stream of about 5 to 10 MB/sec, depending on the file
types. On the media server, I sometimes see tape read speeds of 30 to 70
MB/sec.
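That ratio is just the multiplex factor dividing the tape read rate: the drive reads all eight interleaved streams, but only one of them belongs to the restore. A quick sanity check using the numbers above:

```python
# Restoring one client from a tape multiplexed 8 ways: the drive must
# read past the other 7 interleaved streams, so a single restore stream
# gets roughly tape_read_speed / multiplex_factor.
multiplex = 8
for tape_read_mb in (30, 70):   # observed media-server tape read speeds
    per_stream = tape_read_mb / multiplex
    print(f"{tape_read_mb} MB/sec read -> ~{per_stream:.1f} MB/sec per stream")
```

That works out to roughly 4 to 9 MB/sec per stream, in the same ballpark as the 5 to 10 MB/sec I see in practice.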
-Jon
>These are excellent backup times, but in a disaster recovery test have you
>seen the restore times and do they meet the company business critical down
>time? Just wondering because we can create very fast backups and lose sight
>of the amount of restore time per system in the event of losing the data
>center and having to recreate at a disaster recovery site. (business
>critical data only)
>
>Sorry if I got off the original question, but this can be significant. Which
>is more important, faster backups or restores?
>Mark Eisenhardt
>
>-----Original Message-----
>From: veritas-bu-admin AT mailman.eng.auburn DOT edu
>[mailto:veritas-bu-admin AT mailman.eng.auburn DOT edu]On Behalf Of Jon
>Bousselot
>Sent: Friday, March 07, 2003 11:14 PM
>Cc: 'Karl.Rossing AT Federated DOT CA'; veritas-bu AT mailman.eng.auburn DOT
>edu
>Subject: Re: [Veritas-bu] 4 LTO Tape Drives
>
>I have two gig lines coming into each of my media servers, but one of
>them is the primary connector for backup traffic. All the non-gig
>clients come in on the other line, so with multiplexing at 8, I can keep
>3 drives streaming, and the fourth one averages about 9 MB/sec. This is
>the same on the other media server. As the quick clients drop off
>toward dawn, tape usage drops down to one or two drives, and keeps a
>good stream of 8 to 18 MB/sec. Balancing the number of data streams
>versus system load on our larger servers is more important than backup
>speed.
>Our small file clients are user data NT, and our large file clients are
>Oracle databases on Sun.
>
>Definitely split the drives at two per channel. The 64-bit 66MHz PCI
>bus on an E280 does a nice job supplying data to four drives at once.
>
>I did a test duplication between all drives at once, and saw average
>speeds of 15MB/sec on our four drives. Certainly between locally
>attached units, you can really move some data!
>
>Tuning the buffers made all the difference for streaming the 3 drives.
>
>-Jon
>
>
>
>>> LTO has a native speed of 15 MB/sec. I've never seen ours burst past
>>> 25 MB/sec. So your Gigabit ethernet should be able to drive all four
>>> drives at max speed.
>>>
>>> LVD SCSI is only good for 80MB/sec - I'd recommend two busses with two
>>> tape drives each.
>>>
>>> $.02
>>> -M
>>
>>