Re: [Networker] Backups of Large clients 1+ TBs
2008-04-03 13:07:09
On 01 Apr 08, at 1509, sunman1 wrote:
> What is a standard configuration with NetWorker to backup a system
> with a large amount of data? We have systems with 600-1000MB file
> systems, and MSSQL systems with 2-4TBs of data.
I'm backing up several systems, each with in excess of 20TB of data. I
also back up around 2TB of incrementals per day.
The main constraints are that (a) you can't go faster than your tape
drives, (b) you can't go faster than your networking, (c) you can't go
faster than your disks and (d) you can't go faster than the CPUs in the
client, the NSR server and the storage node.
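To make the "slowest stage wins" point concrete, here is a minimal sketch that picks the bottleneck from a set of per-stage throughput figures and estimates how long a baseline would take. The stage names and every MB/sec figure are illustrative assumptions, not measurements from any real site.

```python
def bottleneck(stages):
    """Return (name, rate) of the slowest stage; the pipeline cannot beat it."""
    name = min(stages, key=stages.get)
    return name, stages[name]


def hours_to_move(terabytes, mb_per_sec):
    """Hours needed to move `terabytes` (decimal TB) at a sustained MB/sec."""
    seconds = terabytes * 1_000_000 / mb_per_sec
    return seconds / 3600


# Hypothetical sustained rates in MB/sec for each stage of the data path.
stages = {
    "client_disk": 200.0,
    "client_cpu": 150.0,
    "network": 110.0,
    "tape_drive": 70.0,
}

name, rate = bottleneck(stages)
print(f"bottleneck: {name} at {rate} MB/sec")
print(f"20TB baseline: about {hours_to_move(20, rate):.0f} hours on one stream")
```

Plugging in your own measured figures shows quickly whether more drives, more network or more client CPU is the thing worth buying.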
The incrementals (i.e. level 1..9 or inc) all go to staging disk.
Because an incremental spends more of its time working out what to
back up than it does backing it up (depending on your ratio of changed
to non-changed files), incrementals behave terribly when sent straight
to tape: the data arrives too slowly to keep the drive streaming. I run
a separate `save' process on each file system, making sure that the
parallelism level is at or below the number of CPU threads the client
can cope with, and hoping that I have enough disk bandwidth that I
don't need to manage the parallelism there.
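The "one job per file system, capped at the client's CPU threads" idea can be sketched as below. This is only an illustration of the bounding logic: `backup_one` is a hypothetical callable standing in for whatever actually launches the backup of one file system, and the file system names are made up. In NetWorker itself the cap is normally set through the client parallelism setting rather than a script.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def run_per_filesystem(filesystems, backup_one, max_parallel=None):
    """Call backup_one(fs) for every file system, at most max_parallel at once.

    max_parallel defaults to the CPU thread count of the machine running
    this script (assumed here to be the backup client).
    """
    if max_parallel is None:
        max_parallel = os.cpu_count() or 1
    workers = max(1, min(max_parallel, len(filesystems)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps results in the same order as the input list.
        results = list(pool.map(backup_one, filesystems))
    return dict(zip(filesystems, results))


def pretend_backup(fs):
    """Hypothetical stand-in; a real version would start a backup of fs."""
    return f"backed up {fs}"


if __name__ == "__main__":
    done = run_per_filesystem(["/export/a", "/export/b", "/export/c"],
                              pretend_backup, max_parallel=2)
    for fs, status in done.items():
        print(fs, "->", status)
```

Threads are enough here because each worker would only be waiting on an external process; the CPU work happens in the backup program, not in Python.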
When I come to drop those savesets to tape, the effective parallelism
is one: i.e. I can run dozens of incremental streams into a disk
staging area without worrying about how slow or fast they are, and then
during production hours the following day I can take them to tape with
nsrstage or nsrclone at full speed with little resource consumption.
The baselines go straight to tape. I used to do funky things
involving changing the parallelism level to a lower level during
baselines but now I don't bother: the limiting factor for me is the
tapes (LTO3) and the networking (GigE over 20km), and I can pull
~70MB/sec out of the disk array and on to tape at pretty well any
parallelism level. The main benefit of parallelism=1 is that
recovering a single filesystem will be faster and involve fewer tapes,
but that's something you can sort out during a cloning phase if you
are concerned about it.
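A toy model of why parallelism=1 helps recovery: when several savesets are interleaved on one tape, restoring a single file system means passing over everyone else's chunks too. The sketch below assumes equal-sized chunks written round-robin and a drive that reads linearly from the first to the last chunk it needs; real drives and real multiplexing are more complicated, so treat the numbers as a rough illustration only. The saveset names are made up.

```python
from itertools import zip_longest


def write_multiplexed(savesets):
    """Interleave chunks from each saveset round-robin onto one 'tape' list."""
    names = list(savesets)
    tape = []
    for row in zip_longest(*(savesets[n] for n in names)):
        for name, chunk in zip(names, row):
            if chunk is not None:  # shorter savesets run out first
                tape.append((name, chunk))
    return tape


def write_sequential(savesets):
    """Write each saveset in full before starting the next (parallelism=1)."""
    return [(name, chunk) for name in savesets for chunk in savesets[name]]


def chunks_passed_to_recover(tape, name):
    """Chunks the drive passes over between the first and last chunk of `name`."""
    positions = [i for i, (n, _) in enumerate(tape) if n == name]
    return positions[-1] - positions[0] + 1


# Four hypothetical file systems of 100 chunks each.
savesets = {f"/export/fs{i}": list(range(100)) for i in range(4)}

multiplexed = write_multiplexed(savesets)
sequential = write_sequential(savesets)
target = "/export/fs0"
print("multiplexed:", chunks_passed_to_recover(multiplexed, target))
print("sequential: ", chunks_passed_to_recover(sequential, target))
```

In this model, recovering one of four interleaved savesets touches roughly four times as much tape as recovering it from a sequentially written copy, which is the cost a later cloning pass can remove.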
If you have one filesystem per set of tapes you can also parallelise
over multiple tape drives during recovery, but these days few disk
arrays will be able to write at the speed of multiple streams anyway.
Weaving together the contents of multiple arrays on one tape might be
a false economy, though.
I found some years ago that having the nsr server and the storage node
on distinct systems was a performance benefit, presumably because the
database updates were kept away from the task of throwing data to
tape. I suspect that's less true today.
ian