Networker

Re: [Networker] Backups of Large clients 1+ TBs

2008-04-03 13:28:38
Subject: Re: [Networker] Backups of Large clients 1+ TBs
From: Fazil Saiyed <Fazil.Saiyed AT ANIXTER DOT COM>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Thu, 3 Apr 2008 12:15:57 -0500
Hello,
While you are doing a lot of things right, if you get the chance and have 
the resources you may want to experiment with GigE jumbo frames at 
long distance, the SnapImage module, and a compression directive, to see 
whether you can squeeze any more throughput out of these long-distance 
pipelines.
Thanks



Ian G Batten <ian.batten AT UK.FUJITSU DOT COM> 
Sent by: EMC NetWorker discussion <NETWORKER AT LISTSERV.TEMPLE DOT EDU>
04/03/2008 11:54 AM
Please respond to
EMC NetWorker discussion <NETWORKER AT LISTSERV.TEMPLE DOT EDU>
Ian G Batten <ian.batten AT UK.FUJITSU DOT COM>


To
NETWORKER AT LISTSERV.TEMPLE DOT EDU
cc

Subject
Re: [Networker] Backups of Large clients 1+ TBs






On 01 Apr 08, at 1509, sunman1 wrote:
> What is a standard configuration with NetWorker to backup a system 
> with a large amount of data?  We have systems with 600-1000MB file 
> systems, and MSSQL systems with 2-4TBs of data.

I'm backing up several systems, each with in excess of 20TB of data.  I 
back up around 2TB of incrementals per day as well.

The main constraints are: (a) you can't go faster than your tape drives, 
(b) you can't go faster than your network, (c) you can't go faster than 
your disks, and (d) you can't go faster than the CPUs in the client, 
the NSR server and the storage node.
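A quick back-of-envelope check makes the point concrete (an illustrative calculation using the ~70MB/sec and 2TB figures from this thread, not anything from NetWorker itself): whichever of those four constraints gives the lowest sustained rate sets your backup window.

```shell
#!/bin/sh
# Rough backup-window arithmetic: how long does N TB take at M MB/sec?
# Illustrative numbers only: 2TB of data at a sustained 70MB/sec.
DATA_TB=2
RATE_MBS=70
# 1TB = 1048576 MB; integer arithmetic is close enough for an estimate.
SECONDS_NEEDED=$(( DATA_TB * 1048576 / RATE_MBS ))
HOURS=$(( SECONDS_NEEDED / 3600 ))
echo "${DATA_TB}TB at ${RATE_MBS}MB/sec ~= ${SECONDS_NEEDED}s (~${HOURS}h)"
```

At those rates a 2TB baseline takes roughly eight hours, so a multi-TB client simply cannot finish overnight on one stream to one drive.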

The incrementals (i.e. level 1..9 or incr) all go to staging disk. 
Because an incremental spends more of its time deciding what to 
back up than actually backing it up (depending on your ratio of changed 
to unchanged files), incrementals behave terribly when sent straight to 
tape.  I run a separate `save' process on each file system, making sure 
that the parallelism level is at or below the number of CPU threads the 
client can cope with, and hoping that I have enough disk bandwidth that 
I don't need to manage the parallelism there.
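A sketch of that per-filesystem layout is below. The server name, pool name, and file systems are invented for illustration, and the commands are echoed rather than executed so the shape is visible without a NetWorker installation; `save -s`, `-b`, and `-l incr` are standard NetWorker options, but check the flags against your own release.

```shell
#!/bin/sh
# One `save' process per file system, all writing to a disk staging pool.
# RUN=echo makes this a dry run; drop it on a real NetWorker client.
RUN=echo
NSR_SERVER=nsrserv.example.com   # hypothetical NSR server name
STAGE_POOL=IncrStage             # hypothetical disk staging pool
for fs in /export/home /var/mail /srv/data; do   # example file systems
    # Level incr backs up files changed since the previous save.
    $RUN save -s "$NSR_SERVER" -b "$STAGE_POOL" -l incr "$fs" &
done
# Parallelism here = number of concurrent saves; keep it at or below the
# client's CPU thread count, as described above.
wait
```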

When I come to drop those save sets to tape, the effective parallelism 
is one: I can run dozens of incremental streams into a disk staging 
area without worrying about how slow or fast they are, and then during 
production hours the following day I can take them to tape with 
nsrstage or nsrclone at full speed with little resource consumption.
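The next-day staging step might look something like this dry-run sketch. The pool names and save-set IDs are placeholders; in real use the IDs would come from an mminfo query, and you should check nsrstage's options against your release.

```shell
#!/bin/sh
# Move yesterday's incremental save sets from the disk pool to tape.
# Dry run: RUN=echo prints the commands instead of executing them.
RUN=echo
STAGE_POOL=IncrStage     # hypothetical disk staging pool
TAPE_POOL=IncrTape       # hypothetical tape pool
# In real use the save-set IDs come from mminfo, e.g.:
#   mminfo -q "pool=$STAGE_POOL" -r ssid
SSIDS="4100000001 4100000002"    # placeholder save-set IDs
for ssid in $SSIDS; do
    # -m migrates (copies to tape, then removes the disk original).
    # One stream at a time keeps the tape drive streaming at full speed.
    $RUN nsrstage -b "$TAPE_POOL" -m -S "$ssid"
done
```

Use nsrclone instead of nsrstage if you want to keep the disk copy as well as the tape copy.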

The baselines go straight to tape.  I used to do funky things like 
dropping the parallelism level during baselines, but now I don't 
bother: the limiting factors for me are the tapes (LTO3) and the 
network (GigE over 20km), and I can pull ~70MB/sec out of the disk 
array and onto tape at pretty much any parallelism level.  The main 
benefit of parallelism=1 is that recovering a single filesystem will be 
faster and involve fewer tapes, but that's something you can sort out 
during a cloning phase if you are concerned about it.

If you have one filesystem per set of tapes you can also parallelise 
over multiple tape drives during recovery, but these days few disk 
arrays will be able to write at the speed of multiple streams anyway. 
Weaving together the contents of multiple arrays on one tape might be 
a false economy, though.

I found some years ago that having the nsr server and the storage node 
on distinct systems was a performance benefit, presumably because the 
database updates were kept away from the task of throwing data to 
tape.  I suspect that's less true today.

ian

To sign off this list, send email to listserv AT listserv.temple DOT edu and 
type 
"signoff networker" in the body of the email. Please write to 
networker-request AT listserv.temple DOT edu if you have any problems with this 
list. You can access the archives at 
http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER


