Subject: Re: [Networker] Long NDMP Backups
From: Matthew Huff <mhuff AT OX DOT COM>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Wed, 27 Feb 2008 10:00:35 -0500
Oops,

 

Yes, 400GB. 

 

I'm not sure that attaching it directly for restores would be faster, and I'm 
not sure it's even possible, since during a DSA backup the data is streamed into 
the native Legato format so it can be parallelized with other backups and use a 
variable block size.

 

Here are the changes we made to improve TCP performance with DSA backups (they 
would probably help with any large gigabit Ethernet backup).

 

For Solaris (Linux has similar tweaks; a rough Linux equivalent is sketched further below):

 

There is a socket option for "window scaling", which is a big improvement 
with large pipes. From our testing, Legato 7.1/7.2 doesn't set the socket 
option (I haven't tested 7.3/7.4). However, you can force the OS to turn on 
window scaling for all sockets (and enable timestamps to prevent 
sequence-number wrap):

 

ndd -set /dev/tcp tcp_tstamp_always 1

ndd -set /dev/tcp tcp_wscale_always 1

 

The default TCP buffers in Solaris 8/9 are too small. Solaris 10 autotunes 
better, but from what I've found, they are still too low for large, fat pipes 
streaming one-way data the way backups do:

 

ndd -set /dev/tcp tcp_max_buf    4194304

ndd -set /dev/tcp tcp_cwnd_max   2097152

ndd -set /dev/tcp tcp_xmit_hiwat 1048576

ndd -set /dev/tcp tcp_recv_hiwat 1048576

 

Setting the buffers above 65536 also triggers window scaling, so the sliding 
window can grow beyond 64 KB.
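
To confirm the settings took effect, you can read each value back (a quick 
sketch; ndd -get simply prints the current value). Note that ndd settings do 
not survive a reboot, so they are usually re-applied from a boot-time script.

ndd -get /dev/tcp tcp_wscale_always
ndd -get /dev/tcp tcp_tstamp_always
ndd -get /dev/tcp tcp_max_buf
ndd -get /dev/tcp tcp_xmit_hiwat
ndd -get /dev/tcp tcp_recv_hiwat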

 

Take a look at http://www.faqs.org/rfcs/rfc1323.html

 

Most of the tuning is for LFNs (Long Fat Networks) with considerable delay. 
However, even on a LAN with gigabit locally (we have a 10 Gb Ethernet core 
with 4 x Cisco 6509 with Sup720 engines and gig-e to the servers), the tuning 
can help considerably, especially in reducing the CPU interrupts on the servers.
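
For Linux, the rough equivalents are the sysctl settings below. This is only a 
sketch: the sysctl names are the standard Linux ones, but the exact buffer 
values are assumptions that should be tuned per host.

sysctl -w net.ipv4.tcp_window_scaling=1
sysctl -w net.ipv4.tcp_timestamps=1
sysctl -w net.core.rmem_max=4194304
sysctl -w net.core.wmem_max=4194304
sysctl -w net.ipv4.tcp_rmem="4096 87380 4194304"
sysctl -w net.ipv4.tcp_wmem="4096 65536 4194304"

To make them persistent, the same values would go in /etc/sysctl.conf.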

----
Matthew Huff       | One Manhattanville Rd
OTA Management LLC | Purchase, NY 10577
www.otaotr.com     | Phone: 914-460-4039
aim: matthewbhuff  | Fax:   914-460-4139

From: Fazil.Saiyed AT anixter DOT com [mailto:Fazil.Saiyed AT anixter DOT com] 
Sent: Wednesday, February 27, 2008 9:42 AM
To: EMC NetWorker discussion; Matthew Huff
Cc: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Subject: Re: [Networker] Long NDMP Backups

 


Hello, 
400 MB would be a greatly limiting factor; did you mean 400 GB? I agree with 
limiting the savesets to match the filer's performance and tuning gradually. 
I would like the info on any TCP tuning that you have done. Also, since DSA 
recoveries are slow, could you not use a directly attached tape drive for any 
restores (at least during DR)? 
Thanks 



From: Matthew Huff <mhuff AT OX DOT COM>
Sent by: EMC NetWorker discussion <NETWORKER AT LISTSERV.TEMPLE DOT EDU>
Date: 02/26/2008 03:46 PM
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Subject: Re: [Networker] Long NDMP Backups
Please respond to: EMC NetWorker discussion <NETWORKER AT LISTSERV.TEMPLE DOT EDU>; Matthew Huff <mhuff AT OX DOT COM>

I'm glad you are sure, because we are doing it right now and it's working well. 
Of course, we worked with NetApp engineering and this was their suggestion. 
They strongly suggest not having any saveset over 400MB, especially since DAR 
restores can be extremely slow with very large savesets.

Obviously every filer is different: 10 may be too much and 5 may be just right. 
However, doing everything in serial with just one saveset is going to be a major 
bottleneck. Tuning your network by using jumbo frames and making sure that the 
TCP sliding window is tuned is very important.
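
As a sketch of the jumbo-frame side of that tuning on a Linux host (eth0 is 
just an example interface, filer01 is a placeholder hostname, and the 
9000-byte MTU is an assumption that has to match the switches and the filer 
end as well):

ip link set dev eth0 mtu 9000
ping -M do -s 8972 filer01    # non-fragmenting, jumbo-sized ping to verify end to end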

BTW, I've been doing NDMP backups with NetApp since before Legato supported it, 
so I've got a bit of experience. We are currently backing up around 4TB, using 
about 6 savesets per filer. The full backups are taking around 6 hours, except 
for one saveset that is taking around 11 hours (we are currently migrating data 
around to break up that saveset). Once the migration is done, we should be back 
to around 6 hours. We were also forced to kick the backup off earlier than normal 
on a Friday (due to major power work being done over the weekend), and even with 
all the backups running, the system never had any issues, even during the trading 
day.



----
Matthew Huff       | One Manhattanville Rd
OTA Management LLC | Purchase, NY 10577
www.otaotr.com     | Phone: 914-460-4039
aim: matthewbhuff  | Fax:   914-460-4139

-----Original Message-----
From: Yaron Zabary [mailto:yaron AT aristo.tau.ac DOT il] 
Sent: Tuesday, February 26, 2008 4:10 PM
To: EMC NetWorker discussion; Matthew Huff
Subject: Re: [Networker] Long NDMP Backups

Matthew Huff wrote:
> The main advantage is that it runs in parallel rather than in serial. For 
> example, let's say your /vol/vol0 was 1TB and had 10 qtrees, each with 100GB 
> in it. You could increase the client parallelism in Legato to 10, and when 
> you started the backup with a saveset of:
>  
> /vol/vol0/dir_a
> /vol/vol0/dir_b
> /vol/vol0/dir_c
> /vol/vol0/dir_d
> /vol/vol0/dir_e
> /vol/vol0/dir_f
> /vol/vol0/dir_g
> /vol/vol0/dir_h
> /vol/vol0/dir_i
> /vol/vol0/dir_j
>  
> You would get 10 parallel backups each taking around 1/10 of what the volume 
> backup would take. If you had the I/O and tape drive capacity, you would be 
> reducing your backup time by 90%. Of course, that's an ideal situation.
> 

  I am quite sure that this is a great way of killing your filer. Our 
3050 can push at LTO-3 speed (~70MB/s) while consuming many CPU cycles 
(our CPU graphs are broken, so I cannot provide real numbers, but 20% 
seems about right). Considering this, running too many NDMP backups at 
once will make the filer unresponsive (assuming it does any other useful 
work, this might be unacceptable). It would not even get things done any 
faster, because if the filer is at 100% CPU utilization it will become 
your bottleneck (it could even give you worse performance, as you will 
most likely have contention on your aggregate, volume, or RAID group).

-- 

-- Yaron.

To sign off this list, send email to listserv AT listserv.temple DOT edu and 
type "signoff networker" in the body of the email. Please write to 
networker-request AT listserv.temple DOT edu if you have any problems with this 
list. You can access the archives at 
http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER



