Re: [Networker] NetWorker + DD Boost vs. NetWorker & Avamar

2012-11-07 16:19:14
Subject: Re: [Networker] NetWorker + DD Boost vs. NetWorker & Avamar
From: Tim Mooney <Tim.Mooney AT NDSU DOT EDU>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Wed, 7 Nov 2012 15:17:40 -0600
In regard to: Re: [Networker] NetWorker + DD Boost vs. NetWorker & Avamar,...:

Just to answer a small part of one of your questions: one difference
between NetWorker 8.x + DD Boost and Avamar is that DD Boost is still
very chatty compared to Avamar, which means that Avamar works a lot
better (and I believe is certified) over slowish/high-latency WAN
links, whereas DD Boost isn't certified and pretty much doesn't work
in this circumstance.

I hadn't considered that potential difference.

As far as a definition of slowish and high latency goes, you'd need to
talk to your EMC rep to find out what sort of links they will qualify
DD Boost over. As an example, we have a 50 Mb/s link to a site that has
a 28 ms response time to pings (the site is about 1800 km away), so a
reasonably fat pipe, but with high latency. A backup to NetWorker with
DD Boost (this was NetWorker 7.6.x to a DD running DDOS 5.0.x) ran at a
crawl, whereas Avamar backups work fine. It's also possible that
performance over this type of link has improved with NetWorker 8 and
DDOS 5.1/5.2.
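
To put rough numbers on why a chatty protocol suffers on that link, here is
a minimal sketch in Python. It assumes a simplified model in which the
sender waits for a reply after every fixed amount of data; the window
sizes are hypothetical illustrations, not measured DD Boost or Avamar
behaviour.

```python
# Simplified model: if at most `window_bytes` can be outstanding per round
# trip, throughput is capped at window_bytes / RTT regardless of link speed.
# The window sizes used below are made-up examples, not product figures.

LINK_BITS_PER_SEC = 50_000_000   # the 50 Mb/s link from the example
RTT_SEC = 0.028                  # the 28 ms ping time from the example


def bandwidth_delay_product_bytes(link_bps: float, rtt_sec: float) -> float:
    """Bytes that must be in flight to keep the link full."""
    return link_bps * rtt_sec / 8


def effective_throughput_bps(window_bytes: float, link_bps: float,
                             rtt_sec: float) -> float:
    """Bits/s achievable when only `window_bytes` move per round trip."""
    return min(link_bps, window_bytes * 8 / rtt_sec)


if __name__ == "__main__":
    bdp = bandwidth_delay_product_bytes(LINK_BITS_PER_SEC, RTT_SEC)
    print(f"bandwidth-delay product: {bdp:,.0f} bytes")
    for window in (8 * 1024, 64 * 1024, 256 * 1024):  # hypothetical sizes
        bps = effective_throughput_bps(window, LINK_BITS_PER_SEC, RTT_SEC)
        print(f"{window // 1024:>4} KiB per round trip -> {bps / 1e6:5.1f} Mb/s")
```

Under this model the link needs roughly 175,000 bytes in flight to stay
full, so a protocol that exchanges only a few KiB per round trip would use
a small fraction of the 50 Mb/s.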

We mostly have 10 Gigabit between our main campus and satellite campus
locations, but since we're a land grant institution we do have things like
regional agricultural extension offices and other outlying state agencies
that we may someday want to back up, so your point about how each
product performs over slower links is very helpful.

As for recovery times, it's going to depend on your situation. If you
are performing a single recovery, or a small number of concurrent
recoveries (this number is going to vary depending on the model of
dedupe appliance that you are considering), then I'd say that
performance would be similar to traditional disk-based backup solutions,
and faster than tape-based solutions. But if you are talking about a
larger number of concurrent recoveries, then a number of variables in
your environment are going to contribute to the speed of recovery:
backup server / storage node specs and load, backend network
infrastructure, and target client specs and load.

I was mainly thinking about a complete restore of a big data volume after
some type of catastrophic issue, where we have to go back to a point in
time before some type of corruption event happened.  I'm much less worried
about one-off and small batch file recoveries, and wasn't really
considering multiple simultaneous complete recoveries (though that's also
a consideration, obviously; it's just not specifically what I would aim our
RTO at).

Thanks for the response!  It's been very helpful.

Tim

-----Original Message-----
From: EMC NetWorker discussion [mailto:NETWORKER AT LISTSERV.TEMPLE DOT EDU] On Behalf Of Tim Mooney
Sent: Tuesday, 6 November 2012 8:11 AM
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Subject: [Networker] NetWorker + DD Boost vs. NetWorker & Avamar

All-

We're currently in the throes of re-evaluating the antiquated way we do backups 
(everything is still straight to tape in our environment).

I've been doing quite a bit of reading about Data Domain + Boost and Avamar, 
but even EMC's most recent documentation doesn't seem to be very clear.  There 
was some good info (as usual) on Preston's nsrd.info blog, but I'm still 
confused.

As I understand things:

- prior to NetWorker 7.6.1, the DD integration was minimal, and all
  dedupe happened on the appliance (target), so you still always
  transferred all data over the network.

- at NetWorker 7.6.1 and later, DD + Boost integration allows dedupe to
  happen at the storage node, so traffic between the node and the DD
  box is reduced, but it's still not source dedupe.

- I've since read reports that at NetWorker 8.x, in addition to this
  "client direct" bit that I'm not quite clear on, the NetWorker
  *client* software can actually do the dedupe, so you have true
  source-side dedupe with full NetWorker integration.

Is all of that correct?
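
For readers following along, the difference between the cases in the list
above comes down to where chunk fingerprinting happens and therefore how
much data crosses the network. The following is a toy Python sketch of that
idea only. It assumes fixed-size chunks and SHA-256 fingerprints, uses
hypothetical names throughout (`DedupeTarget`, `backup_target_side`,
`backup_source_side`), ignores the small cost of sending fingerprints, and
is not how NetWorker, DD Boost, or Avamar are actually implemented.

```python
import hashlib

CHUNK_SIZE = 4096  # fixed-size chunks for simplicity (an assumption)


class DedupeTarget:
    """Stand-in for a dedupe appliance: keeps each unique chunk once."""

    def __init__(self):
        self.chunks = {}         # fingerprint -> chunk bytes
        self.bytes_received = 0  # chunk payload that crossed the "network"

    def missing(self, fingerprints):
        """Return the fingerprints the target does not hold yet."""
        return [fp for fp in fingerprints if fp not in self.chunks]

    def store(self, fp, chunk):
        self.bytes_received += len(chunk)
        self.chunks[fp] = chunk


def split(data):
    return [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]


def fingerprint(chunk):
    return hashlib.sha256(chunk).hexdigest()


def backup_target_side(data, target):
    """Every chunk is sent; the target discards duplicates on arrival."""
    for chunk in split(data):
        target.bytes_received += len(chunk)
        target.chunks.setdefault(fingerprint(chunk), chunk)


def backup_source_side(data, target):
    """The sender fingerprints first and ships only chunks the target lacks.

    Run on a storage node this saves node-to-appliance traffic; run on the
    client itself it saves client traffic as well.
    """
    by_fp = {fingerprint(c): c for c in split(data)}
    for fp in target.missing(list(by_fp)):
        target.store(fp, by_fp[fp])


if __name__ == "__main__":
    first = b"".join(bytes([i]) * CHUNK_SIZE for i in range(100))
    second = first[:-CHUNK_SIZE] + b"\xff" * CHUNK_SIZE  # one chunk changed
    for name, backup in (("target-side", backup_target_side),
                         ("source-side", backup_source_side)):
        target = DedupeTarget()
        backup(first, target)
        after_first = target.bytes_received
        backup(second, target)
        print(name, "first:", after_first,
              "second:", target.bytes_received - after_first)
```

In this toy run both approaches store the same 101 unique chunks, but the
second backup moves the whole data set in the target-side case and only the
one changed chunk in the source-side case.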

If it is, and a NetWorker 8.x + DD Boost environment can do true source dedupe, 
where does Avamar fit?  Is that still better for VMware source dedupe?  Is the 
NetWorker client dedupe not variable block?  Does only Avamar do global source 
dedupe, and NetWorker+DD Boost is perhaps only per-client dedupe?

If anyone can point me to some good publicly available or Powerlink 
documentation that explains this, it would be much appreciated.

Also, for those of you that are using source dedupe now, I've read reports 
that although the backup window will shrink dramatically after the first full, 
the restore times may actually get worse, as data "rehydration" takes longer 
than recovering from a traditional full.  Is that just outdated information?

Either way, source dedupe seems to be a fantastic way to shrink the backup 
window, but what strategies are people currently using to also shrink the 
recovery window?  We geographically mirror (at the block level, via Linux 
software raid) our largest SAN volumes on many of our servers, but that doesn't 
protect from file removal or things like filesystem corruption or application 
induced data corruption.  As part of the complete overhaul of how we're doing 
backups, we would like to be able to confidently establish recovery time 
objectives for our big volumes, and I would love to hear how other sites are 
meeting their RTOs on 2+ TB volumes with 5+ million files.
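
As a starting point for that kind of RTO discussion, a back-of-the-envelope
estimate can be sketched in Python. This assumes a deliberately crude model:
restore time is the data transfer time plus a fixed per-file overhead. The
throughput and per-file figures below are placeholders to be replaced with
measurements from a real test restore, not numbers from any product.

```python
def estimate_restore_hours(total_bytes, file_count,
                           throughput_bytes_per_sec, per_file_overhead_sec):
    """Crude estimate: streaming time plus a fixed cost for every file."""
    data_time = total_bytes / throughput_bytes_per_sec
    file_time = file_count * per_file_overhead_sec
    return (data_time + file_time) / 3600


if __name__ == "__main__":
    hours = estimate_restore_hours(
        total_bytes=2e12,               # 2 TB volume
        file_count=5_000_000,           # 5 million files
        throughput_bytes_per_sec=100e6, # placeholder: 100 MB/s sustained
        per_file_overhead_sec=0.002,    # placeholder: 2 ms per file
    )
    print(f"estimated restore time: {hours:.1f} hours")
```

With these placeholder inputs the per-file overhead accounts for a third of
the total, which is why file count matters as much as volume size when
setting an RTO for volumes like these.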

Thanks,

Tim


--
Tim Mooney                                             Tim.Mooney AT ndsu DOT edu
Enterprise Computing & Infrastructure                  701-231-1076 (Voice)
Room 242-J6, IACC Building                             701-231-8541 (Fax)
North Dakota State University, Fargo, ND 58105-5164