Networker

Re: [Networker] recommendation to backup 1.5 million small files?

2008-01-24 15:16:25
Subject: Re: [Networker] recommendation to backup 1.5 million small files?
From: Preston de Guise <enterprise.backup AT GMAIL DOT COM>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Fri, 25 Jan 2008 07:12:13 +1100
On 25/1/08 6:43 AM, "Rob Sterba" <Sterba_Robert AT EMC DOT COM> wrote:

> Yes, SnapImage would be the way to go.  It basically sets up an NDMP
> server on your client allowing you to do block level backups of the
> filesystems.  
> 
> There are a couple caveats, but every time I've set it up we've seen
> quite dramatic improvements.

I believe SnapImage has improved somewhat from when I last looked at it. For
example, the last time I looked at it, it couldn't be used to perform a full
filesystem recovery (i.e., block level recovery) that spanned backup media,
and it couldn't do parallelism.

However, one area where SnapImage will still suffer, perhaps through no
real fault of its own, is recovery.

I had a customer who was taking 15 hours to do backups of 2 x 400GB highly
dense filesystems. Using SnapImage we were able to get the backup down to
less than 3 hours. This was good.

Recovery, on the other hand, was a nightmare and led to SnapImage not being
used.

I'll qualify:

- Complete block level recovery of the backup was amazingly swift - the same
time as the backup (so long as the backup didn't span multiple tapes)
- File level recovery of a single file was quite reasonable.
- Multiple files being recovered sucked for large values of 'suck'.

However, the cause of this was really that the filesystem was reasonably
fragmented. It was an NTFS volume that had been in operation for a few
years, and at most it got a defrag whenever it was grown in size, but
usually not even then.

Most filesystems that are in use for longer periods of time end up with some
level of fragmentation and this is what causes block level backup agents
performing file level recovery to have problems.

The reason for this is that to accomplish a file level recovery, the block
level backup agent has to read the blocks associated with a file into
cache, and then rebuild the file from the cache. If the source filesystem
was somewhat fragmented, this increases the number of reads to be performed
- which is Not A Good Thing when reading from tape. This meant for instance
that while single files came back OK, and a 400 GB filesystem could be
recovered in around 3 hours using block level recovery, a _40 GB_
subdirectory with a modest number of files (maybe in the order of 10,000
throughout a reasonably normal directory structure) took _12_ hours to
recover.
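
To make the fragmentation effect concrete, here is a minimal sketch (my own
illustration, not how SnapImage is actually implemented). It assumes a file
can be described as a list of block numbers in the order they must be read,
and it counts the contiguous runs (extents) in that list. The function name
count_extents and the sample block numbers are hypothetical. Each extra
extent is roughly one more separate read, which on tape can mean another
reposition.

```python
def count_extents(blocks):
    """Count runs of consecutive block numbers in a file's block list.

    Assumption: 'blocks' lists the block numbers in file order. A new
    extent starts whenever a block does not directly follow the previous one.
    """
    if not blocks:
        return 0
    extents = 1
    for prev, cur in zip(blocks, blocks[1:]):
        if cur != prev + 1:
            extents += 1
    return extents


# Hypothetical 8-block files: one laid out contiguously, one fragmented.
contiguous = list(range(100, 108))
fragmented = [100, 101, 340, 341, 17, 902, 903, 55]

print(count_extents(contiguous))  # 1 extent: a single sequential read
print(count_extents(fragmented))  # 5 extents: five separate reads
```

Multiply that difference across some 10,000 files and the gap between a
full block level recovery and a file level recovery becomes easier to
picture.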

This, I believe, is a major caveat, and if you are going to be in any
position where recovery time is important and it's not possible to do a
complete block level recovery (either back to the original, or relocated),
then you perhaps need to consider alternate backup methods - or ways of
keeping the source filesystem highly defragmented.
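
For a rough sense of scale, the figures quoted above can be turned into
effective recovery rates. This is simple arithmetic on the approximate
numbers from that one customer (400 GB in around 3 hours, 40 GB in 12
hours), not a benchmark, and the function name is just for illustration.

```python
def rate_gb_per_hour(size_gb, hours):
    """Effective throughput in GB per hour."""
    return size_gb / hours


# Approximate figures from the customer example in this message.
full_block_level = rate_gb_per_hour(400, 3)   # whole filesystem, block level
file_level = rate_gb_per_hour(40, 12)         # 40 GB subdirectory, file level

slowdown = full_block_level / file_level

print(f"block level recovery: {full_block_level:.1f} GB/hour")  # 133.3
print(f"file level recovery:  {file_level:.1f} GB/hour")        # 3.3
print(f"slowdown factor:      {slowdown:.0f}x")                 # 40x
```

In other words, on that filesystem the file level recovery ran at roughly
one fortieth of the rate of the full block level recovery.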

(If, however, SnapImage now supports backing up to disk, and you can back up
to disk first, this may mitigate the above scenario.)

Cheers,

Preston de Guise.
  
> 
> -----Original Message-----
> From: EMC NetWorker discussion [mailto:NETWORKER AT LISTSERV.TEMPLE DOT EDU]
> On Behalf Of Wiley, Craig
> Sent: Thursday, January 24, 2008 12:22 PM
<snip> 
> This is funny. We have been working with this exact problem and my
> findings point me to Networker SnapImage Module. I was about to send an
> email asking for some more info on SnapImage.
> 
> I know it requires NDMP and I am trying to figure out if it will work with
> our infrastructure.
> 
> I need to back up a LUN from our SUN 9985 array that is attached to a
> Windows Server 2003 client.
<snip> 
> -----Original Message-----
> From: EMC NetWorker discussion [mailto:NETWORKER AT LISTSERV.TEMPLE DOT EDU]
> On Behalf Of mark wragge
> Sent: Thursday, January 24, 2008 1:48 PM
> To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
> Subject: [Networker] recommendation to backup 1.5 million small files?
> 
> Hi, what options does networker have to backup a server with lots of
> small files. Each night the backup will be 1.5 million files on two
> drives on a win2003 server. I know that with netbackup i could look at
> flashsnap backup. Is there a competing module that works with networker?
>    
<snip>
-- 
http://www.anywebdb.com


"Enterprise Systems Backup and Recovery: A Corporate Insurance Policy",
August 2008:

http://www.crcpress.com/shopping_cart/products/product_detail.asp?sku=AU6396&isbn=9781420076394&parent_id=&pc=
