Subject: Re: [Networker] saveset sizes?
From: Preston de Guise <enterprise.backup AT GMAIL DOT COM>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Wed, 13 Aug 2008 18:05:56 +1000
Hi Jeff,

We are currently backing up a Windows file server with 5 drives; most of these are over 1TB, 2 are 3TB, and 1 is 7TB. Obviously these take a long time to back up.

So I have some questions:

1. Is there a maximum recommended backup size?

I'm not aware of the limit for a saveset size in NetWorker. I imagine it would be in the petabytes at least.

2. As I have 5 tape drives, is there a way to allow one saveset (f:\ for example) to stream to multiple tapes?

Not directly. We're still waiting for multistreaming an individual saveset to be perfected. (It's been toyed with at least once to my knowledge, but there were recovery issues so it never became available.)

3. As we don't control the filesystem layout, does anyone have a cool example of how to split up the drive into separate directories (F:\dir1, F:\dir2, etc.) and then run a catchall F:\ with skip statements for F:\dir1, F:\dir2, etc.?

In the past when I've approached this, I've considered using heuristics where I first work out the size of various directories, but the time taken to work out such sizes negates a lot of the effort. I also don't think calculating previous backup sizes is all that viable an option, unless anyone can point out a reliable mechanism to have NetWorker report the size of a subset of a saveset; I'm not aware of any.

Instead when I've done this, I've relied on auto-building client definitions at "good" directory points.

For instance, when I had a customer that had very, very dense fileserver filesystems (e.g., 400GB to start with, but easily 40,000,000+ files at that size), the filesystems were structured along the lines of:

X:\Department\Dept A
X:\Department\Dept B
X:\Department\Dept C
etc
X:\HomeDirs\usera
X:\HomeDirs\userb
X:\HomeDirs\userc
etc
X:\otherdir1
X:\otherdir2
X:\otherdir3

In this scenario, I had 1 client definition that backed up those minor, "otherdir" directories.

For the departmental and user directories, I had a custom backup command that constructed a directory listing of each, and populated client instances set for higher levels of parallelism (8) with a list of the individual "master" directories - i.e., you'd get client instances with savesets of say:

X:\Department\Dept A
X:\Department\Dept B
X:\Department\Dept C
etc

These were auto-populated each time the backup was run, so there was no risk of missing new directories.
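As a rough sketch of that auto-population step (not the script I actually used - the paths, the cap, and the function names here are purely illustrative), the core of it is just enumerating the "master" directories and chunking them into per-client-instance saveset lists:

```python
import os

# Illustrative cap on savesets per client instance; see the note below
# about NetWorker's limits on the client saveset field.
MAX_SAVESETS_PER_CLIENT = 250

def build_saveset_lists(root, max_per_client=MAX_SAVESETS_PER_CLIENT):
    """Return a list of saveset lists, one list per client instance.

    Enumerates the immediate subdirectories of 'root' (e.g.
    X:\\Department) so new directories are picked up automatically
    each time the backup runs.
    """
    entries = sorted(
        os.path.join(root, name)
        for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )
    # Chunk the directory list so no client definition exceeds the cap.
    return [entries[i:i + max_per_client]
            for i in range(0, len(entries), max_per_client)]
```

Running something like this against each top-level directory just before the group starts gives you fresh saveset lists every night, which is what removes the risk of missing new directories.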

One thing to note in this strategy: NetWorker has limits on the number of savesets, or the size of the client saveset field. To be on the safe side, I usually limited the number of savesets for a client to around 250, based on the relatively flat initial directory structure outlined above. If you do the break-up further down in the directory structure, your mileage will vary. Obviously that meant having (potentially) multiple client definitions, and a modicum of intelligence to populate/refresh them. A good understanding of nsradmin is essential in this process.
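To make the nsradmin part concrete, here's a hedged sketch of generating a scripted nsradmin input file that refreshes the saveset list on an existing client resource. The client name and paths are hypothetical, and you'd want to check the exact attribute syntax against your NetWorker version before feeding the result to "nsradmin -i file":

```python
def nsradmin_update_script(client_name, savesets):
    """Build an nsradmin input script that replaces the 'save set'
    attribute of an existing NSR client resource."""
    quoted = ", ".join('"%s"' % s for s in savesets)
    lines = [
        # Select the client resource to operate on.
        ". type: NSR client; name: %s" % client_name,
        # Replace its saveset list with the freshly generated one.
        "update save set: %s" % quoted,
    ]
    return "\n".join(lines) + "\n"
```

Write the returned text to a file and run it with "nsradmin -i", once per client instance, as part of the pre-backup refresh.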

Obviously the downside of this is that your backups are written with much higher levels of multiplexing; however, if you're having problems streaming media due to density of filesystems, this can be a solution when block level backups can't be used. In the case of the customer where I did this, block level backup for NetWorker was still very immature (e.g., it couldn't do a complete filesystem recovery across a tape boundary, though that's since been fixed), and because these were dense and highly active fileserver filesystems, they were also quite fragmented. Doing file level recoveries from block level backups using cache rebuilds and tape scans was prohibitively slow, and the customer used a series of array level replication options, so it was decided to go with massive multiplexing to tape at higher streaming speeds - i.e., using the option above.

4. Does anyone have any other great ideas I should be thinking of?

One very, very important piece of advice.

DON'T USE SKIP for this style of backup; use the null ASM instead.

The reason for this is very important. Skip will not only skip a file/directory during backup, it will also reflect that in the index. Null, on the other hand, does not.

What that means is that if you skip directories one night, you'll need to change your browse time in order to find them for recovery. If you use null to avoid backing up directories one night, you'll still see all the directories that have ever been backed up. This makes recoveries a lot easier: no guessing about which savesets were backed up on which days, and, more importantly, being able to run one recovery rather than multiple recoveries.
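For reference, a minimal directive fragment using the null ASM might look like the following (the paths are hypothetical, and you should check the directive syntax against your NetWorker version):

```
<< "F:\" >>
null: dir1 dir2
```

The skip form to avoid here would read "skip: dir1 dir2" under the same << "F:\" >> header; it's syntactically almost identical, but with the index side-effect described above.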

Good luck!

Cheers,

Preston.

--
Preston de Guise


"Enterprise Systems Backup and Recovery: A Corporate Insurance Policy", due out September 17 2008:

http://www.crcpress.com/shopping_cart/products/product_detail.asp?sku=AU6396&isbn=9781420076394&parent_id=&pc=

http://www.enterprisesystemsbackup.com

To sign off this list, send email to listserv AT listserv.temple DOT edu and type 
"signoff networker" in the body of the email. Please write to networker-request 
AT listserv.temple DOT edu if you have any problems with this list. You can access the 
archives at http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER
