Hi Jeff,
We are currently backing up a Windows file server with 5 drives. Most
of these are over 1TB, 2 are 3TB, and 1 is 7TB; obviously these take a
long time to back up.
So I have some questions:
1. Is there a maximum recommended backup size?
I'm not aware of a limit on saveset size in NetWorker. I imagine it
would be in the petabytes at least.
2. As I have 5 tape drives, is there a way to allow one saveset (f:\,
for example) to stream to multiple tapes?
Not directly. We're still waiting for multistreaming an individual
saveset to be perfected. (It's been toyed with at least once to my
knowledge, but there were recovery issues so it never became available.)
3. As we don't control the filesystem layout, does anyone have a good
example of how to split up the drive into separate directories
(F:\dir1, F:\dir2, etc.) and then run a catchall F:\ with skip
statements for F:\dir1, F:\dir2, etc.?
In the past when I've approached this, I've considered using
heuristics where I first work out the size of the various directories,
but the time taken to calculate those sizes negates a lot of the
effort. Nor do I think calculating previous backup sizes is much of an
option, unless anyone can point out a reliable mechanism to have
NetWorker report the size of a subset of a saveset; I'm not aware of
any mechanism to do so.
Instead when I've done this, I've relied on auto-building client
definitions at "good" directory points.
For instance, when I had a customer that had very, very dense
fileserver filesystems (e.g., 400GB to start with, but easily
40,000,000+ files at that size), the filesystems were structured along
the lines of:
X:\Department\Dept A
X:\Department\Dept B
X:\Department\Dept C
etc
X:\HomeDirs\usera
X:\HomeDirs\userb
X:\HomeDirs\userc
etc
X:\otherdir1
X:\otherdir2
X:\otherdir3
In this scenario, I had one client definition that backed up those
minor "otherdir" directories.
For the departmental and user directories, I had a custom backup
command that constructed a directory listing of each, and populated
client instances set for higher levels of parallelism (8) with a list
of the individual "master" directories - i.e., you'd get client
instances with savesets of say:
X:\Department\Dept A
X:\Department\Dept B
X:\Department\Dept C
etc
These were auto-populated each time the backup was run, so there was
no risk of missing new directories.
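As a rough illustration of that auto-population step (a sketch only, not my actual script - the function name and the chunking parameter here are mine for this example; the 250 figure comes from the saveset limit discussed below), the enumerate-and-chunk logic might look like:

```python
import os

def build_saveset_lists(root, max_per_client=250):
    """Enumerate the immediate subdirectories of root and chunk them
    into lists, one list per client instance, so that no single
    client definition carries more than max_per_client savesets."""
    dirs = sorted(
        os.path.join(root, name)
        for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )
    # One chunk per client instance; each chunk becomes that
    # instance's saveset list.
    return [dirs[i:i + max_per_client]
            for i in range(0, len(dirs), max_per_client)]
```

Run at the start of each backup against X:\Department and X:\HomeDirs, newly created department or user directories are picked up automatically.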
One thing to note in this strategy: NetWorker has limits on the
number of savesets, or more precisely on the size of the client
saveset field. To be on the safe side, I usually limited the number
of savesets per client to around 250, based on the relatively flat
initial directory structure outlined above. If you do the break-up
further down in the directory tree, your mileage will vary.
Obviously that meant having (potentially) multiple client definitions
and a modicum of intelligence to populate/refresh them. A good
understanding of nsradmin is close to essential in this process.
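To give a feel for the refresh step (the client name and savesets here are hypothetical, and you should verify the attribute names against your NetWorker version with a print command first), an nsradmin input script to rewrite a client instance's saveset list might look something like:

```
. type: NSR client; name: fileserver01
update save set: "X:\\Department\\Dept A", "X:\\Department\\Dept B", "X:\\Department\\Dept C"
```

fed in non-interactively via something like nsradmin -i refresh_savesets.txt from the auto-population script.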
Obviously the downside of this is that your backups are written with
much higher levels of multiplexing; however, if you're having problems
streaming media due to density of filesystems, this can be a solution
if block level backups can't be used. For the customer where I did
this, block level backup in NetWorker was still very immature (e.g.,
it couldn't do a complete filesystem recovery across a tape boundary!
- that's since been fixed), and because these were dense and highly
active fileserver filesystems, the filesystems were also quite
fragmented. Doing file level recoveries from block level backups via
cache rebuilds and tape scans was prohibitively slow, and the customer
already used a series of array level replication options, so it was
decided to go with massive multiplexing to tape at higher streaming
speeds - i.e., the option above.
4. Does anyone have any other great ideas I should be thinking of?
One very, very important piece of advice.
DON'T USE SKIP for this style of backup - use the NULL asm instead.
The reason for this is very important. Skip will not only omit a
file/directory during the backup, it will also omit it from the
index. Null, on the other hand, does not.
What that means is that if you skip directories one night, you'll
need to change your browse time in order to find them for recovery.
If you use null to avoid backing up directories one night, you'll
still see, at all times, all the directories that have been backed
up. This makes recoveries a lot easier - i.e., no guessing about
which savesets were backed up on which days, and, more importantly,
being able to run one recovery rather than multiple recoveries.
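By way of illustration (directive syntax can vary a little between NetWorker versions, and dir1/dir2 are placeholders for whichever subdirectories your other client instances already cover), the difference in the directive file is just the ASM name. Instead of:

```
<< "F:\" >>
+skip: dir1 dir2
```

use:

```
<< "F:\" >>
+null: dir1 dir2
```

The + applies the ASM recursively; only the second form leaves the index entries intact for browsing.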
Good luck!
Cheers,
Preston.
--
Preston de Guise
"Enterprise Systems Backup and Recovery: A Corporate Insurance
Policy", due out September 17 2008:
http://www.crcpress.com/shopping_cart/products/product_detail.asp?sku=AU6396&isbn=9781420076394&parent_id=&pc=
http://www.enterprisesystemsbackup.com
To sign off this list, send email to listserv AT listserv.temple DOT edu and type
"signoff networker" in the body of the email. Please write to networker-request
AT listserv.temple DOT edu if you have any problems with this list. You can access the
archives at http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER