For many years, I've been doing disk-to-disk-to-tape across data centers like
this:
Primary Data Center:
- Most backup clients
- NetWorker server
- LTO-4 Tape library
Secondary Data Center:
- Cheap Fibre-to-SATA SAN, holding adv_file devices
- Cold standby NetWorker VM
- Expensive SAN with partial replica of primary SAN (certain critical data can
be restored without NetWorker)
The NetWorker server has fibre channel connectivity to the disks in the
secondary data center. All backups fly across FC to the secondary data center,
and are thus immediately offsite. Scripts clone the savesets to the tape
library in the primary data center. (It's a lot easier to move photons than
tape cartridges.)
So the adv_file devices serve several purposes:
1) Staging, ensuring that we have the bandwidth to feed LTO-4 tape drives
2) Quick recovery of recent backups (no tape load/position delay)
3) Disaster recovery
I do not need or want the adv_file devices to fill up. I don't think I want
NetWorker to manage the disk usage automatically, because different clients
have different schedules, and one saveset might need to stay for 14 days but
another 7 days. My script for pruning savesets from the DR disks has been
pretty simple-minded, essentially:
for i in `/usr/sbin/mminfo -r ssid,cloneid -q 'copies>2,family=disk,group=ThisGroup,sscreate<16 day ago' |
         grep -v 'clone id' | perl -pe 's/ +/\//'`; do
    /usr/sbin/nsrmm -y -d -S $i 2>&1
done | egrep -v 'save set .* does not exist'
... and again for ThatGroup,sscreate<9 day ago, etc.
But this doesn't clean up incrementals older than the most recent full; nor
does it handle clone or backup errors. And now I'm at the point of wanting to
save as much adv_file space as possible to avoid having to pay for another disk
backup tier.
At first blush, sorting `mminfo -v -q 'family=disk,copies>2,!incomplete'` by
client/name/level, noting the most recent successful full, and deleting
anything older, should do. But this doesn't actually verify that the clone to
tape succeeded (or does it?).
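A minimal sketch of that selection step, for illustration only. It assumes the mminfo report has already been massaged into semicolon-separated lines of the form client;name;level;nsavetime;ssid;cloneid (that input format and the function name older_than_newest_full are hypothetical, not anything NetWorker provides). It prints ssid/cloneid for every record older than the newest full of the same client and save set name, and prints nothing for a save set that has no full at all. It does not check the tape clone.

```shell
# Hypothetical helper: reads "client;name;level;nsavetime;ssid;cloneid" lines
# on stdin and prints "ssid/cloneid" for each record that is older than the
# newest full of the same client and save set name.
older_than_newest_full() {
    awk -F';' '
        {
            key = $1 SUBSEP $2
            n++
            rkey[n]  = key
            rtime[n] = $4 + 0
            rid[n]   = $5 "/" $6
            # Remember the newest full seen for this client/name pair.
            if ($3 == "full" && $4 + 0 > newest[key])
                newest[key] = $4 + 0
        }
        END {
            # Records with no full on file are never selected.
            for (i = 1; i <= n; i++)
                if ((rkey[i] in newest) && rtime[i] < newest[rkey[i]])
                    print rid[i]
        }
    '
}
```

The output has the same ssid/cloneid shape that the pruning loop above hands to nsrmm, so it could be reviewed by eye before anything is deleted.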
Does anyone have such a script already debugged? Or alternative advice? I kinda
want the opposite of nsrim -l, which expires the oldest full saveset and its
children. I want to purge everything except the newest full, and its children.
Now that NetWorker allows primary and clone copies to have different retention
policies, can I just set my primary copy retention to (time between fulls), and
extend the copies with nsrclone -wy? This makes me nervous, because if my
clones to tape are ever broken for more than 8 days, I start losing data.
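One way to guard against that failure mode, sketched under assumptions: before pruning, keep only those disk copies whose save set also has a copy on tape. The input format (semicolon-separated ssid;cloneid;family lines covering all copies) and the function name disk_copies_with_tape_clone are hypothetical. The check only confirms that a tape copy is recorded; it says nothing about whether that copy is complete or readable.

```shell
# Hypothetical helper: reads "ssid;cloneid;family" lines on stdin (one line
# per copy) and prints "ssid/cloneid" for each disk copy whose save set also
# has at least one copy recorded with family "tape".
disk_copies_with_tape_clone() {
    awk -F';' '
        $3 == "tape" { on_tape[$1] = 1 }
        $3 == "disk" { n++; ssid[n] = $1; id[n] = $1 "/" $2 }
        END {
            for (i = 1; i <= n; i++)
                if (ssid[i] in on_tape)
                    print id[i]
        }
    '
}
```

With a filter like this in front of the delete step, a broken clone job would leave the disk copies in place instead of silently expiring the only copy.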
--
Rich Graves http://claimid.com/rcgraves