Re: [Networker] Advanced file type devices on storage nodes

2008-05-26 06:14:15
From: Davina Treiber <Davina.Treiber AT PEEVRO.CO DOT UK>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Mon, 26 May 2008 11:07:43 +0100

Teresa,

I know EXACTLY how you are thinking here, and I went through the same
process a couple of years ago. As you say - NetWorker knows internally
because of the staging, but it doesn't make the information available.
It is frustrating that the staging mechanism doesn't allow you to
substitute another action in place of staging.

This is what I wrote when I devised a scripted system to achieve what
you are talking about:


Rationale behind disk housekeeping scripts.
-------------------------------------------
When I implemented adv_file devices for a customer, I couldn't see how
NetWorker "out-of-the-box" could deliver exactly what I was looking for.
This implementation would provide a 1.5 TB disk device on EMC Clariion
disks for use with Oracle Archive logs. The hardware was performance
tested and gave astonishing performance, with calculated throughput on a
bigasm backup of 155MB/s. However, the 1.5 TB was not going to be
sufficient to store the data for its entire retention period, so staging
was going to be necessary.

I wanted to keep the most recent logs on disk for as long as possible,
in order to give the most benefit for restores. Thus I wished to keep
the disk device fairly full at all times. At the same time I needed to
have the data copied to tape as soon as possible for reasons of data
security and for off-siting.

Cloning the data was easy enough, but staging it to recover space was
not acceptable for two reasons:
(1) I would already have cloned the data to tape soon after it was first
backed up (twice in this case), so it is a little late in the day to be
copying it to tape, and
(2) Staging may take quite some time, since it needs to write to tape,
whereas if backups are waiting for space we require an almost instant
method of clearing space on the disk device.

My approach was not to use staging to clear space. Instead I would use
nsrmm to delete individual save set clones when they reach a specified
age. This would be safe because prior to deletion I would check that
the save set has other clones on tape. Following deletion the space can
be recovered using nsrstage -C.
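
A minimal sketch of that selection rule, in Python, is given here for
illustration only; it is not the original housekeepdiskdev script. The
CloneRecord layout, its field names and the select_deletable function
are assumptions made for the example. In practice the records would be
parsed from mminfo output, which is not shown.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class CloneRecord:
    """One clone instance of a save set (hypothetical layout)."""
    ssid: str
    cloneid: str
    volume: str
    on_disk: bool        # True if this clone lives on an adv_file volume
    savetime: datetime


def select_deletable(records, disk_volume, max_age, now):
    """Return clones on disk_volume older than max_age that also exist on tape."""
    # Save set ids that have at least one clone that is not on disk.
    ssids_on_tape = {r.ssid for r in records if not r.on_disk}
    cutoff = now - max_age
    return [
        r for r in records
        if r.on_disk
        and r.volume == disk_volume
        and r.savetime < cutoff
        and r.ssid in ssids_on_tape
    ]
```

Each selected record identifies one ssid/cloneid pair that could then be
handed to nsrmm for deletion (for example with nsrmm -d -S ssid/cloneid;
check the man page for your release), followed by nsrstage -C to recover
the space.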

Overview of scripts
-------------------
The housekeepdiskdev script does both the cloning and the deletion of
save sets when they reach a specified age. The cloning in this case is
done to two pools; the script would need some changes to write to just a
single pool. There is a mode (-r option) in which the script does no
cloning, just trimming of older save sets. This is provided for use at
times when tape drives are expected to be busy, such as during the
night. This script is designed to be scheduled from cron several times a
day.

I also thought it necessary to write a script that would run to clear
space when the disk device becomes full. This script is called
diskdevsafetyvalve. It is designed to be called from the "Filesystem
full - recover adv_file space" notification. It can be configured to
remove a fixed number of save sets, or to remove a percentage of save
sets based on size.
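
The two removal modes can be sketched as follows. This is a hedged
illustration in Python, not the diskdevsafetyvalve script itself: the
function name, the tuple layout and the choice to remove the oldest save
sets first are all assumptions. A real script would also still need to
confirm that each save set has a clone on tape before deleting it.

```python
def pick_for_removal(save_sets, count=None, fraction=None):
    """Choose save sets to remove, oldest first (assumed ordering).

    save_sets: list of (ssid, savetime, size_bytes) tuples.
    count:     select exactly this many of the oldest save sets, or
    fraction:  select oldest save sets until at least this fraction
               (0 < fraction <= 1) of the total size has been chosen.
    """
    if (count is None) == (fraction is None):
        raise ValueError("specify exactly one of count or fraction")
    oldest_first = sorted(save_sets, key=lambda s: s[1])
    if count is not None:
        return oldest_first[:count]
    target = fraction * sum(size for _, _, size in oldest_first)
    chosen, freed = [], 0
    for entry in oldest_first:
        if freed >= target:
            break
        chosen.append(entry)
        freed += entry[2]
    return chosen
```

Selecting by size rather than by count keeps the amount of space freed
predictable when save set sizes vary widely.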

Implementation notes
--------------------
NetWorker normally provides pretty good hooks for scripting, but in the
case of the adv_file devices it is lacking. To the best of my knowledge
it is impossible to find out from NetWorker how much space is used on an
adv_file device. mminfo yields no useful information here unless you add
up the space used for all the save sets on the volume. The only other
way is to run df or bdf to query the space used on the filesystem, and
this creates complications if the adv_file device is on a storage node
(as it was for my customer). I was left with the limitation of running
the safety valve only when the disk becomes full, whereas I would have
preferred to run it at a percentage (99%?) of capacity (the NSR stage
resource can do this but I don't know how it can be done from a command).
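
The "add up the save sets" workaround can be sketched like this, in
Python. It is an approximation under stated assumptions: the function
names are invented for the example, the sizes would have to come from an
mminfo report on the volume, and the capacity has to be supplied by hand.
The sum of save set sizes will not exactly match what df reports for the
filesystem.

```python
def volume_usage_percent(save_set_sizes, capacity_bytes):
    """Estimate how full a volume is from the sizes of its save sets."""
    if capacity_bytes <= 0:
        raise ValueError("capacity_bytes must be positive")
    return 100.0 * sum(save_set_sizes) / capacity_bytes


def over_threshold(save_set_sizes, capacity_bytes, threshold_percent=99.0):
    """True if the estimated usage has reached the threshold."""
    return volume_usage_percent(save_set_sizes, capacity_bytes) >= threshold_percent
```

A check of this kind could run from cron on the NetWorker server, which
would avoid having to run df or bdf on the storage node.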

The "Filesystem full - recover adv_file space" notification is also
lacking. The information it contains is ambiguous: it gives the name of
the filesystem that is full but does not say which storage node the
filesystem resides on. Thus my script has the limitation that adv_file
devices have to reside on uniquely named filesystems throughout the data
zone. What would be desirable is the name of the NetWorker volume
rather than the filesystem, and my script includes some contorted code
to do this conversion.
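
The uniqueness requirement can be illustrated with a small lookup in
Python. This is only a sketch of the idea, not the conversion code from
the original script: the function name and the (storage node, device
path, volume name) layout are assumptions, and so is the rule that a
device belongs to a filesystem when its path equals or lies under the
filesystem name.

```python
def volume_for_filesystem(devices, filesystem):
    """Map a filesystem name from the notification to a NetWorker volume.

    devices: iterable of (storage_node, device_path, volume_name) tuples.
    Raises LookupError if no device, or more than one device, matches,
    because the notification does not say which storage node is meant.
    """
    base = filesystem.rstrip("/")
    prefix = base + "/"
    matches = [
        (node, volume)
        for node, path, volume in devices
        if path == base or path.startswith(prefix)
    ]
    if not matches:
        raise LookupError("no adv_file device found on " + filesystem)
    if len(matches) > 1:
        raise LookupError("ambiguous filesystem name: " + filesystem)
    return matches[0][1]
```

If two storage nodes both have a device under the same filesystem name,
the lookup fails rather than guessing, which is why the names have to be
unique across the data zone.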

The NSR stage resource might have been useful, if only it had included
the facility to run a user script rather than just staging save sets.
