Len Philpot wrote:
We've just implemented 7.5.1 in our first production environment, which is
orders of magnitude larger than the test environment before it. We have 11
roughly 500 GB disk volumes we're backing up to, knowing that at some point
we will probably have to stage to tape (and the pools, tapes, policies, etc.
are all in place for that). Right now, three of those volumes are about
20-25% used and that's it.
However, the *default* Max Storage Period on our staging policy is 7 days,
which is waaaayyy too soon, causing our first test backups to already
stage to tape. I know what the values all mean, but I haven't yet spent
much time thinking about the best combination of max storage period,
recover space interval, and file system check interval. But blindly
staging after only 7 days is clearly much too soon. We'd rather have a
reasonable high water mark trigger staging than time.
Question: Given that we currently have the adv_file space available and
don't yet need to stage, are there any hidden gotchas we'll run into if we
push the max storage period to, say, a month, and just keep an eye on disk
volume usage?
Any advice / tips / traps?
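(For what it's worth, the max storage period can be raised with an nsradmin
input file along the same lines as the watermark updates further down. This
is only a sketch: the resource name below is a placeholder for your actual
stage resource, and some releases express the period via a separate "max
storage period unit" attribute, so check your version's resource attributes
first.)

```
. type: NSR stage; name: Your-Staging-Policy
update max storage period: 31
quit
```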
Our environment has 6 TB of AFTD (on four devices). The problems I am
aware of are:
* Setting static high and low watermarks is problematic. It means that
you will start staging during backups (because going above the high water
mark is the only thing that triggers staging). Your disks will then be
doing reads and writes at the same time, which slows everything down.
When I first started using AFTD I had a script, driven by cron, that
changed the watermarks so as to avoid this problem:
30 15 * * * /usr/local/TAUSRC/Local/ToolBox/fixstage.pl 'TAUDefault stage' 95 96
0 5 * * * /usr/local/TAUSRC/Local/ToolBox/fixstage.pl 'TAUDefault stage' 90 91
1 6 * * 3 /usr/local/TAUSRC/Local/ToolBox/fixstage.pl 'TAUDefault stage' 75 77
1 6 * * 4 /usr/local/TAUSRC/Local/ToolBox/fixstage.pl 'TAUDefault stage' 60 62
1 6 * * 5 /usr/local/TAUSRC/Local/ToolBox/fixstage.pl 'TAUDefault stage' 45 47
30 9 * * 6 /usr/local/TAUSRC/Local/ToolBox/fixstage.pl 'TAUDefault stage' 80 82
# cat /usr/local/TAUSRC/Local/ToolBox/fixstage.pl
#!/bin/perl
# Usage: fixstage.pl <stage resource name> <low water mark %> <high water mark %>
use strict;
use warnings;
my ($resource, $low, $high) = @ARGV;
# Build an nsradmin input file that selects the stage resource
# and updates its watermarks.
open(OUT, '>', '/tmp/nsradmin.in.fixstage')
    or die "cannot write nsradmin input file: $!\n";
print OUT ". name: $resource ; \n\n";
print OUT "\nupdate high water mark (%): $high ; \n";
print OUT "low water mark (%): $low ; \n";
print OUT "\nquit\n";
close(OUT);
# Feed the commands to nsradmin.
system("nsradmin -i /tmp/nsradmin.in.fixstage");
sleep 5;
* While the above can work for some systems, it is not optimal, because
staging calculates what it needs to do in a single run and you end up
with one huge, long staging process (I've seen a 1 TB staging process at
times). This means the system might be staging for many hours, which
causes two problems: you cannot recover while staging, and you don't
reclaim the staged space until the entire job is done (and then there is
a hard-coded sleep of about two seconds for each staged save set). To
solve this I now have a script that stages only ~15 GB and then sleeps
for a minute. This frees space earlier and also allows recoveries to run
within a reasonable time frame. The script also checks when it runs and
uses different target watermarks for the time of day and day of week.
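In outline, the chunking looks something like this (a sketch only; the
15 GB figure and the mminfo/nsrstage plumbing in the comments are
illustrative, not the actual production script):

```shell
#!/bin/sh
# pick_batch reads "ssid size-in-bytes" lines on stdin and prints the
# ssids of the first batch totaling at most $1 bytes. An oversized
# single save set is still emitted on its own, so staging never stalls.
pick_batch() {
    limit=$1
    awk -v limit="$limit" '
        { if (total + $2 > limit && total > 0) exit
          total += $2; print $1 }
    '
}

# In production this would be driven by something along the lines of:
#   mminfo -q "volume=..." -r "ssid,totalsize" | pick_batch 16106127360 \
#       | xargs nsrstage -m -S
#   sleep 60   # free the device for recoveries, then loop
```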
* Besides the above, staging runs to a single drive (because an AFTD
device can only be read by one staging operation at a time), so if you
need to stage 1 TB, you are limited to the bandwidth of one tape drive.
While this might be OK for some systems, it might be desirable to run
concurrent stage operations.
* Another feature which might be useful is delayed staging, which lets
you copy save sets to tape now, keep them on disk, and delete them later
when the disk fills up. Although this can be scripted with nsrclone, it
is a bit more complicated. Also, nsrclone requires a clone pool (and not
a regular pool).
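Roughly, the nsrclone-based delayed staging amounts to the two steps
below (the pool name and ssids are placeholders, and the actual logic
for deciding when the disk is "full enough" is up to your script):

```
# Now: copy save sets to a clone pool while leaving them on disk.
nsrclone -b "Your Clone Pool" -S <ssid> ...

# Later, when the disk fills: delete only the disk instance of a
# save set; the tape clone remains recoverable.
nsrmm -d -S <ssid>/<cloneid>
```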
Thanks!
--
Len Philpot
Cleco IT Network Services, PGO3 - ext 7167
(318) 484-7167
To sign off this list, send email to listserv AT listserv.temple DOT edu and type
"signoff networker" in the body of the email. Please write to networker-request
AT listserv.temple DOT edu if you have any problems with this list. You can access the
archives at http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER
--
-- Yaron.