Hi,
I'm trying to confirm something here regarding nsrim, and I have a
bunch of questions (yeah, what's new):
The nsrim man page mentions:
nsrim uses policies to determine how to manage online entries. (See
nsr_policy(5), nsr_client(5), and the NetWorker Administrator's Guide
for an explanation of index policies). Entries that have been in an
online file index longer than the period specified by the respective
client's browse policy are removed. Save sets that have existed longer
than the period specified by a client's retention policy are marked as
recyclable in the media index.
Now, assuming that nobody is running 'nsrck' or 'nsrmm' then is the
'nsrim' command the only NetWorker process that's actually responsible
for removing entries from the CFI (i.e. entries that are beyond their
browse policy and have no dependencies)? Also, I assume it's likewise
the boss for changing save sets in the media database from browsable to
recoverable or recoverable to recyclable, right?
Assuming this is the case, I'm a little confused about what actually
happens. The man page for nsrim mentions that it gets run by savegrp
after savegrp completes, but takes no action unless /nsr/mm/nsrim.prv >=
23 hours old, so it should only get run once a day. When it runs, I see
something like this as far as the processes that appear related:
4170 1 /usr/sbin/nsrexecd
1009 4170 /usr/sbin/nsrim -MXq
1103 1009 /usr/sbin/nsrck -MXv -q client1 client2 ... blah-blah-blah
... client149 client150
Based on the PID and PPID values, it looks like nsrim launches the nsrck
command with the 'X' option, which the nsrck man page shows equivalent
to '-L 3':
Level 3 does a level 2 check and reconciles the online file index with
the online media index. Records that have no corresponding media save
sets are discarded. Also all empty subdirectories under db6 directory
are deleted.
Questions:
1. This 'nsrck -MXv' check (wherein 'X' = '-L 3') doesn't seem to have
anything to do with actually removing entries that are no longer
browsable. And nsrck certainly isn't going to do anything with the media
database, so is this nsrck check just a 'nice' extra that nsrim spawns
because it loves us and wants us to be happy?
2. I assume that if you run 'nsrim -X' manually that this would likewise
run that same nsrck check, too, or does this only happen when nsrim is
being run in master mode?
3. Why does the 'nsrck' check run on every darn client that's ever
existed? I see it run through old defunct clients that no longer exist;
there are no NSR client resources and/or directories under /nsr/index
for most of them. The whole nsrim/nsrck check only takes a few minutes to
complete, and never has any issues, but just curious. Maybe it looks in
the media database since those clients still have old entries?
4. Why does the man page mention that nsrim is not normally run
manually? But it then goes on to say:
If save sets need to be monitored for their browse and retention policy
more frequently (for example, if savegrp(8) is run more frequently than
every 23 hours), nsrim -X should be set up as a cron(1m) entry, or
should be run manually.
5. I don't need them to be monitored more than once a day, but we do
have a number of groups that run throughout the night, so savegrp is
certainly run more than once a day. Should I set up the 'nsrim -X'
command to run out of cron?
6. What happens if 'nsrim -X' is running when groups are backing up?
This seems hard to avoid if the nsrim.prv file reaches the 23 hour
mark, no groups run for several hours after that, and then multiple
groups start at the same time. Only one is going to invoke it, but the
others are still running or may overlap before the nsrim command completes.
7. The man page for nsrim mentions:
/nsr/tmp/.nsrim
nsrim locks this file to prevent more than one copy of itself from
thrashing the media database.
"Thrashing" is not *necessarily* "trashing", so what's the worst case here?
8. If I'm going to run 'nsrim -X' out of cron, or via some cron script,
then is there any need to worry about locking the /nsr/tmp/.nsrim file,
just in case, and then remove the lock once done? Will flock work okay?
I don't know how nsrim is locking this file, but its mtime and ctime
never change (it has an old timestamp) like they would with flock, so I'm
unclear how or if nsrim would even know that I had a lock on it.
Alternatively, suppose I first manually set the time on the nsrim.prv
file to be sometime the next day (touch -d 'YYYY-MM-DD HH:MM:SS'
nsrim.prv) at a quiet time when no backups are usually scheduled and
then set the cron job to start maybe two hours before that time. I could
first check the timestamp on this file and only run 'nsrim -X' if it's,
say, >= 22 hours but < 22.5 to be safe, so as to avoid any possible
collision with NetWorker and/or some unexpected group that might launch
nsrim.
Then again, maybe it's completely moot if nsrim really does lock itself.
In that case, would I be fine to run it any time, without worrying that
the first group that runs (after nsrim.prv is >= 23 hours old) is somehow
going to launch nsrim in parallel with my job or cause some kind of race
condition?
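
For what it's worth, the timestamp check I have in mind for the cron job
would look roughly like the sketch below. It is untested against a real
server and assumes GNU stat and date. The 22 to 22.5 hour window and the
helper names (in_window, file_age) are just my own placeholders; only the
/nsr/mm/nsrim.prv path, the /usr/sbin/nsrim location and 'nsrim -X' come
from the man page and the process listing above.

```shell
#!/bin/sh
# Hypothetical cron wrapper: run 'nsrim -X' only while nsrim.prv falls
# inside a chosen age window. Window values are assumptions, not defaults.
PRV=${PRV:-/nsr/mm/nsrim.prv}
MIN_AGE=${MIN_AGE:-79200}   # 22 hours, in seconds
MAX_AGE=${MAX_AGE:-81000}   # 22.5 hours, in seconds

# Succeeds when age $1 (seconds) satisfies $2 <= age < $3.
in_window() {
    [ "$1" -ge "$2" ] && [ "$1" -lt "$3" ]
}

# Prints the age of file $1 in seconds, based on its mtime.
# Assumes GNU stat ('-c %Y' prints mtime as seconds since the epoch).
file_age() {
    now=$(date +%s)
    mtime=$(stat -c %Y "$1") || return 1
    echo $((now - mtime))
}

# Do nothing if the file is missing or its age is outside the window.
if [ -f "$PRV" ]; then
    if age=$(file_age "$PRV") && in_window "$age" "$MIN_AGE" "$MAX_AGE"; then
        /usr/sbin/nsrim -X
    fi
fi
```

This only narrows the window in which my job would start nsrim; it does
not by itself prevent a savegrp from launching nsrim at the same moment,
which is why I'm asking about the lock.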
Thanks.
George
--
George Sinclair
Voice: (301) 713-3284 x210
- The preceding message is personal and does not reflect any official or
unofficial position of the United States Department of Commerce -
- Any opinions expressed in this message are NOT those of the US Govt. -