Networker

Re: [Networker] How best to determine time value for this scenario?

2009-03-17 04:53:53
Subject: Re: [Networker] How best to determine time value for this scenario?
From: Allan Nelson <an AT CEH.AC DOT UK>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Tue, 17 Mar 2009 08:39:40 +0000
Hi George
I may be totally missing the point here, but wouldn't it be much simpler to add 
a field to your non-Legato database that indicates whether each path has been 
saved or not?
I.e., you're already extracting a list of 'stuff to do' from the database and 
then presumably issuing a save command?
After the save command completes, can you not then update the database flag to 
say 'done'?
I'd certainly try that sort of tack, and then you've no problems with 
dates/times, whether it was successful, etc.
Of course if you haven't got write access to that local database, then forget 
what I just said ;-)
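
A minimal sketch of that flag idea in Python, not a definitive implementation. 
Everything named here is an assumption for illustration: the table "archived", 
its "path" and "saved" columns, and the run_save callable that stands in for 
whatever actually issues the save command. An in-memory SQLite database stands 
in for the non-Legato database.

```python
import sqlite3


def back_up_pending(conn, run_save):
    """Call run_save(path) for every unsaved path and flag the ones that succeed.

    run_save is a hypothetical callable that returns True when the save worked.
    """
    rows = conn.execute(
        "SELECT path FROM archived WHERE saved = 0 ORDER BY path"
    ).fetchall()
    done = []
    for (path,) in rows:
        if run_save(path):
            # Flag the row only after the save reports success, so a failed
            # path simply stays pending and is picked up by the next run.
            conn.execute("UPDATE archived SET saved = 1 WHERE path = ?", (path,))
            conn.commit()
            done.append(path)
    return done


def demo():
    """Exercise the loop against a throwaway in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE archived ("
        "path TEXT PRIMARY KEY, saved INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO archived (path) VALUES (?)",
        [("/0/data/a",), ("/1/data/b",)],
    )
    # Stand-in for the save command: pretend the save of /1/data/b fails.
    done = back_up_pending(conn, lambda path: path != "/1/data/b")
    pending = [row[0] for row in conn.execute(
        "SELECT path FROM archived WHERE saved = 0"
    )]
    return done, pending
```

With this layout no time value is needed at all. The one extra requirement is 
that whatever re-archives a directory would also have to reset its flag to 0, 
so that the path gets picked up again.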
 
Hope this helps... Allan.

>>> George.Sinclair AT NOAA DOT GOV 16/03/09 22:50 >>>

This may be outside the purview of this news listing, but I thought I'd 
run this past the gang just to get some advice. Sorry to make this so 
long, but I thought it necessary to provide the details.

We have a special archive data set wherein we want to use NW to perform 
the backups on certain directories but not to determine what needs to be 
backed up among that data. We want each directory to be its own save set 
instead of having it all get backed up under the parent device name, 
e.g. /0/data, /1/data, etc. There's a non-Legato database that I can 
query that will report the directories/pathnames that have been archived 
since a specific date/time, but I need to pass it a date/time value. In 
some cases, the same directory might be re-archived later, in which 
case its date will be updated in the database.

We want to use NetWorker to perform the backups on these paths at level 
full with no indexing turned on for the pool. This is because: 1. we do 
not want to use the file system to determine what needs to be backed up; 
2. we sometimes move this data around between systems and don't want to 
have to rerun level fulls like we would if we were performing indexed 
backups by device; 3. this is much faster, as it doesn't have to search 
through the file system to determine what has changed; and 4. we only 
want to back up a directory if it has been added to that database, not 
just because it's there on disk. Also, this makes it clearer what 
exactly got backed up, since each directory will be its own save set. We 
can't hard code the save set list, though, because it changes every day. 
We do not mind using saveset recover to recover these.

The plan is to specify a script for the Backup command field in the 
client resource that will determine the appropriate time value, obtain 
the necessary path names from the non-Legato database and then run a 
save command on each at a level full with the -N option for the symbolic 
save set name. Cloning would be enabled for the group.

But how to determine the time value?

I was considering the following options:

1. Run an mminfo query on the pool, sort by time, and pass the last time 
value to the non-Legato database (maybe subtracting 5 minutes just to be 
safe, as I don't mind an occasional overlap). A major problem with that 
is that one or more paths may have been added to that database after the 
group *last* started but before it completed. I would miss those and end 
up grabbing only the ones that were updated after the last save set from 
the previous backup completed.

2. While the start time for the backups probably will not change, it 
could, and it could always be the case that the backups didn't actually 
run on that date/time due to skip, other problems, etc. so using the 
last group start time value might be too recent.

3. The backup script could first touch a file, and each subsequent run 
uses the time value of that file before touching it again. But what if 
the script runs and fails several days in a row and then succeeds the 
next day? It would then be specifying the time of its last (failed) run, 
and any path archived before that time would be missed.
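
One way around the failure case in option 3, sketched in Python under stated 
assumptions rather than as a definitive implementation: record the time the 
run *started*, but write it to the state file only when every save succeeded. 
A failed run then leaves the old value in place, and the next run queries from 
the last successful start again. The names state_file, query_paths, back_up 
and MARGIN are hypothetical; query_paths stands in for the non-Legato database 
query and back_up for whatever issues the save command.

```python
import json
import os
import tempfile
from datetime import datetime, timedelta, timezone

MARGIN = timedelta(minutes=5)  # deliberate overlap, as suggested in option 1


def load_cutoff(state_file):
    """Return the last successful run's start time minus MARGIN, or None."""
    try:
        with open(state_file) as fh:
            stamp = json.load(fh)["last_success_start"]
    except FileNotFoundError:
        return None  # first run: no cutoff recorded yet
    return datetime.fromisoformat(stamp) - MARGIN


def run_once(state_file, query_paths, back_up, now=None):
    """Back up everything archived since the last successful run started."""
    started = datetime.now(timezone.utc) if now is None else now
    paths = query_paths(load_cutoff(state_file))
    # Attempt every path, then check whether any of them failed.
    results = [back_up(path) for path in paths]
    if not all(results):
        return False  # state file untouched, so the next run retries
    # Write to a temporary file and rename it, so a crash cannot leave a
    # half-written state file behind.
    tmp = state_file + ".tmp"
    with open(tmp, "w") as fh:
        json.dump({"last_success_start": started.isoformat()}, fh)
    os.replace(tmp, state_file)
    return True
```

Because the stored value is the start time, a path archived while a run is in 
progress falls after the stored cutoff and is picked up by the next run. A run 
that finds nothing to back up still advances the value, which is safe because 
it did cover everything up to its own start. A skipped run never executes the 
script, so the value stays where it was.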

So I guess I need to be able to pass it the time that the group actually 
last started and really did something. But let's say the last time the 
group ran, there wasn't anything to back up, or maybe it skipped. Maybe 
the backups were shut off for a couple of days, too. It will still 
report a last start time value, but that might be too recent. I'm 
thinking that I need to determine the date and time that the backups 
last ran and actually did back up something, and then use that time instead.

If I walk back through the savetimes in reverse, looking at all of them 
that are from the same day, I could pick the first such one, maybe 
subtract 5 minutes to be safe, and make that my time value, but how do I 
know that the group didn't actually start the night before and continue 
over to that day? If it did, and something was archived just after it 
started, I would miss that one, too.

Does anyone have any ideas about how best to determine what time value to 
use in this case?

George
-- 
George Sinclair
Voice: (301) 713-3284 x210
- The preceding message is personal and does not reflect any official or 
unofficial position of the United States Department of Commerce -
- Any opinions expressed in this message are NOT those of the US Govt. -

To sign off this list, send email to listserv AT listserv.temple DOT edu and 
type "signoff networker" in the body of the email. Please write to 
networker-request AT listserv.temple DOT edu if you have any problems with this 
list. You can access the archives at 
http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER 


-- 
This message (and any attachments) is for the recipient only. NERC
is subject to the Freedom of Information Act 2000 and the contents
of this email and any reply you make may be disclosed by NERC unless
it is exempt from release under the Act. Any material supplied to
NERC may be stored in an electronic records management system.

