Re: [Networker] savepnpc weirdness (7.5.1) - multiple runs per backup

2009-11-10 11:00:58
Subject: Re: [Networker] savepnpc weirdness (7.5.1) - multiple runs per backup
From: "Nelson, Allan" <an AT CEH.AC DOT UK>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Tue, 10 Nov 2009 15:58:27 +0000
It's a long time since I've used savepnpc, but I remember trying it several 
years ago and hitting a snag: if you explicitly named the partitions to back 
up, it ran once for each of them, whereas if you simply said 'All' it worked 
OK.
Sorry, my memory of this is hazy, but does that tie in with what you're 
seeing?


Allan Nelson
CCS Lancaster


-----Original Message-----
From: EMC NetWorker discussion [mailto:NETWORKER AT LISTSERV.TEMPLE DOT EDU] On 
Behalf Of Len Philpot
Sent: 09 November 2009 21:28
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Subject: [Networker] savepnpc weirdness (7.5.1) - multiple runs per backup

Anyone seen a behavior from savepnpc where it runs multiple times for a 
given backup? Note I'm talking about savepnpc, not savepc, which is 
designed (AFAIK) to run once per saveset.

We have a couple of (Solaris) database clients that use savepnpc scripts 
(which call other scripts) to shut down and restart their DBs before and 
after backups, on both 7.2 and 7.5.1. This works great on 7.2, where we've 
been doing it for several years. On 7.5.1, Networker apparently runs the 
shutdown and restart scripts numerous times in quick succession before 
finally seeing that something is backing up and waiting to run the final 
restart until the backup completes. Of course, these backups are no good, 
since the DBs were bouncing like a ball during them.

Here are a few key events, for example, from a group that ran at 2 am...

From the /nsr/logs/savepnpc.log on the client:

11/09/09 02:00:00 preclntsave: Starting up the precmds.
11/09/09 02:00:43 preclntsave: All command(s) ran successfully.
11/09/09 02:01:44 pstclntsave: Client is not active in the worklist.
11/09/09 02:01:44 pstclntsave: All savesets on the worklist are done.
11/09/09 02:01:44 pstclntsave: Starting up the pstcmds.
11/09/09 02:02:28 pstclntsave: All command(s) ran successfully.
11/09/09 02:02:28 pstclntsave: Exited.

The above repeats six more times, bouncing the DB each time, until 
finally...

11/09/09 02:39:37 preclntsave: Starting up the precmds.
11/09/09 02:40:42 preclntsave: All command(s) ran successfully.
11/09/09 02:41:42 pstclntsave: Client is still active in the worklist.
11/09/09 02:41:42 pstclntsave: Worklist not complete. Some saves are still running.
11/09/09 02:42:42 pstclntsave: Client is still active in the worklist.

The above repeats seven more times during the backup, presumably a normal 
polling process from Networker, until...

11/09/09 02:49:42 pstclntsave: Savegroup is not running.
11/09/09 02:49:42 pstclntsave: All savesets on the worklist are done.
11/09/09 02:49:42 pstclntsave: Starting up the pstcmds.
11/09/09 02:50:26 pstclntsave: All command(s) ran successfully.
11/09/09 02:50:26 pstclntsave: Exited.
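
One rough way to confirm how often the pre- and post-commands actually fired is to count the "Starting up" lines in savepnpc.log. The following is only a minimal Python sketch: the function name count_runs, the regular expression, and the embedded sample text are illustrative assumptions, while the message strings are taken from the excerpts above. On a real client the text would be read from /nsr/logs/savepnpc.log instead.

```python
import re
from collections import Counter

# Sample lines copied from the savepnpc.log excerpts above (first and last
# cycle only; the six repeats in between are not reproduced here).
SAMPLE_LOG = """\
11/09/09 02:00:00 preclntsave: Starting up the precmds.
11/09/09 02:00:43 preclntsave: All command(s) ran successfully.
11/09/09 02:01:44 pstclntsave: Client is not active in the worklist.
11/09/09 02:01:44 pstclntsave: All savesets on the worklist are done.
11/09/09 02:01:44 pstclntsave: Starting up the pstcmds.
11/09/09 02:02:28 pstclntsave: All command(s) ran successfully.
11/09/09 02:02:28 pstclntsave: Exited.
11/09/09 02:39:37 preclntsave: Starting up the precmds.
11/09/09 02:40:42 preclntsave: All command(s) ran successfully.
11/09/09 02:41:42 pstclntsave: Client is still active in the worklist.
11/09/09 02:49:42 pstclntsave: Savegroup is not running.
11/09/09 02:49:42 pstclntsave: All savesets on the worklist are done.
11/09/09 02:49:42 pstclntsave: Starting up the pstcmds.
11/09/09 02:50:26 pstclntsave: All command(s) ran successfully.
11/09/09 02:50:26 pstclntsave: Exited.
"""

# Hypothetical pattern for the "date time program: message" layout seen above.
LINE_RE = re.compile(
    r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (preclntsave|pstclntsave): (.*)$"
)


def count_runs(log_text):
    """Count how many times the precmds and pstcmds were started."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip wrapped continuation lines and blank lines
        _timestamp, _program, message = match.groups()
        if message.startswith("Starting up the precmds"):
            counts["precmd"] += 1
        elif message.startswith("Starting up the pstcmds"):
            counts["pstcmd"] += 1
    return counts


if __name__ == "__main__":
    runs = count_runs(SAMPLE_LOG)
    print("precmd runs:", runs["precmd"])
    print("pstcmd runs:", runs["pstcmd"])
```

For a single group run one would expect one precmd and one pstcmd start; a higher count for the same run is the symptom described here.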


And, from /nsr/logs/daemon.log on the server:

11/09/09 02:00:50  nsrd client1:/dir1 saving to pool 'diskpool' (diskvol.001)
11/09/09 02:04:49  nsrd 913 MB are saved to pool 'diskpool' (diskvol.001) of client1:/dir1
11/09/09 02:04:49  savegrp job (1792176) host: client1 savepoint: /dir1 had WARNING indication(s) at completion.
11/09/09 02:05:37  nsrd client1:/dir2 saving to pool 'diskpool' (diskvol.001)
11/09/09 02:11:45  nsrd 1670 MB are saved to pool 'diskpool' (diskvol.001) of client1:/dir2

...etc. until the group completes.

We originally wondered whether savepnpc was running the pstcmd too quickly, 
before Networker had built the worklist, and so concluding the work was 
already done. But the daemon.log shows that the client was actually backing 
up within a minute of the group starting, so... Maybe the worklist is built 
concurrently with whatever savepnpc is triggering, and Networker just 
(hopefully!) waits until the precmd is done before actually starting the 
backup. In the case above, backups started 7 seconds after the first precmd 
completed, with the first pstcmd running about a minute later.
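
The intervals quoted above can be checked directly from the timestamps in the two logs. This is just a small Python sketch of that arithmetic; the helper name gap_seconds and the variable names are made up for illustration, and the three timestamps are the ones shown in savepnpc.log and daemon.log.

```python
from datetime import datetime

# Timestamp layout used by both logs above: MM/DD/YY HH:MM:SS
FMT = "%m/%d/%y %H:%M:%S"


def gap_seconds(earlier, later):
    """Return the number of whole seconds between two log timestamps."""
    delta = datetime.strptime(later, FMT) - datetime.strptime(earlier, FMT)
    return int(delta.total_seconds())


# Values taken from the log excerpts above.
precmd_done = "11/09/09 02:00:43"   # preclntsave: All command(s) ran successfully.
save_started = "11/09/09 02:00:50"  # nsrd client1:/dir1 saving to pool 'diskpool'
first_pstcmd = "11/09/09 02:01:44"  # pstclntsave: Starting up the pstcmds.

if __name__ == "__main__":
    print("precmd done -> save started:", gap_seconds(precmd_done, save_started), "s")
    print("save started -> first pstcmd:", gap_seconds(save_started, first_pstcmd), "s")
```

So the first pstcmd fired while the /dir1 save, which did not finish until 02:04:49 according to daemon.log, was still in progress.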

Any ideas??

Thanks.

To sign off this list, send email to listserv AT listserv.temple DOT edu and 
type "signoff networker" in the body of the email. Please write to 
networker-request AT listserv.temple DOT edu if you have any problems with this 
list. You can access the archives at 
http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER

