Networker

Re: [Networker] nsrclone stalls

2008-03-18 09:11:55
Subject: Re: [Networker] nsrclone stalls
From: MIchael Leone <Michael.Leone AT PHA.PHILA DOT GOV>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Tue, 18 Mar 2008 09:06:23 -0400
EMC NetWorker discussion <NETWORKER AT LISTSERV.TEMPLE DOT EDU> wrote on 
03/17/2008 01:59:22 PM:

> I resolved this. Seems there was an errant entry in /etc/hosts
> on our DNS server. Strikes me as very odd symptoms. Oh well...

I never use hosts files, for this reason. Much easier to clean up 1 DNS 
server, than change multiple hosts files. IMO.

> 
> -Jim
> 
> On Wed, 2008-03-12 at 14:47 -0400, Jim Thorsley wrote:
> > Hi.
> > 
> > I'm trying to clone some savesets and it fails. I've been doing these
> > clones every month to provide off-site backup and they have been 
working
> > fine. Last Friday the network manager changed the IP addresses on the
> > backup server and storage node. Now I can't clone. Regular incremental
> > backup seem to work but when I try to clone the monthly fulls it 
stalls.
> > 
> > The command I use is:
> > 
> > nsrclone -b "$ClonePool" -S `mminfo -t '4 weeks ago' -q
> > level=full,pool="$Pool" -r ssid`
> > 
> > which gets any full backups performed over the last 4 weeks and clones
> > them.
> > 
> > What happen is that when I go to clone some savesets the library loads
> > the tape to read and one to write then just sits there. Occassionally,
> > nsrclone will spit out "nsrclone command has retried 2 times.". When I
> > add a -v to the nsrclone command it would say:
> > 
> > nsrclone: NSR server giles.csd.uwo.ca: busy
> > nsrclone: waiting 30 seconds then retrying
> > 
> > and it would continue to do this forever. I've included a chunk of the
> > nsr daemon log below.
> > 
> > The server (giles.csd.uwo.ca) is a Sun Enterprise 450 running Solaris 
8.
> > The storage node (arion.csd.uwo.ca) is a SunFire V880 running Solaris
> > 10.
> > The version of Networker on both machines is Sun Enterprise Backup 
7.4.
> > The version of networker for the clients varies from 5.5 to 7.4 but
> > since this is just between the server and the storage node I can't see
> > that makes any difference.
> > 
> > Any ideas on how to fix this?
> > 
> > -Jim
> > 
> > 
> > -start-daemon.
> log-------------------------------------------------------------
> > 0 12/03/08 11:28:51 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd 
Operation
> > 50 started : Load volume `FIL341L1', volume id `4258369864'.. 
> > 42502 12/03/08 11:28:51 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd 
media
> > waiting event: Waiting for 1 writable volumes to backup pool 'Gaul
> > Clone' tape(s) on arion.csd.uwo.ca 
> > 0 12/03/08 11:28:51 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd 
Operation
> > 51 started : Load volume `FKA833L1', volume id `3878346835'.. 
> > 42502 12/03/08 11:28:51 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd 
media
> > waiting event: waiting for LTO Ultrium-2 tape FKA833L1 on
> > arion.csd.uwo.ca 
> > 38752 12/03/08 11:29:31 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > rd=arion.csd.uwo.ca:/dev/rmt/3cbn Verify label operation in progress 
> > 38752 12/03/08 11:29:43 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > rd=arion.csd.uwo.ca:/dev/rmt/3cbn Mount operation in progress 
> > 42504 12/03/08 11:29:44 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd 
media
> > event cleared: Waiting for 1 writable volumes to backup pool 'Gaul
> > Clone' tape(s) on arion.csd.uwo.ca 
> > 12361 12/03/08 11:29:47 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > [Jukebox `rd=arion.csd.uwo.ca:hellmouth', operation # 50]. Finished 
with
> > status: succeeded 
> > 38752 12/03/08 11:30:01 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > rd=arion.csd.uwo.ca:/dev/rmt/1cbn Verify label operation in progress 
> > 38752 12/03/08 11:30:11 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > rd=arion.csd.uwo.ca:/dev/rmt/1cbn Mount operation in progress 
> > 42504 12/03/08 11:30:15 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd 
media
> > event cleared: confirmed mount of FKA833L1 on
> > rd=arion.csd.uwo.ca:/dev/rmt/1cbn 
> > 11348 12/03/08 11:30:15 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > deactivating mmd #0 
> > 11349 12/03/08 11:30:15 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > Calling mm_deactivate for mmd 2 thats using device
> > rd=arion.csd.uwo.ca:/dev/rmt/3cbn with volume FIL341L1 on host
> > arion.csd.uwo.ca 
> > 42506 12/03/08 11:30:15 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd 
write
> > completion notice: Writing to volume FIL341L1 complete 
> > 11388 12/03/08 11:30:16 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > re-setting overall number of authorized save sessions, changing value
> > from (1) to (0) 
> > 12361 12/03/08 11:30:19 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > [Jukebox `rd=arion.csd.uwo.ca:hellmouth', operation # 51]. Finished 
with
> > status: succeeded 
> > 42502 12/03/08 11:30:45 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd 
media
> > waiting event: waiting for LTO Ultrium-2 tape FKA833L1 on
> > arion.csd.uwo.ca 
> > 11348 12/03/08 11:35:16 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > deactivating mmd #0 
> > 11349 12/03/08 11:35:16 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > Calling mm_deactivate for mmd 1 thats using device
> > rd=arion.csd.uwo.ca:/dev/rmt/1cbn with volume FKA833L1 on host
> > arion.csd.uwo.ca 
> > 42504 12/03/08 11:35:28 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd 
media
> > event cleared: confirmed mount of FKA833L1 on
> > rd=arion.csd.uwo.ca:/dev/rmt/1cbn 
> > 39078 12/03/08 11:35:28 AM  0 0 2 4 20635 0 giles.csd.uwo.ca nsrjobd
> > Jobs error: Unable to find record for job 59515 during an attempt to
> > send message to it 
> > 11348 12/03/08 11:35:28 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > deactivating mmd #0 
> > 11349 12/03/08 11:35:28 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > Calling mm_deactivate for mmd 3 thats using device
> > rd=arion.csd.uwo.ca:/dev/rmt/3cbn with volume FIL341L1 on host
> > arion.csd.uwo.ca 
> > 42506 12/03/08 11:35:28 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd 
write
> > completion notice: Writing to volume FIL341L1 complete 
> > 11388 12/03/08 11:35:36 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > re-setting overall number of authorized save sessions, changing value
> > from (1) to (0) 
> > 42502 12/03/08 11:35:58 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd 
media
> > waiting event: waiting for LTO Ultrium-2 tape FKA833L1 on
> > arion.csd.uwo.ca 
> > 11348 12/03/08 11:40:35 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > deactivating mmd #0 
> > 11349 12/03/08 11:40:35 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > Calling mm_deactivate for mmd 2 thats using device
> > rd=arion.csd.uwo.ca:/dev/rmt/1cbn with volume FKA833L1 on host
> > arion.csd.uwo.ca 
> > 42504 12/03/08 11:40:39 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd 
media
> > event cleared: confirmed mount of FKA833L1 on
> > rd=arion.csd.uwo.ca:/dev/rmt/1cbn 
> > 11348 12/03/08 11:40:39 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > deactivating mmd #0 
> > 11349 12/03/08 11:40:39 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > Calling mm_deactivate for mmd 3 thats using device
> > rd=arion.csd.uwo.ca:/dev/rmt/3cbn with volume FIL341L1 on host
> > arion.csd.uwo.ca 
> > 42506 12/03/08 11:40:39 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd 
write
> > completion notice: Writing to volume FIL341L1 complete 
> > 11388 12/03/08 11:40:45 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > re-setting overall number of authorized save sessions, changing value
> > from (1) to (0) 
> > 42502 12/03/08 11:41:09 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd 
media
> > waiting event: waiting for LTO Ultrium-2 tape FKA833L1 on
> > arion.csd.uwo.ca 
> > 39078 12/03/08 11:44:52 AM  0 0 2 4 20635 0 giles.csd.uwo.ca nsrjobd
> > Jobs error: Unable to find record for job 59515 during an attempt to
> > send message to it 
> > 39078 12/03/08 11:44:52 AM  0 0 2 4 20635 0 giles.csd.uwo.ca nsrjobd
> > Jobs error: Unable to find record for job 59515 during an attempt to
> > send message to it 
> > 11348 12/03/08 11:45:40 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > deactivating mmd #0 
> > 11349 12/03/08 11:45:40 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > Calling mm_deactivate for mmd 1 thats using device
> > rd=arion.csd.uwo.ca:/dev/rmt/1cbn with volume FKA833L1 on host
> > arion.csd.uwo.ca 
> > 42504 12/03/08 11:45:50 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd 
media
> > event cleared: confirmed mount of FKA833L1 on
> > rd=arion.csd.uwo.ca:/dev/rmt/1cbn 
> > 11348 12/03/08 11:45:50 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > deactivating mmd #0 
> > 11349 12/03/08 11:45:50 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > Calling mm_deactivate for mmd 3 thats using device
> > rd=arion.csd.uwo.ca:/dev/rmt/3cbn with volume FIL341L1 on host
> > arion.csd.uwo.ca 
> > 42506 12/03/08 11:45:50 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd 
write
> > completion notice: Writing to volume FIL341L1 complete 
> > 11388 12/03/08 11:45:55 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > re-setting overall number of authorized save sessions, changing value
> > from (1) to (0) 
> > 11348 12/03/08 11:50:52 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > deactivating mmd #0 
> > 11349 12/03/08 11:50:52 AM  2 0 0 1 20621 0 giles.csd.uwo.ca nsrd
> > Calling mm_deactivate for mmd 2 thats using device
> > rd=arion.csd.uwo.ca:/dev/rmt/1cbn with volume FKA833L1 on host
> > arion.csd.uwo.ca 
> > -
> 
end-------------------------------------------------------------------------
> > 
> > -- 
> > Jim Thorsley <thor AT csd.uwo DOT ca>
> > Dept. of Computer Science, U.W.O.
> > 
> > To sign off this list, send email to listserv AT listserv.temple DOT edu 
> and type "signoff networker" in the body of the email. Please write 
> to networker-request AT listserv.temple DOT edu if you have any problems 
> with this list. You can access the archives at http://listserv.
> temple.edu/archives/networker.html or
> > via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER
> -- 
> Jim Thorsley <thor AT csd.uwo DOT ca>
> Dept. of Computer Science, U.W.O.
> 
> To sign off this list, send email to listserv AT listserv.temple DOT edu 
> and type "signoff networker" in the body of the email. Please write 
> to networker-request AT listserv.temple DOT edu if you have any problems 
> with this list. You can access the archives at http://listserv.
> temple.edu/archives/networker.html or
> via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER

To sign off this list, send email to listserv AT listserv.temple DOT edu and 
type "signoff networker" in the body of the email. Please write to 
networker-request AT listserv.temple DOT edu if you have any problems with this 
list. You can access the archives at 
http://listserv.temple.edu/archives/networker.html or
via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER

<Prev in Thread] Current Thread [Next in Thread>