Re: [Networker] corrupt media databases

2002-10-07 15:30:33
Subject: Re: [Networker] corrupt media databases
From: Tim Mooney <mooney AT DOGBERT.CC.NDSU.NODAK DOT EDU>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Mon, 7 Oct 2002 14:30:30 -0500
In regard to: [Networker] corrupt media databases, Kevin Maguire said (at...:

> [...] but anyway on reboot I could not
>start networker, message was:
>
>10/05/02 16:53:57 nsrd: server notice: started
>10/05/02 16:54:03 nsrmmdbd: WISS error: Unable to mount /nsr/mm/mmvolume6: bad database header
>10/05/02 16:54:03 nsrmmdbd: media db must be scavenged
>10/05/02 16:54:12 nsrmmdbd: media db scavenge successful
>10/05/02 16:54:12 nsrmmdbd: WARNING: clients file missing from /nsr/mm/mmvolume6
>10/05/02 16:54:13 nsrmmdbd: error adding btrees to ss (an invalid slot number)
>10/05/02 16:54:13 nsrmmdbd: WISS error: an invalid slot number
>10/05/02 16:54:28 nsrd: nsrmmdbd has exited with status 1
>10/05/02 16:54:28 nsrd: shutting down
>
>...
>
>mminfo -m showed I had no media!  I stopped networker, removed
>/nsr/tmp and re-created it, and tried again.  Did not help.
>
>Trawling through this list's archives I came to the conclusion I needed
>to recover with mmrecov.  I'm not sure it was the right conclusion, but
>it was what I did.

Pretty much, yes, though to be safe you may want to rename the old media
database directory first.  To be really safe, you would remove NetWorker
(telling it not to delete the client indexes), remove all the directories
under your /nsr directory *except* the client index directory, reinstall
NetWorker, and then run `mmrecov'.   In practice this isn't usually
necessary (in my experience), but it does guarantee that you're starting
out clean, with no junk database files hanging around to cause problems
again down the road.

>This went OK in some sense, but my last bootstrap was from 48 hours
>ago.  I *know* that a lot of successful save sets were written after
>that but before the system crash, I got the savegrp completion
>e-mails.  However now I can't see them, as my media database is
>restored to what it was 48 hours ago!

Exactly.

>Is this the best I can expect?

No.  Assuming you don't have to rebuild any client indexes, and you want
to clue NetWorker in about all the savesets that happened post-bootstrap,
you need to figure out which tapes have been written, and use scanner to
sync up NetWorker's idea of what savesets exist with the reality that
exists on your tapes.

>I saw from my logs that about 10 volumes were relabelled in the
>time between bootstrap and crash, messages like:
>
>10/04/02 18:24:58 nsrd: deleted media notice: Deleted volume: volid=712010241, volname=000079, location=STK9710
>
>These were indeed recyclable volumes, if I do
>
>mminfo -q volid=712010241
> or
>mminfo -q volname=000079
>
>I see the savesets, all marked as recyclable, from months ago.
>However I know that volume now contains new savesets, but I don't know
>how to tell Legato to read it back in.

It might be tricky.  I would first try using

        scanner -i /dev/rmt/whatever

to read a tape or two of the ones that have been recycled and have new
data that NetWorker doesn't know about.

If that doesn't work, you might try actually removing the tape from
NetWorker's media index (so that NetWorker isn't stuck thinking the tape
is recyclable) and then running `scanner' on it.  This shouldn't be
necessary -- just using scanner should do it.
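
One way to work out which tapes need scanning is to pull the "Deleted
volume" notices (the format quoted above) out of the daemon log and keep
only those logged after the last bootstrap.  The following is a minimal
Python sketch, not anything that ships with NetWorker: the function name
`relabelled_volumes' and the regular expression are my own assumptions,
built from the single log line shown in this thread, so check them
against your own log before trusting the result.

```python
import re
from datetime import datetime

# Assumed notice format, taken from the one log line quoted in this thread:
#   10/04/02 18:24:58 nsrd: deleted media notice: Deleted volume:
#   volid=712010241, volname=000079, location=STK9710
NOTICE = re.compile(
    r"(\d\d/\d\d/\d\d \d\d:\d\d:\d\d) nsrd: deleted media notice: "
    r"Deleted volume: volid=(\d+),\s*volname=(\w+)"
)


def relabelled_volumes(log_text, since):
    """Return (volname, volid) pairs for volumes deleted at or after `since`.

    `log_text` is the daemon log contents as one string; `since` is a
    datetime, e.g. the time of the last good bootstrap (hypothetical input).
    """
    found = []
    for stamp, volid, volname in NOTICE.findall(log_text):
        # Timestamps in the quoted log are month/day/two-digit-year.
        when = datetime.strptime(stamp, "%m/%d/%y %H:%M:%S")
        if when >= since and (volname, volid) not in found:
            found.append((volname, volid))
    return found
```

Feed it the log text and the bootstrap time, and the volume names it
returns are the candidates to run scanner against.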

>I tried using scanner - but when I load the volume with nsrjb it just
>spits it back out saying the volume is not part of the media database.

Are you asking NetWorker to load and mount (i.e. `-l' to nsrjb) or
just load (i.e. `-l -n' to nsrjb)?   Scanner doesn't need the tape
mounted via NetWorker; it just needs the tape in the drive and ready to go.
You can load the tapes manually in a really dire situation.  If you
have some kind of robot control commands for Solaris 2.6, you can use
them to get the tape into a drive, and then run scanner on that device.

>I know that, it is the media database I want to fix!!

What options did you use with scanner?  Did you use either `-i' or `-m'?
I think I would try it with `-i', but in your case you *might* be able to
get away with just `-m', since you haven't mentioned client index
corruption and you therefore only need to tell NetWorker where certain
savesets are on tape.
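
To make the sequence concrete, here is a small Python sketch that only
builds the two command lines for one tape and does not run anything.  The
helper name `scan_commands' is hypothetical, the device path is the
placeholder used earlier in this message, and the use of `-f' to name the
device for nsrjb is my assumption about your setup; check the nsrjb and
scanner man pages for your release before running the output.

```python
def scan_commands(device, volume, rebuild_client_index=False):
    """Build (but do not run) the commands to load one tape and scan it.

    `device` and `volume` are whatever your site uses; nothing here
    validates them.
    """
    # -i rebuilds client index and media database entries; -m rebuilds
    # media database entries only.
    scanner_flag = "-i" if rebuild_client_index else "-m"
    return [
        # Load the tape into the drive without mounting it in NetWorker.
        ["nsrjb", "-l", "-n", "-f", device, volume],
        # Read the tape and feed what is found back to NetWorker.
        ["scanner", scanner_flag, device],
    ]


if __name__ == "__main__":
    for cmd in scan_commands("/dev/rmt/whatever", "000079"):
        print(" ".join(cmd))
```

Unloading the tape afterwards is deliberately left out, since how you do
that depends on whether you drive the robot through nsrjb or by hand.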

Unless you left off the `-i' or `-m' option to scanner, you are (I
think) on the right track.

Tim
--
Tim Mooney                              mooney AT dogbert.cc.ndsu.NoDak DOT edu
Information Technology Services         (701) 231-1076 (Voice)
Room 242-J6, IACC Building              (701) 231-8541 (Fax)
North Dakota State University, Fargo, ND 58105-5164
