2004-04-27 19:44:35
Subject: Re: [Networker] Problem child tape -- what to do???
From: George Sinclair <George.Sinclair AT NOAA DOT GOV>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Tue, 27 Apr 2004 19:45:58 -0400
Darren,

Please see my responses below.

Thanks.

George

Darren Dunham wrote:
>
> > explain, and I have some questions on how best to proceed.
> >
> > 1. Output from: mminfo -aq 'volume=volume_name' shows this volume as
> > having 7 savesets. No problems here. Whether we believe it or not is
> > another matter.
>
> Probably shouldn't.
>
> > 2. When I ran the scanner command to scan the tape to obtain its
> > volid, I ran it as: scanner -inv <drive> and saved the output to a
> > file (sample output included below). I then went in to look at the
> > contents, and I noticed several interesting things. First, among all
> > the voluminous output, there were 5 different ssids reported, but
> > 99.9% of the file refers to only three of them, so the other 2 are
> > referenced in only a few places. None of the 5 is listed in the
> > database. I ran mminfo -aq 'ssid=ssid_number' for each and none
> > turned up! The other interesting thing is that none of these ssids
> > match any of the ssids reported in item 1 above!
>
> Since the volids are different, I wouldn't expect the ssids on the tape
> to match those for the volume name in the database.  I wouldn't be
> surprised if they partially matched those on other volumes, but it looks
> like you weren't that lucky.

Guess not.
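
For anyone repeating this check, here is a rough Python sketch of how the
ssids could be tallied from a saved scanner output file. It assumes the
ssids always appear as "ssid <number>", as they do in the sample output
further down; the function name tally_ssids is just something made up for
illustration, and other scanner message formats may need a different
pattern.

```python
import re
from collections import Counter

# Assumption: ssids show up as "ssid <digits>", as in the sample lines
# "scanner: ssid 1093463041: NOT complete" and
# "scanner: (ssid 1093463041) error decoding save stream".
SSID_RE = re.compile(r"\bssid\s+(\d+)")


def tally_ssids(lines):
    """Count how many times each ssid is mentioned in scanner output."""
    counts = Counter()
    for line in lines:
        for ssid in SSID_RE.findall(line):
            counts[ssid] += 1
    return counts


if __name__ == "__main__":
    # Lines copied from the sample output in this message.
    sample = [
        "scanner: scanning file 72, record 500",
        "scanner: ssid 1093463041: NOT complete",
        "scanner: ssid 1093463041: 93 GB, 771443 file(s)",
        "scanner: (ssid 1093463041) error decoding save stream",
    ]
    for ssid, mentions in tally_ssids(sample).most_common():
        print(ssid, mentions)
```

Sorting by mention count makes the rarely referenced ssids easy to spot
at the bottom of the list.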

>
> > One of the two ssids that's only reported a few times
> > corresponds to a file system that wasn't even reported in 1
> > above. There are also some error messages near the end of the
> > file. Also, near the end, it reports that there is a continued saveset
> > on another tape, and it reports the file system and the ssid, but this
> > file system again is not one of the 7 listed in item 1, and I checked
> > for the ssid and it's not listed in the database either. I've provided
> > a sample output below.
> >
> > I'm beginning to wonder if the on-line file index entries for the data
> > on this tape are just completely bogus -- as in they match some other
> > volume and not this one, or maybe they're just a fantasy.
>
> I would imagine it was reality at one point in time.  However, the volume
> that it refers to may no longer exist, or may no longer be accessible.
>
> Have you ever had to recover the media database on this server?  Does
> the savetime of either the real volume or the media volume correspond to
> that time?

Not in a very long time. Several years ago.

>
> Is there any chance that there was a duplicate volume created in the
> past either by reusing a barcode or by creating an accidental duplicate
> barcode?

There is definitely that possibility. The one strange thing here, too,
is the fact that the tape had been out of the library for quite some
time. I came across it one day and noticed it because it did not have a
barcode or even a plain label to identify what it was. It was write
protected, however. Of course, I inventoried it to find out what it was,
and that's when NetWorker identified it as FUL649 with savesets from
early March 2004. I then tried to mount it, and I don't even remember
now why, and that's when I got the error about the volume not being in
the index.

>
> > Or, is it instead the other way around? I guess I was planning to
> > rebuild just the media database by running: scanner -m <drive> but at
> > least one person suggested maybe deleting the volume first. If I do
> > that, however, then won't the file indexes get wiped out, too?
>
> Yes they would.  But they're almost certainly for some other volume (not
> the one in your hands).  Unless you can find that other volume, the file
> (and media) indexes are useless.

I don't know how I would find it.

>
> > Seems then that I would need to run scanner -i to rebuild both
>
> correct.
>
> > , but I'm thinking maybe I don't wanna do that based on some of the
> > error messages ... I mean maybe that might end up corrupting some of
> > the valid online file index information. Maybe just rebuild media
> > database and not delete volume first?
>
> The problem is trying to figure out what really happened here.  Does the
> time for the savesets and the name on the savesets (on the physical
> media) make sense for your environment?  Do you have other savesets in
> your database from that same day?  *should you*? (perhaps normally they
> should have expired).

We have a one year retention policy and a 1-2 month browse policy on all
clients. We never place tapes that are eligible for recycling into the
libraries unless they are write-protected. The only time a tape is ever
recycled is manually by myself and then, of course, I remove the
write-protection first. The only saveset names that I saw in the scanner
output were hostname:/4 and hostname:/db. The ssids reported for these
do not show up in the database (checked as:
mminfo -s server -avq 'ssid=value'), and hostname:/db was not listed
among the savesets reported via the volumes window for this tape, but
hostname:/4 was.
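
To avoid typing that check by hand for every ssid, something like the
following Python sketch could build the command lines. It only uses the
mminfo options already shown above (-s, -avq and the 'ssid=value' query);
the helper name mminfo_ssid_command and the placeholder server name are
made up for illustration, and the sketch prints the commands rather than
running them.

```python
import shlex


def mminfo_ssid_command(server, ssid):
    """Build the argument list for: mminfo -s server -avq 'ssid=value'."""
    ssid = str(ssid)
    if not ssid.isdigit():
        # ssids in the scanner output are plain numbers; reject anything else.
        raise ValueError(f"ssid must be numeric, got {ssid!r}")
    return ["mminfo", "-s", server, "-avq", f"ssid={ssid}"]


if __name__ == "__main__":
    # "server" is a placeholder; the ssid is the one from the sample output.
    for ssid in ["1093463041"]:
        print(shlex.join(mminfo_ssid_command("server", ssid)))
```

The argument list can also be handed to subprocess.run on the NetWorker
server if you would rather run the queries directly.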

>
> > scanner: scanning file 72, record 400
> > scanner: scanning file 72, record 500
> > scanner: fn 72 rn 506 read error Input/output error
> > scanner: Opened /dev/nst4 for read
> > scanner: fn 73 rn 0 read error Input/output error
> > scanner: Opened /dev/nst4 for read
> > scanner: fn 73 rn 0 read error Input/output error
> > scanner: Opened /dev/nst4 for read
> > scanner: ssid 1093463041: NOT complete
> > scanner: ssid 1093463041: 93 GB, 771443 file(s)
> > scanner: done with sdlt tape FUL649
>
> Hmm.  That is a bit strange.
> >
> > scanner: Rewinding...
> > scanner: Rewinding done
> > scanner: the following save sets continue on another volume:
> > client name  save set  save time      level  size         files   ssid        S
> > client1      /raid     2/29/04  0:00  f      95378219156  771443  1093463041  S
> > scanner: when next volume is ready, enter device name (or `q' to quit) [rd=snode:/dev/nst4]?q
> > scanner: (ssid 1093463041) error decoding save stream
> > scanner: (ssid 1093463041) would have added 293506 new file index entries
>
> If you don't have the continuation savesets in your database, and you
> can't find any sign of them, I wouldn't scan them in.

I concur.
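
For what it's worth, the read errors and incomplete save sets can be
pulled out of a saved scanner output file with a few lines of Python.
This is only a sketch based on the message formats in the sample output
above ("fn N rn N read error" and "ssid N: NOT complete"); the function
name summarize_scanner_log is made up, and other scanner versions may
word these messages differently.

```python
import re

# Assumed formats, taken from the sample output in this thread.
READ_ERR_RE = re.compile(r"fn (\d+) rn (\d+) read error")
INCOMPLETE_RE = re.compile(r"ssid (\d+): NOT complete")


def summarize_scanner_log(lines):
    """Return (read error positions, incomplete ssids) from scanner output.

    Read errors are (file number, record number) pairs in first-seen
    order, with repeats of the same position dropped.
    """
    errors = []
    incomplete = set()
    for line in lines:
        match = READ_ERR_RE.search(line)
        if match:
            position = (int(match.group(1)), int(match.group(2)))
            if position not in errors:
                errors.append(position)
        match = INCOMPLETE_RE.search(line)
        if match:
            incomplete.add(match.group(1))
    return errors, incomplete


if __name__ == "__main__":
    sample = [
        "scanner: fn 72 rn 506 read error Input/output error",
        "scanner: fn 73 rn 0 read error Input/output error",
        "scanner: fn 73 rn 0 read error Input/output error",
        "scanner: ssid 1093463041: NOT complete",
    ]
    errors, incomplete = summarize_scanner_log(sample)
    print("read errors at (file, record):", errors)
    print("incomplete ssids:", sorted(incomplete))
```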

>
> I might also do a spot check of some other "similar" tapes (perhaps from
> the same time frame) to see if any of them have volid mismatches.

I'll do that.

>
> --
> Darren Dunham                                       ddunham AT taos DOT com
> Senior Technical Consultant         TAOS            http://www.taos.com/
> Got some Dr Pepper?                           San Francisco, CA bay area
>          < This line left intentionally blank to confuse you. >
>
> --
> Note: To sign off this list, send a "signoff networker" command via email
> to listserv AT listmail.temple DOT edu or visit the list's Web site at
> http://listmail.temple.edu/archives/networker.html where you can
> also view and post messages to the list.
> =*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
