[Networker] Identifying suspect savesets and clflags attribute?

2004-07-23 15:31:53
Subject: [Networker] Identifying suspect savesets and clflags attribute?
From: George Sinclair <George.Sinclair AT NOAA DOT GOV>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Fri, 23 Jul 2004 15:33:57 -0400
Hi,

Is there a command or a way to script or query the database for
"suspect" savesets? Can the clflags attribute really show suspect
savesets that have not been cloned? If so, seems like a contradictory
attribute name if you didn't clone the saveset(s).

Here's the deal. We had a large (137 GB) full saveset (3 volumes) that
was identified in the GUI saveset recover and volumes window with a
status of 'browsable suspect'. However, nobody noticed this at the time.
Recently, we had to recover this saveset. Everything went smoothly on
the first tape, and it read part of the data on the second tape and then
generated a read error:

Jul 18 06:51:29 server root: [ID 702911 daemon.notice] NetWorker Media: (info) loading volume FUL718 into rd=snode:/dev/nst5
Jul 18 09:38:13 server root: [ID 702911 daemon.notice] NetWorker media: (info) can not read record 6328 of file 135 on sdlt tape FUL718

It then immediately moved on to the third (last) tape, read everything
there okay and completed, but the recovery was incomplete due to the
tape 2 problem. We got our data back by going back to a previous full
with subsequent incrementals. I guess the read error is not surprising
given that the status was 'suspect'. Anyway, after all this, I'd really
like a way to identify this in the future by possibly scripting
something to check the database so I don't have to crawl through every
saveset in the recover window.

First, we do have "Auto media verify" turned on for all pools, and I
checked the savegroup completion notification for the group, and that
saveset was listed with a 'V' in front of it, not that that means much. Next,
I checked the man page for mminfo, and I see that there is a 'suspect'
option, but this seems to be used for identifying savesets that were
reported as such during recovery and not backup. Also, I see this
clflags option that can report suspect stuff, but that seems to be for
clones. The man page describes clflags as: "The clone flags summary, from
the set ais for aborted, incomplete and suspect (read error),
respectively." I know cloning can validate the readability of a saveset
since it has to read the data as it's cloning it, but this data we'd
prefer not to clone. We can re-run a backup if we know it's suspect,
though, and we'd like to know as soon after it completes, before too
much time has passed.

Anyway, I tried running mminfo, using the ssid of the saveset both with
and without the clflags option, and I do see an 's' reported, but only
when running with the clflags, and these tapes have never been cloned.

mminfo -vq 'ssid=3834391297,valid' -r 'volume,client,name,state,ssflags,sumflags,clflags'
 volume        client   name                              ssflags fl clflg
FUL716         client1  /1-raid5/exports/www              vF     hb s
FUL718         client1  /1-raid5/exports/www              vF     mb s
FUL720         client1  /1-raid5/exports/www              vF     tb s

It seems odd that clflags would report something that has not been
cloned, but if that's the magic ticket then that's fine by me. Is this
the best or only way to do this? Does "Auto media verify" factor into
this at all?
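
For what it's worth, here is a minimal sketch of the kind of script I have in mind, assuming the 's' in clflags really is a reliable marker. It asks mminfo for a narrower report (volume, client, ssid, clflags) so that every column except the last is always filled in and contains no spaces, which makes the rows safe to split on whitespace. The function names, the report spec, and the assumption that mminfo prints columns in the order requested are mine and untested against a live server; only the -v, -q and -r options and the attribute names come from the command above.

```python
import subprocess

# Assumption: columns come back in the order requested. The first three
# fields never contain spaces, so a fourth token, when present, is clflags.
REPORT = "volume,client,ssid,clflags"


def run_report(query):
    """Run mminfo with the given query string and return its text output.

    Hypothetical wrapper: it mirrors the -v/-q/-r usage shown above and
    has not been tested against a real NetWorker server.
    """
    result = subprocess.run(
        ["mminfo", "-v", "-q", query, "-r", REPORT],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def find_suspect(report_text):
    """Return (volume, client, ssid) for each row whose clflags contains 's'."""
    suspects = []
    for line in report_text.splitlines():
        fields = line.split()
        # Skip the header, blank lines, and anything without a numeric ssid.
        if len(fields) < 3 or not fields[2].isdigit():
            continue
        clflags = fields[3] if len(fields) > 3 else ""
        if "s" in clflags:
            suspects.append((fields[0], fields[1], fields[2]))
    return suspects
```

Something like find_suspect(run_report('ssid=3834391297')) would then list the affected volumes for one saveset; a broader query string could be substituted to sweep more of the media database.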

Any help with this would be appreciated.

Thanks.

George
