Re: [Networker] Simple UNIX Clone script - for use by community (UPDATED)
2007-04-06 16:12:43
Reed, please see my responses interspersed below.
Reed, Ted G [IT] wrote:
Good point in using ssflags as a saveset health check. How would you code to vet
the ssflags as kosher before the nsrclone portion? The '!incomplete' just means not
incomplete savesets.....so currently running jobs (in-progress) and aborted jobs are not
duplicated, only finished savesets. But we don't make any checks on ssflags other than
the inferred "it can't have an 'i' or 'I' in ssflag" inherent in the
!incomplete. So it would clone regardless of pre-set suspect status (as you describe in
your email), but I would want it to have that behaviour. If it fails at cloning, I'd
rather have the failure and see it so I know to check the original saveset/media for
errors.
You might also be able to employ the '-v' flag as described in mminfo
man page wherein it says "... The second flag indicates the status of
the save set. A b indicates that the save set is in the online index and
is browsable via the recover(1m) command. An r indicates that the save
set is not in the online index and is recoverable via the scanner(1m)
command. An E indicates that the save set has been marked eligible for
recycling and may be over-written at any time. An a indicates that the
save was aborted before completion. Aborted save sets are removed from
the online file index by nsrck(1m). An i indicates that the save is
still in progress.".
Not saying you have to do this, but you might want to consider it. Isn't
it possible that it could have been marked (a)borted somehow? Also, if
the save set is somehow aborted or not completed successfully, I think
-v will also typically show 'E' for recycle, which in this case is too
new to have that value via the normal means. In other words, something
went wrong. I would think that you could simply check the ssflags and
the -v flags to ensure that they both show a healthy save set. So, for
example, -v should not show something like 'ca' and should instead show
something like 'cr' or 'cb' or 'hb' or 'mb' or whatever, just as long as
the second flag is not 'a' or 'E' or 'i'. Likewise, ssflags should be
vrF or vF. I think you'd need to run these separately, though. If it's
anything else then maybe have it email you or let you know that the
affected save sets were not cloned due to something suspicious in the
original save set. However, as you said, you're using the cloning
procedure to determine this anyway so maybe it's moot. I guess my point
is that I would be hesitant to attempt a clone on something that looked
to be a flagrant problem because I don't want to waste space on a clone
tape, only to have it bomb part way through, particularly on a large
save set because I can't get my space back, but I might try to first
clone it off to some other "test clone" pool instead before using the
real one to see what it does. Just an idea, but maybe using '-v' and/or
also looking at the ssflags could make your script even more foolproof.
Then again, maybe just extra work [sigh].
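One way the pre-clone check described above could look, as a rough sketch only. It assumes the caller has already put each save set's ssid, ssflags value and '-v' flags column on one line (the text above suggests these come from separate mminfo runs, and how they get joined is left open). The function names are hypothetical and are not part of NetWorker or of the posted script.

```shell
# Hypothetical helper: succeeds only when both flag strings look healthy by
# the rules described above.
#   $1 = ssflags value, $2 = flags column from the -v report
saveset_looks_healthy() {
    case "$1" in
        vF|vrF) ;;             # the only ssflags values treated as healthy
        *) return 1 ;;
    esac
    case "$2" in
        ?[aEi]*) return 1 ;;   # second flag: aborted, recyclable, in progress
        ??*)     return 0 ;;   # any other value of two or more characters
        *)       return 1 ;;   # too short to judge, so treat as suspicious
    esac
}

# Hypothetical filter: reads "ssid ssflags vflags" lines on stdin, prints the
# ssids that pass, and reports the rest on stderr so they can be mailed or logged.
filter_healthy_ssids() {
    while read -r ssid ssflags vflags; do
        if saveset_looks_healthy "$ssflags" "$vflags"; then
            printf '%s\n' "$ssid"
        else
            printf 'not cloned: %s (ssflags=%s, -v flags=%s)\n' \
                "$ssid" "$ssflags" "$vflags" >&2
        fi
    done
}
```

The list printed on stdout could then be written to the file handed to nsrclone, while the stderr messages give the "let me know what was skipped" notice. Any header line or short row fails the checks and lands in the skipped report rather than in the clone list.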
We have a total of ~120 otherwise-valid-but-listed-as-suspect savesets
(vF, vrEF, etc), of which 26 are on Clone media, the other 95 or so are in the
BR Pool.....and many are spanned volumes so a single suspect saveset may mark
suspect as many as a dozen volume references. So that implies that during the
read back of saveset for clone creation, there was a read-error on the original
tape. But that'd usually never come to light if we didn't run the clone. And
even so, the copy is not necessarily suspect. For example, the below is a case
where the original is marked suspect (due to some read error on one of the
tapes) but the clone is not....and I'd believe it's fully recoverable, if not
both original and clone recoverable.
volume  client     pool            ssid        ssflags  clflg
M20760  pdahlb01z  BR Pool         1023713736  vF       s
M20824  pdahlb01z  BR Pool         1023713736  vF       s
M21834  pdahlb01z  Clone Vault TX  1023713736  vF
M21874  pdahlb01z  Clone Vault TX  1023713736  vF
M21935  pdahlb01z  BR Pool         1023713736  vF       s
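A listing like the one above can be scanned for save sets that still have at least one copy not marked suspect. The sketch below is only an illustration: it assumes rows in exactly that column order (volume, client, pool, ssid, ssflags, clflg), an empty clflg column for clean copies, and numeric ssids. The function name is made up for this example.

```shell
# Hypothetical helper: reads rows shaped like the listing above and prints
# each ssid that has at least one copy with an empty clflg column.
# Pool names contain spaces, so fields are counted from the end of the line.
ssids_with_clean_copy() {
    awk '
        # With no clflg present, the numeric ssid is the second-to-last field.
        # With a clflg present, that position holds ssflags and is skipped.
        NF >= 5 && $(NF-1) ~ /^[0-9]+$/ { clean[$(NF-1)] = 1 }
        END { for (id in clean) print id }
    ' | sort
}
```

Fed the six lines above, the header and the three suspect rows are ignored and the ssid is reported once because of the two Clone Vault TX rows.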
Also, are you sure a failed clone will result in a copy count increase? That should only be true if it was successful; otherwise there aren't multiple copies.
Well, you may be right, but I ran a command like this:
mminfo -s orion -q 'copies=2' -ar client,name,type,savetime,volume,copies,ssid,ssflags,clflags | egrep 's$'
I get several lines of output. All the affected save sets are from back
in 2003 and 2004 wherein the clflags value is 's' and the 'ssflags'
value is 'vrEF', but in each case, only the save set from the original
volume is shown, not the clone save set. However, each of these does
have a clone save set. The clone save sets have the same values for the
ssflags but nothing for clflags. Is it the case that the clflags value is
only for the original save set and does not apply to clones?
George
Thanks much for the feedback. I've seen your threads and commentary for a
long time on the list and very much welcome your input. Thanks again.
--Ted
-----Original Message-----
From: EMC NetWorker discussion [mailto:NETWORKER AT LISTSERV.TEMPLE DOT EDU] On
Behalf Of George Sinclair
Sent: Friday, April 06, 2007 10:57 AM
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Subject: Re: [Networker] Simple UNIX Clone script - for use by community
(UPDATED)
Shouldn't you also be checking the ssflags and clflags values for the original
save sets before cloning them? I've seen cases where an original save set was
marked 'suspect' even though it had never been read during a clone operation.
I've seen cases, too, where a save set completed (no abort) but whose ssflags
suggested something was awry. I would want to ensure that the ssflags were
golden before I started a clone on it. Is this what the '!incomplete' flag is
accomplishing? Also, what if the save set was previously cloned but not
successfully?
Its value for 'copies' would be >= 2, but it would still need to be cloned again, obviously to a different volume unless 'nsrmm -d -S ssid/cloneid' was run.
George
Reed, Ted G [IT] wrote:
Is it Monday again? Arghhhh.....here's the final, really right version.
mminfo -omo -r ssid -q "!incomplete,pool=BR Pool,copies=1,level=full,location='$JB'" -t "$TIME" > $LOG_DIR.$JB
-----Original Message-----
From: Reed, Ted G [IT]
Sent: Friday, April 06, 2007 10:26 AM
To: 'EMC NetWorker discussion'; Reed, Ted G [IT]
Subject: RE: [Networker] Simple UNIX Clone script - for use by
community (UPDATED)
Sorry folks, I cut/pasted an old, nonworking version of the mminfo (bad
quotation management for variable interpretation). Here's the RIGHT version.
So the script below, plus this fix, equals working UNIX script again.
mminfo -omo -r ssid -q "!incomplete,pool=BR Pool,copies=1,level=full,location='$SN'" -t "$TIME" > $LOG_DIR.$SN
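The quoting problem mentioned above comes down to how the shell treats single versus double quotes. The short sketch below shows the difference using a made-up jukebox name; no NetWorker command is involved.

```shell
JB="jb_example"   # hypothetical jukebox name, for illustration only

# Single quotes: the shell passes the text through untouched, so the query
# would contain the literal characters $JB.
single='location=$JB'

# Double quotes: $JB is expanded, and the inner single quotes are ordinary
# characters that end up in the query string.
double="location='$JB'"

printf '%s\n' "$single"   # prints: location=$JB
printf '%s\n' "$double"   # prints: location='jb_example'
```

The same reasoning applies to -t "$TIME": without the double quotes, a value such as "one week ago" is split into three separate arguments.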
-----Original Message-----
From: EMC NetWorker discussion [mailto:NETWORKER AT LISTSERV.TEMPLE DOT EDU]
On Behalf Of Reed, Ted G [IT]
Sent: Thursday, April 05, 2007 11:29 AM
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Subject: Re: [Networker] Simple UNIX Clone script - for use by
community (UPDATED)
NOTE: Outlook likes to remove "extra" line breaks sometimes.....which is really bad for scripting.
Be sure you check to see if it says "Extra line breaks in this message were removed", and if it
does, click on the message and "Restore line breaks". Otherwise, some of the code gets mangled.
Thanks.
--Ted
I'd like to thank Conrad Macina for providing an elegant piece of code to replace the ugly 'echo "JUKEBOXNAME" >> FILE' hard code repeats we had in place to list all defined jukeboxes. Why edit when you can look it up and maintain dynamic functionality?
PS. Reminder that the 'pool=BR Pool' in the mminfo call and the '-b "Clone Vault TX"' in the nsrclone are both specific to my environment and should be edited to meet your environment's needs. See previous thread entries for greater detail.
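To see what the dynamic jukebox lookup does without a NetWorker server at hand, the filter chain can be run against canned text. The sample lines below are a hypothetical stand-in for nsradmin output, not captured from a real system, and the function names are made up for this sketch.

```shell
# Hypothetical stand-in for the text that `nsradmin -i -` prints; the real
# output may be laid out differently.
sample_nsradmin_output() {
    printf '%s\n' \
        '                        name: jb_two;' \
        '' \
        '                        name: "jb_one";'
}

# Same filter chain as the script: keep the "name:" lines, take the last
# whitespace-separated field, strip quotes and semicolons, then sort.
extract_jb_names() {
    grep "name:" | awk '{print $NF}' | tr -d '";' | sort
}

sample_nsradmin_output | extract_jb_names   # prints jb_one, then jb_two
```

Because awk keeps only the last field, a jukebox name containing spaces would be cut down to its final word, so this approach presumably relies on names without spaces.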
----------BEGIN CODE----------
#!/bin/ksh
#
# Name: clone_script.sh
#*******************
# Script Variables *
#*******************
LOG_DIR="./ssid"
JB_NAMES="./JB_NAMES"
TIME="one week ago"
#**********************************************************************************************
# Change directory used because ps -ef output was too long and was breaking the if statement. *
# By using the change directory we shorten the ps -ef output to manageable length             *
#**********************************************************************************************
cd /usr/local/Legato/clone
#**************************************************
# This writes the jukebox names into a text file. *
#**************************************************
echo "show name \n print type:nsr jukebox" | nsradmin -i - | grep "name:" | awk '{print $NF}' | tr -d '";' | sort > $JB_NAMES
#*********************************************************************************************************
# Verifies the last nsrhost has finished. Want clone to include most recent master backup and bootstrap. *
#*********************************************************************************************************
while ps -ef | grep nsrhost | grep -v grep
do
    sleep 300
done
#**************************************************************************************************************
# Section to check for running clones, if not generate list of save sets to be cloned and executes the clone. *
#**************************************************************************************************************
for JB in $(cat $JB_NAMES)
do
    if ! ps -ef | grep nsrclone | grep $JB | grep -v grep > /dev/null
    then
        # Double quotes let the shell expand $JB and keep "$TIME" as one argument.
        mminfo -omo -r ssid -q "!incomplete,pool=BR Pool,copies=1,level=full,location='$JB'" -t "$TIME" > $LOG_DIR.$JB
        nsrclone -b "Clone Vault TX" -S -f $LOG_DIR.$JB &
    fi
done
----------END CODE----------
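The "is a clone already running for this jukebox" test in the script is a chain of greps over ps output. A small sketch of that check as a reusable function follows; the function name is hypothetical, and it reads process text from stdin so it can be tried on canned input.

```shell
# Hypothetical helper: succeeds when stdin (for example the output of
# `ps -ef`) has a line mentioning both words. grep -F treats the words as
# fixed strings, and `--` protects values that start with a dash.
has_process_line() {
    grep -F -- "$1" | grep -F -- "$2" | grep -v grep > /dev/null
}
```

In the loop it could be used as: if ! ps -ef | has_process_line nsrclone "$JB". As with the original chain, this is a substring match, so a jukebox name that is a prefix of another name would match both.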
To sign off this list, send email to listserv AT listserv.temple DOT edu and
type "signoff networker" in the body of the email. Please write to
networker-request AT listserv.temple DOT edu if you have any problems with
this list. You can access the archives at
http://listserv.temple.edu/archives/networker.html or via RSS at
http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER
--
George Sinclair - NOAA/NESDIS/National Oceanographic Data Center
SSMC3 4th Floor Rm 4145 | Voice: (301) 713-3284 x210
1315 East West Highway | Fax: (301) 713-3301
Silver Spring, MD 20910-3282 | Web Site: http://www.nodc.noaa.gov/
- Any opinions expressed in this message are NOT those of the US Govt. -