Networker

Re: [Networker] Merits of cloning versus dual backups?

2003-04-21 13:31:47
Subject: Re: [Networker] Merits of cloning versus dual backups?
From: Joel Fisher <jfisher AT WFUBMC DOT EDU>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Mon, 21 Apr 2003 13:31:45 -0400
Q1) Option A vs Option B

Option A, when given a file (or stdin, "-") with the list of ssids, will rearrange 
that list so it can clone most efficiently, i.e. it'll clone all the requested 
ssids from volume 0001, keeping the multiplexing as is (minus the unwanted 
ssids).  Then it'll move on to a volume that either has continued ssids from 
0001 or has more ssids from the list you passed nsrclone.

FYI Option A is actually

$ nsrclone -s server -b pool -S -f file  #or
$ mminfo -r ssid -q query | nsrclone -s server -b pool -S -f -

Option B will clone one ssid at a time, running through the whole tape (at least 
until the ssid ends) and any continuation tapes for each ssid, and then start the 
process all over again for the next ssid.  Lots of unneeded overhead.  One 
benefit is that your cloned ssids will be de-multiplexed.  To me it's not worth 
the added time it takes to make the clones and the added strain on my drives, 
because passing just one ssid at a time can rarely keep my drives streaming.
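
A rough sketch of the two styles side by side, in case it helps.  The function 
names and the RUN dry-run switch are made up for illustration; only the 
nsrclone flags come from the commands above.  By default it just prints the 
commands it would run.

```shell
#!/bin/sh
# Dry-run sketch: prints the nsrclone commands instead of running them.
# Set RUN="" in the environment to run nsrclone for real (assumes it is on PATH).
RUN="${RUN-echo}"

# Option A: a single nsrclone call that reads the whole ssid list from stdin.
# args: server, clone pool
clone_batch() {
    $RUN nsrclone -s "$1" -b "$2" -S -f -
}

# Option B: one nsrclone call per ssid read from stdin.
# args: server, clone pool
clone_each() {
    while read -r ssid; do
        [ -n "$ssid" ] || continue          # skip blank lines
        # </dev/null keeps the command from eating the rest of the list
        $RUN nsrclone -s "$1" -b "$2" -S "$ssid" </dev/null
    done
}
```

Usage would be something like "clone_batch server pool < file" versus 
"clone_each server pool < file".  Same list, but the second form starts a 
fresh clone session for every ssid.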


Q2) Is it multiplexing?

Umm, sort of... I don't know if it would technically be called multiplexing, 
because it is one clone stream (as far as I know), but the multiplexing does 
remain on the cloned volume.

Q3) Minimum number of ssids in list

Don't think there is a minimum, but if you pass it one ssid from each of 20 
different volumes (1 per volume) then you see no gain, because it still has to 
load each tape as it goes.  I have a set of pools labeled 
OffsiteB30DaysR30Days/OffsiteB30DaysR1Year/etc... I keep my cloned volumes 
online (in the silo) until the browse period runs out.  So all my pools 
Off...B30Days... clone to a clone pool called OnlineCloneB30Days.  For each 
backup pool I run a mminfo command like below and pass the list to nsrclone.

$ mminfo -r 'ssid' -q \
  '!suspect,!inuse,!incomplete,pssid=0,copies=1,pool=OffsiteB30DaysR30Days,location=STK9310'

nsrclone then figures out the most efficient way to clone all those ssids.
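
Put together as a small script, the per-pool step might look roughly like 
this.  build_query and clone_pool are hypothetical helper names, and "server" 
is a placeholder as in the commands earlier; the query string and the flags 
are the ones shown above.  I haven't shown what happens when the query matches 
nothing, so check that on your own setup first.

```shell
#!/bin/sh
# Build the mminfo query for one backup pool and one location.
# args: pool name, location
build_query() {
    printf '!suspect,!inuse,!incomplete,pssid=0,copies=1,pool=%s,location=%s\n' "$1" "$2"
}

# List the matching ssids and hand the whole list to a single nsrclone run.
# args: server, backup pool, location, clone pool
# Assumes mminfo and nsrclone are on PATH; defined here but not called.
clone_pool() {
    mminfo -r ssid -q "$(build_query "$2" "$3")" |
        nsrclone -s "$1" -b "$4" -S -f -
}
```

For example: "clone_pool server OffsiteB30DaysR30Days STK9310 OnlineCloneB30Days", 
repeated for each backup pool.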

Q4) As far as I can tell the order doesn't matter... I suppose it might matter 
in deciding which volume to start with (just a guess), but it certainly doesn't 
use the order you give it for the entire cloning process.

Q5) Yes
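
If you want to see how a list spreads across volumes before cloning (the 
situation in Q3 and Q5), a quick count per volume is one way to check.  This 
is only a sketch: ssids_per_volume is a made-up helper, and it assumes input 
lines of the form "volume ssid", such as you might get from 
mminfo -r 'volume,ssid', with no spaces in the volume names.

```shell
#!/bin/sh
# Count how many ssids in the list live on each volume.
# Reads "volume ssid" lines on stdin, prints "volume count" sorted by volume.
ssids_per_volume() {
    # Skip blank lines and a "volume ..." header line, in case one is printed.
    awk 'NF >= 2 && $1 != "volume" { n[$1]++ }
         END { for (v in n) print v, n[v] }' | sort
}
```

For example: mminfo -r 'volume,ssid' -q query | ssids_per_volume.  Lots of 
volumes with a count of 1 is the "no gain" case from Q3, since each tape still 
has to be loaded for a single ssid.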

Any Legato personnel or Legato Gurus feel free to correct me if I've misstated 
anything.

Hope this helps,

Joel

-----Original Message-----
From: George Sinclair [mailto:George.Sinclair AT noaa DOT gov] 
Sent: Monday, April 21, 2003 12:18 PM
To: Legato NetWorker discussion; Joel Fisher
Subject: Re: [Networker] Merits of cloning versus dual backups?

Thanks, Joel. I appreciate your patience, but I guess I'm still confused
or ignorant of how nsrclone knows you want to do the whole volume just
because you pass it all the ssids, less the ones you want excluded. And
what's the difference between doing the whole volume versus the
sequential method? At what point does nsrclone decide that you're trying
to do the whole enchilada? I see that the command does support a '-f'
option -- perhaps this is the way you invoke it? -- but there are five
questions that I think will clear up a lot of my confusion:

1. Does the '-f' option somehow make nsrclone run more efficiently? In
other words, what's  the difference between having all the ssids in a
file named my_file and running:

'nsrclone -s server -b pool -f my_file'   (Option A)

versus just looping through the file sequentially using a script and
thus invoking nsrclone separately on each ssid read in as:

'nsrclone -s server -b pool -S ssid'  (Option B)

Would this not achieve the same end result? I don't see why it wouldn't.

2. If the '-f' option allows NetWorker to somehow look at all the ssids
in the list and see that they all come from the same volume, what is it
doing that it would not be doing if you ran it as in option B above?
This is the $10,000 question, and I think this is where I'm in the dark.
I see that you don't want to just clone the whole volume because there
is stuff on there that you don't want, but I don't see how a list of
what you do want would make it behave better or faster. It's not
multi-plexing when it clones by volume, right?

3. If NetWorker is somehow behaving differently when it knows it's
cloning a volume versus sequentially then what is the minimum number of
ssids from said tape that you need to supply it to convince it of this?

4. Is the order of the ssids in the file important? Must they be listed
in the order that they appear on the original tape?

5. What if your input list or file has, say, 10 ssids listed from one
volume and another 5 from another? Will nsrclone be smart enough to know
to clone the first volume consisting of the 10 ssids and then the second
volume of 5 ssids even though the real volumes may contain more?

Sorry to be so long winded.

Thanks.

George

Joel Fisher wrote:
> 
> Hey George,
> 
> I do pass a list of ssids to nsrclone, but since that list contains all valid 
> ssids on each volume I want to clone, in practice it actually clones the whole 
> volume.  nsrclone is smart enough to take the list of ssids and clone them 
> one volume at a time as opposed to sequentially cloning one ssid at a 
> time.  I don't use the clone volume feature because I want to be able to 
> exclude non-valid ssids (incomplete, suspect).  I suppose the "per volume" 
> might do this automatically... but this way I know for certain I'm not 
> getting any junk data on my clone tapes.
> 
> FYI I also send my original and now "verified" tapes offsite.  So any online 
> restores come from the cloned media.  This process verifies my offsite media, 
> and to a degree does a "random" verify on my cloned data.
> 
> Joel
> 
> -----Original Message-----
> From: George Sinclair [mailto:George.Sinclair AT noaa DOT gov]
> Sent: Monday, April 21, 2003 10:04 AM
> To: Legato NetWorker discussion; Joel Fisher
> Subject: Re: [Networker] Merits of cloning versus dual backups?
> 
> We found that out when someone attempted a recover. We did not have the
> blocking set correctly, or rather NetWorker was not using variable block
> size and was instead defaulting to some other default size. What this
> caused was two things. First, it took NetWorker a long time to reach the
> correct part of the tape since positioning by record was disabled due to
> the wrong block size and second, recovery of the data itself was very
> slow, and I'm thinking it much slower than it should have been, so I
> suspect that the wrong blocking size may have also contributed to slower
> recover times once NetWorker did find the start of the data. Anyway, we
> resolved the problem, but in our case, we learned the easy way, too,
> since we didn't operate like that for too long before fixing the problem.
> 
> I have a question, though, where you say "I clone full volumes". How
> then, or maybe I should ask why would you pass it a list of ssids? Are
> you in fact generating a list of ssids and then using some kind of
> script to process each one? If so, is it working through the list one at
> a time? Why not just use the clone volume feature so you don't have to
> generate the list?
> 
> Thanks.
> 
> George
> 
> Joel Fisher wrote:
> >
> > Not sure if I missed this point in the thread, but as long as you're 
> > cloning all or most of the ssids on a volume you should get nearly maximum 
> > drive speed on cloning.  My cloning runs anywhere between 20-80Mbs, because 
> > I clone full volumes (or all !incomplete,!suspect ssids).  From what I see 
> > it doesn't do any de-multiplexing, it just reads data from the list of ssids 
> > you passed to it and writes it in the same order.
> >
> > I started full scale cloning just about 2 months ago and it has already 
> > saved my butt.  It turned out we had a firmware issue with our drives, but 
> > I never got any errors until we attempted to read the data.  Now, because 
> > of the cloning we are attempting to read every single saveset we back up so 
> > I started seeing a bunch of media errors.  After a couple of weeks of 
> > troubleshooting we wound up putting the latest firmware on the drives; now 
> > all is well.  Moral of the story is... if I hadn't been cloning I wouldn't 
> > have known I wasn't getting good backups until someone needed a restore.  
> > Not a good time to find that out.
> >
> > Joel
> >
> > -----Original Message-----
> > From: Teresa Biehler [mailto:tpbsys AT RIT DOT EDU]
> > Sent: Friday, April 18, 2003 3:45 PM
> > To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
> > Subject: Re: [Networker] Merits of cloning versus dual backups?
> >
> > Cloning verifies that the original copy is good, but not that the clone
> > is.  So, which copy do you send off site?  Do you send the original
> > because you know that copy is verified?
> >
> > -Teresa
> >
> > Terry Lemons wrote:
> > >
> > > AMEN to that.  In fact, I used cloning for two purposes:
> > > o create a second copy for offsite storage
> > > o validate that the original copy is good
> > >
> > > In my experience, there is no other way to accomplish that second goal,
> > > except by cloning.
> > >
> > > tl
> > >
> > > Terry Lemons
> > > CLARiiON Applications Integration Engineering
> > >         EMC²
> > > where information lives
> > >
> > > 4400 Computer Drive, MS D239
> > > Westboro MA 01580
> > > Phone: 508 898 7312
> > > Email: Lemons_Terry AT emc DOT com
> > >
> > > -----Original Message-----
> > > From: Wood, R A (Bob) [mailto:WoodR AT CHEVRONTEXACO DOT COM]
> > > Sent: Thursday, April 17, 2003 10:04 AM
> > > To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
> > > Subject: Re: [Networker] Merits of cloning versus dual backups?
> > >
> > > The thing to note is, when you make a clone, you are validating the
> > > original backup. This could be priceless.
> > >
> > > Bob
> > >
> > > --
> > > Note: To sign off this list, send a "signoff networker" command via email
> > > to listserv AT listmail.temple DOT edu or visit the list's Web site at
> > > http://listmail.temple.edu/archives/networker.html where you can
> > > also view and post messages to the list.
> > > =*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
