Networker

Re: [Networker] Multiplexing and demultiplexing on clones?

2003-05-02 15:22:37
Subject: Re: [Networker] Multiplexing and demultiplexing on clones?
From: George Sinclair <George.Sinclair AT NOAA DOT GOV>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Fri, 2 May 2003 15:22:33 -0400
This clears it up once and for all! Thanks Terry and all others who
responded.

George

> Terry Clayton wrote:
>
> Hi George,
>
> Sorry, I thought I had cleared things up for you.  Regarding your
> comment "The horse's voice sounds different after it's been translated
> by five farmers."   I'm Director of QA in Engineering, previously
> Director of Networker Core Engineering, and the info I provided to you
> came directly from the system architect of the Networker core team.
>
> Just in case any doubt remains ......
> Carl Farnsworth is right in saying
> "If nsrclone is given a LIST of savesets to clone, regardless of
> whether it's -S ssid1 ssid2 ssid3, or the -f filename option, or
> auto-cloning after savegrp, it will read through the tape, and
> whenever it sees an ssid on the list, will start to clone it.  It'll
> grab the first chunk of ssid1; if the next chunk belongs to ssid2,
> it'll start cloning that saveset, etc.  If the next chunk belongs to a
> saveset NOT on the list, it simply skips it."
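
The single-pass behaviour Carl describes can be sketched as a simple filter over the tape. This is only an illustration, not NetWorker code: the (ssid, sequence) chunk tuples and the `clone_pass` helper are hypothetical names invented for the example, and the real on-tape format is not modelled.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical model of a tape: a sequence of (ssid, chunk sequence number)
# pairs. This is an assumption for illustration, not NetWorker's tape format.
Chunk = Tuple[str, int]


def clone_pass(source_tape: Iterable[Chunk], wanted: Set[str]) -> List[Chunk]:
    """Read the source serially and copy every chunk whose ssid is on the list."""
    clone_tape: List[Chunk] = []
    for ssid, seq in source_tape:
        if ssid in wanted:
            clone_tape.append((ssid, seq))
        # Chunks belonging to savesets not on the list are simply skipped.
    return clone_tape


# ssid1, ssid2 and ssid9 were multiplexed on the source; only two are cloned.
source = [("ssid1", 0), ("ssid2", 0), ("ssid9", 0),
          ("ssid1", 1), ("ssid2", 1), ("ssid9", 1)]
clone = clone_pass(source, {"ssid1", "ssid2"})
print(clone)
```

In this model the clone keeps ssid1 and ssid2 interleaved in their source order, while ssid9 drops out.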
>
> so is Rich Mangel
> "If the data is multiplexed on the original tape, and the ssid's that
> are multiplexed are passed to a single nsrclone process, then that
> data will be multiplexed on the destination clone tape.
>
> In order to 'de-multiplex' the data on the destination clone tape, the
> ssid's will have to be passed to individual nsrclone processes - and
> each nsrclone process will need to make a scan of the volume,
> resulting in reading the original volume multiple times."
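
The cost Rich points out can be made concrete with the same kind of toy model. The sketch below assumes each simulated nsrclone process scans the whole source volume; that assumption, the chunk tuples, and the `clone_separately` helper are all hypothetical, chosen only to show why demultiplexing multiplies the reads.

```python
from typing import List, Sequence, Tuple

# Hypothetical tape model: (ssid, chunk sequence number) pairs.
Chunk = Tuple[str, int]


def clone_separately(source_tape: Sequence[Chunk],
                     ssids: Sequence[str]) -> Tuple[List[Chunk], int]:
    """Simulate one clone process per ssid and count the chunks read in total."""
    clone_tape: List[Chunk] = []
    chunks_read = 0
    for ssid in ssids:                # one simulated nsrclone process per ssid
        for chunk in source_tape:     # assumed: each process scans the whole volume
            chunks_read += 1
            if chunk[0] == ssid:
                clone_tape.append(chunk)
    return clone_tape, chunks_read


source = [("ssid1", 0), ("ssid2", 0), ("ssid9", 0),
          ("ssid1", 1), ("ssid2", 1), ("ssid9", 1)]
demuxed, reads = clone_separately(source, ["ssid1", "ssid2"])
print(demuxed)
print(reads, "chunk reads for a", len(source), "chunk source tape")
```

Here each saveset ends up contiguous on the clone, but the six-chunk source is read twice, once per saveset.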
>
> ..as is Steve Barber
> "I think you're making it more complicated than it really is.  Here's
> how I think of it:
>
>  - data is written to tape one chunk at a time.  Each chunk is part of
> one savestream, i.e. from a single client.
>  - when multiplexing is occurring it takes a chunk from each active
> savestream in turn and writes them to tape.
>
>  When you're cloning, it reads the tape serially.  It has to read each
> chunk off of the source tape and check to see if it's a saveset it's
> supposed to be cloning.  If it is, it simply copies it to the
> destination tape.
>
>  So I assume it would simply start with the lowest numbered SSID that
> it wants to clone and start copying those chunks to the destination
> tape.
>
>  However, if in the course of copying those chunks it runs into chunks
> of  other SSIDs that it wants to copy, it probably just starts copying
> those also, thus some degree of multiplexing would be maintained.
>
>  Chunks that are not to be cloned would be left out, of course.
> .........
>
>  The evidence that was presented seemed pretty conclusive to me that
> multiplexing was maintained (at least partially) during cloning, and
> if it really did demultiplex during cloning, you would get horrible
> cloning throughput."
>
> > -----Original Message-----
> > From: George Sinclair [mailto:George.Sinclair AT noaa DOT gov]
> > Sent: Friday, May 02, 2003 9:14 AM
> > To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
> > Subject: Re: [Networker] Multiplexing and demultiplexing on clones?
> >
> >
> > That makes perfect sense! I think if the savesets you were cloning
> > were not anywhere near in proximity on the same tape, or if each was
> > completely located on a separate tape and was not a continuation,
> > then I think the multiplexing would be minimal or not at all, but it
> > does make sense that if two or more of the ssids that you're cloning
> > are together on the source tape then the interleaved nature would be
> > preserved in some manner, and clearly, some multiplexing must be
> > occurring when the ssids are written to the clone, as you remarked.
> > One thing I noticed is that the devices window at one point jumped
> > from "cloning 2 of 15" to "cloning 4 of 15", with no mention of
> > "3 of 15".  I saw this also at like "12 of 15" with no "13 of 15",
> > but the end results all show 15 of 15. Everything is there. Hmm...
> >
> > Probably the reason why no conclusive information has been obtained
> > from Legato is because there are probably very few people there who
> > truly understand the code and how it works on a lower level. I think
> > you'd really need to speak to an engineer who knows the software on
> > that level, and they do exist, but you're unlikely to talk to such a
> > person because customer support normally routes you through an outer
> > shell that moves inward only as need be. Getting someone who really
> > knows this stuff is not easy because most of them may not work in
> > customer support, so the information has to be relayed through
> > various people. The horse's voice sounds different after it's been
> > translated by five farmers.
> >
> > George
> >
> > Steve Barber wrote:
> > >
> > > I'm not posting to the list because I don't know the answer either,
> > > but since I've done some coding and have some thoughts about how
> > > I would implement it if it were me, hopefully this is slightly
> > > better than just a guess...
> > >
> > > For the life of me I don't understand why the Legato people on the
> > > list won't speak up and clear the air about this issue once and
> > > for all; it's been an open question for years.
> > >
> > > On Thu, May 01, 2003 at 07:36:06PM -0400, George Sinclair wrote:
> > > > ...
> > > > I follow what you're saying if you're cloning everything from,
> > > > say, one source tape, but if you're specifying different savesets
> > > > on the same tape or different ones from different tapes then how
> > > > could NetWorker preserve the interleaved nature of the data? It
> > > > seems to me that the only way it could would be if it included
> > > > the other stuff that that saveset is wrapped together with --
> > > > namely, the other savestreams that were multiplexed with it when
> > > > it was originally backed up. Now, since the clone doesn't end up
> > > > with those other savesets, how could the data still be
> > > > interleaved or multiplexed on the tape the way it was on the
> > > > source? I mean, if you said:
> > > >
> > > > nsrclone ssid1 ssid2 ssid3 ssid4 ssid5
> > > >
> > > > and ssid1-3 were multiplexed together on the source tape then
> > > > yeah, I can see that ssid1-3 will now be multiplexed to the
> > > > clone, but if ssid4 and ssid5 are on separate places on the
> > > > source tape and were never saved to the source tape at the same
> > > > time (i.e., they were never multiplexed together when their
> > > > savestreams were written) then NetWorker could hardly interleave
> > > > these with whatever they were interleaved with on the source
> > > > tape, since whatever ssids they were originally interleaved with
> > > > might not have been specified on the nsrclone command. In my
> > > > example, I'm going to assume that ssid4 came from the same tape
> > > > but ssid5 did not. Maybe ssid4 was interleaved with ssid50-55
> > > > and ssid5 was interleaved with ssid14-20.
> > > >
> > > > Perhaps you are suggesting that the savesets that do get cloned
> > > > are themselves interleaved or multiplexed onto the clone volume,
> > > > and this has nothing to do with the nature of the way those
> > > > savesets were originally laid out on the source tape or what
> > > > they were interleaved with on the source tape?
> > > >
> > > > Perhaps I am confusing the term "interleaved" with the term
> > > > "multiplexed"?
> > >
> > > I think you're making it more complicated than it really is.
> > > Here's how I think of it:
> > >
> > > - data is written to tape one chunk at a time.  Each chunk is part
> > >   of one savestream, i.e. from a single client.
> > > - when multiplexing is occurring it takes a chunk from each active
> > >   savestream in turn and writes them to tape.
> > >
> > > When you're cloning, it reads the tape serially.  It has to read
> > > each chunk off of the source tape and check to see if it's a
> > > saveset it's supposed to be cloning.  If it is, it simply copies
> > > it to the destination tape.
> > >
> > > So I assume it would simply start with the lowest numbered SSID
> > > that it wants to clone and start copying those chunks to the
> > > destination tape.  However, if in the course of copying those
> > > chunks it runs into chunks of other SSIDs that it wants to copy,
> > > it probably just starts copying those also, thus some degree of
> > > multiplexing would be maintained.  Chunks that are not to be
> > > cloned would be left out, of course.
> > >
> > > This may be a simplistic view; for instance it can probably figure
> > > out ahead of time which ones are multiplexed, and it may only pick
> > > up new SSIDs if it's starting at the beginning, i.e. it doesn't
> > > start seeing a new SSID in the middle of the stream.  (Like if the
> > > 1st part of the stream had been written to a different tape.)
> > >
> > > It probably tracks what SSIDs it's currently cloning, and as it
> > > reads through them that list will grow and shrink as it finds new
> > > SSIDs and finishes them.  It probably just runs until that list
> > > size reaches zero.  Then it removes those SSIDs that it finished
> > > from its master list and starts again with the next lowest SSID.
> > >
> > > I think the key is realizing that it's a serial operation.
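
Steve's guess about a growing and shrinking list of active SSIDs can be written out as a toy model. Everything below is an assumption made to illustrate his description, not documented NetWorker behaviour: the integer ssids, the (ssid, sequence) chunk tuples, the `clone_with_active_list` helper, and the rule that a saveset is only picked up at its first chunk.

```python
from collections import Counter
from typing import Dict, List, Sequence, Set, Tuple

# Hypothetical tape model: (ssid, chunk sequence number) pairs, integer ssids.
Chunk = Tuple[int, int]


def clone_with_active_list(source_tape: Sequence[Chunk],
                           wanted: Set[int]) -> Tuple[List[Chunk], int]:
    """Clone in serial passes, tracking which wanted ssids are currently active."""
    total = Counter(ssid for ssid, _ in source_tape)   # chunks per saveset
    first_index: Dict[int, int] = {}
    for i, (ssid, _) in enumerate(source_tape):
        first_index.setdefault(ssid, i)                 # where each saveset begins
    master = {s for s in wanted if s in first_index}    # savesets still to clone
    clone_tape: List[Chunk] = []
    passes = 0
    while master:
        passes += 1
        start = first_index[min(master)]                # lowest remaining ssid
        active: Set[int] = set()
        finished: Set[int] = set()
        copied: Counter = Counter()
        for i in range(start, len(source_tape)):
            ssid, seq = source_tape[i]
            # Pick up a wanted saveset only at its first chunk.
            if ssid in master and ssid not in finished and i == first_index[ssid]:
                active.add(ssid)
            if ssid in active:
                clone_tape.append((ssid, seq))
                copied[ssid] += 1
                if copied[ssid] == total[ssid]:         # saveset fully copied
                    active.remove(ssid)
                    finished.add(ssid)
            if not active:                              # list size reached zero
                break
        master -= finished
    return clone_tape, passes


source = [(1, 0), (2, 0), (50, 0), (1, 1), (2, 1), (50, 1),
          (4, 0), (51, 0), (4, 1), (51, 1)]
cloned, passes = clone_with_active_list(source, {1, 2, 4})
print(cloned, passes)
```

In this example ssids 1 and 2 stay interleaved on the clone, ssid 4 is copied in a second pass, and ssids 50 and 51 are left out.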
> > >
> > > > I know you guys pursued this to a much deeper level than I did,
> > > > with things like record numbers and starting and ending file
> > > > number, etc. so I'm not disagreeing but rather I just need some
> > > > more explanation here. I must be looking at this all wrong.
> > >
> > > The evidence that was presented seemed pretty conclusive to me
> > > that multiplexing was maintained (at least partially) during
> > > cloning, and if it really did demultiplex during cloning, you
> > > would get horrible cloning throughput.  (And buffering the SSIDs
> > > you aren't actively copying would be prohibitive, since most of us
> > > don't have 100GB+ of swap or even necessarily free disk space to
> > > buffer that data onto!)
> > >
> > > > Do we agree, though, that NetWorker is cloning the savesets one
> > > > at a time regardless of how you do it? I see the devices window
> > > > always shows 1 of total, 2 of total, ... total of total.
> > >
> > > No, I don't think this is a safe assumption.  It could be that the
> > > number is just how many it has started or finished.  Also, I've
> > > seen the total number reset several times during the course of a
> > > large cloning operation so it just isn't clear to me what's being
> > > counted or how.
> > >
> > > Hope that helps!
> > >
> > > Steve
> >
> > --
> > Note: To sign off this list, send a "signoff networker" command via
> > email to listserv AT listmail.temple DOT edu or visit the list's Web
> > site at http://listmail.temple.edu/archives/networker.html where you
> > can also view and post messages to the list.
> > =*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
> >
