Re: [Networker] aborted savesets when cloning?

2003-05-09 11:03:21
Subject: Re: [Networker] aborted savesets when cloning?
From: Robert Maiello <robert.maiello AT MEDEC DOT COM>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Fri, 9 May 2003 11:03:24 -0400
I do quite a bit of cloning with 6.1.2 on Solaris.  I don't seem to have
this issue, at least on my clone tapes.  It seems like such savesets should
be hard to find later, as nsrck will remove or clean them up?

I did have some backup tapes that weren't recycling.  When listing them, it
was seen that there were savesets on them that were listed as aborted or
in-progress.  I thought nsrck was supposed to clean these up, but I guess
some manual running of nsrck with options and nsrim is needed every so
often.  I simply recycled these tapes.  I'm not showing any now.

The indication that this is a 6.1.3 issue is disturbing.  That version is
always recommended to me on my support calls...it should be the one with
the most fixed issues ...sigh.

Robert Maiello
Thomson Medical Economics

On Fri, 9 May 2003 10:23:35 +0200, Joaquin Camp <joaquin.camp AT PROACT DOT SE> 
wrote:

>  Hi George,
>
>  Yes, this is a known issue by now.  We have some customers that have
>this issue.  We have reproduced the issue in our lab environment.
>Legato is informed and working on a fix for this.
>This is what we have sorted out until now:
>
>Savesets marked with ssflags "ca" (complete aborted)
>The best way to prevent savesets being marked "ca", is to run Networker
>6.1.2. This includes the clients as well.
>
>
>When do they occur?...
>They occur when staging or cloning savesets from disk to tape. All
>versions of Networker pre 6.x are affected, as are all BSMs that make
>2 GB chunks.
>
>We have seen this problem even with Networker 6.x, and we definitely see
>it with Networker 6.1.3. Therefore, I recommend installing 6.1.2 for
>best results.
>
>NetWare clients are affected as well. Even though the Networker software
>is "new", there have not been major changes in the Networker software for
>NetWare. Therefore it behaves like a pre 6.x release, which the release
>version also indicates.
>
>
>What could happen?...
>The tests run with savesets marked as "ca" show that we have been
>able to recover both individual files and whole savesets. So this may
>seem to be more of a cosmetic issue, but big problems can still occur.
>Examples: the retention policies are no longer valid on these savesets,
>so the volumes can be overwritten before they should be. You will
>no longer be able to clone/stage those savesets once they are marked
>with "ca".
>
>  My recommendation is to open a support call with Legato.  That way they
>have reason to speed up the process of making a working fix.
>
>  Thanks and have a nice day!
>/Joaquin Camp.
>
>
>On Thu, 2003-05-08 at 17:29, George Sinclair wrote:
>> When I say reported as "aborted" I mean running something like:
>>
>> 'mminfo -av -s server clone_volume_name'
>>
>> shows them as having an "aborted" status. Also, they show up in the
>> volumes window with a status of "aborted". You may be right, though,
>> because it did in fact run out and was requesting another tape, but
>> here's where I have a problem with this theory as the culprit. The drive
>> has a max sessions value of 4. There were 10 pathnames listed in my
>> input file. The first 5 are rather small, about 4 GB, and the next 5 are
>> much larger at around 30 GB. The first 5 completed with no problems. I
>> know that because these all were listed with a status of "recoverable"
>> by the time the cloning process began the first of the big guns. Now,
>> this was a brand new LTO tape with 100 GB of native capacity. 5 savesets
>> at 4 GB each only adds up to 20 GB. That still leaves at least 80 GB
>> remaining. I have a hard time believing that the next saveset at 30 GB
>> could not fit on that tape. It should have, in which case the space
>> problem should not have been an issue with the first of the large
>> savesets, so I wouldn't have expected this to be a cause of the problem.
>> On the other hand, I do only see 4 of the larger savesets listed for the
>> volume, along with the 5 small ones for a total of 9. Obviously, it
>> never started on the 5th large one. So, it would appear that all 4 of
>> the larger ones were multi-plexed to the tape, in which case I guess it
>> could not finish any "one" before it ran out of space. Maybe I'll try
>> again, and only list two of the large savesets and see what happens. I'm
>> sure you're right about the "out of space" theory causing the aborted
>> problem, but since I'd seen this before when cloning several small
>> savesets, where space on the tape was never an issue, I became
>> concerned. I find it odd that NetWorker doesn't report these as
>> "in-progress" or something more meaningful.
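
The capacity arithmetic above can be checked with a short shell sketch. The
figures (100 GB native capacity, five 4 GB savesets, 30 GB large savesets,
max sessions of 4) come from the message. The assumption that multiplexed
streams advance at roughly the same rate, and so split the remaining space
evenly, is not confirmed anywhere in the thread. Compression is ignored.

```shell
#!/bin/sh
# Illustrative arithmetic only; all figures are taken from the message above.
tape_gb=100        # native LTO capacity
small_count=5      # small savesets that completed first
small_gb=4         # approximate size of each small saveset
large_gb=30        # approximate size of each large saveset
sessions=4         # drive's max sessions value

# Space left after the five small savesets finished.
remaining_gb=$(( tape_gb - small_count * small_gb ))

# How many large savesets would complete if written one after another.
fit_sequential=$(( remaining_gb / large_gb ))

# ASSUMPTION: multiplexed streams progress at about the same rate, so each
# stream gets an equal share of the remaining space before the tape fills.
share_gb=$(( remaining_gb / sessions ))

echo "remaining: ${remaining_gb} GB"
echo "sequential: ${fit_sequential} large savesets fit"
echo "multiplexed: each of ${sessions} streams gets about ${share_gb} GB of ${large_gb} GB"
```

Under that assumption each of the four streams gets about 20 GB of its 30 GB
before the tape fills, so none of them completes, whereas written one after
another two of them would fit. That matches the plan to retry with only two
of the large savesets.
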
>>
>> Thanks for your response! I will re-test.
>>
>> George
>>
>> Carl Farnsworth wrote:
>> >
>> > What do you mean by "reported as aborted"?  You also mentioned the tape is
>> > filled up.  Is it possible the clone session is still active and waiting
>> > for a second tape?
>> >
>> > I also got confused by this my first time using scripted cloning.  The
>> > volumes display GUI for the clone volume will show an "a" flag for the
>> > savesets that are currently being cloned.  At first I thought this
>> > meant "a" for active, but I later realized that NetWorker is setting the
>> > flag as "a" for aborted, until it's successfully completed!
>> >
>> > (P.S.  This is how I first realized that the cloning operation was not de-
>> > multiplexing the savesets).
>> >
>> > HTH
>> > Carl Farnsworth
>> > DigiDyne Inc.
>> >
>> > On Tue, 6 May 2003 16:58:11 -0400, George Sinclair
>> > <George.Sinclair AT NOAA DOT GOV> wrote:
>> >
>> > >Hi,
>> > >
>> > >I've noticed that whenever I create a file containing a list of ssids,
>> > >and I run the clone command as:
>> > >
>> > >nsrclone -s server -b 'Clone Pool Name' -S -f input_file
>> > >
>> > >or if I pass them in as:
>> > >
>> > >nsrclone -s server -b 'Clone Pool Name' -S ssid1 ssid2 ....
>> > >
>> > >then the operation works, but there's always one, and sometimes several,
>> > >that are reported as "aborted". Deleting them and re-running them
>> > >wouldn't be so bad except that in this case, they're like 30 GB
>> > >savesets! I can't reclaim my space on the tape. The last time I tried to
>> > >clone 9 savesets, 5 of them were aborted when I came back. I'm going to
>> > >have to just end up re-labeling the tape because it's filled up, and the
>> > >only ones that were not aborted are only like 7 GB each, for a total of
>> > >maybe 35 GB. Not really worth sacrificing the whole tape just for those
>> > >few. Anyone have any ideas what causes this abort business?
>> > >
>> > >I notice that if I run them one at a time, I don't see this problem. I'm
>> > >not seeing problems during backups, and I'm not seeing errors on the
>> > >devices.
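
Since running the savesets one at a time avoids the problem, a scripted
version of that workaround might look like the sketch below. It uses only
the nsrclone options already shown in this message (-s, -b, -S). The
function name clone_one_at_a_time, the DRY_RUN switch, and the
one-ssid-per-line file layout are assumptions made for illustration. By
default it only prints the commands it would run.

```shell
#!/bin/sh
# clone_one_at_a_time SERVER POOL FILE
# Hypothetical helper: reads one ssid per line from FILE (assumed layout) and
# handles each ssid in its own nsrclone invocation, so only a single saveset
# is being cloned at any time.
# DRY_RUN=1 (the default) prints the commands; DRY_RUN=0 executes them.
clone_one_at_a_time() {
    # The "|| [ -n ... ]" part keeps a last line that lacks a trailing newline.
    while IFS= read -r ssid || [ -n "$ssid" ]; do
        [ -n "$ssid" ] || continue              # skip blank lines
        if [ "${DRY_RUN:-1}" = "1" ]; then
            printf "nsrclone -s %s -b '%s' -S %s\n" "$1" "$2" "$ssid"
        else
            # stdin is redirected so the command cannot consume the ssid list;
            # stop at the first failure instead of continuing blindly.
            nsrclone -s "$1" -b "$2" -S "$ssid" < /dev/null || return 1
        fi
    done < "$3"
}

# Example (dry run):
#   clone_one_at_a_time server 'Clone Pool Name' input_file
```

This trades throughput for predictability: each saveset is written on its
own rather than multiplexed with others, so a full tape can interrupt at
most the one saveset in flight.
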
>> > >
>> > >Thanks.
>> > >
>> > >George
>> >
>> > --
>> > Note: To sign off this list, send a "signoff networker" command via email
>> > to listserv AT listmail.temple DOT edu or visit the list's Web site at
>> > http://listmail.temple.edu/archives/networker.html where you can
>> > also view and post messages to the list.
>> > =*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
>>
>
