Networker

Re: [Networker] Some preliminary comments on 7.4

2007-09-20 11:12:05
Subject: Re: [Networker] Some preliminary comments on 7.4
From: michael mcgearty <michaelmcgearty AT GMAIL DOT COM>
To: NETWORKER AT LISTSERV.TEMPLE DOT EDU
Date: Thu, 20 Sep 2007 16:05:18 +0100
Thanks for the info on this,

Just a few notes on our experience upgrading from 7.1.4 to 7.4.
We have a StorageTek L5500 library with 28 LTO-2 tape drives and use
DDS for storage nodes, all on Solaris 9.

After using jbconfig to recreate the jukebox for the new build, we could
not get any tapes to load into any of the drives.
We logged a P1 with EMC, but they could not resolve our problem after 12
hours, so we had to roll back.

Still on 7.1.4 :)

We logged a call with Sun on the off chance it was a Solaris issue, and
they said they had seen this before and that it was an issue with DDS.
There is a bug in jbconfig that does not allow you to create DDS drives
(for our library at least). We had to create 28 physical drives and then
use jbedit to add the DDS drives.

This fixed the problem in test, and we are going to try our live system
next week, fingers crossed.

On 19/09/2007, Stan Horwitz <stan AT temple DOT edu> wrote:
> Well, it's been a few hours short of a week since I upgraded our
> NetWorker server from 7.2.1 to 7.4. This is on a Solaris 9 box with a
> mixture of Solaris, Linux, Windows whatever, and Mac OS X clients,
> about 300 clients in all. My data zone also handles MS SQL, NDMP, and
> Microsoft Cluster Exchange backups.
>
> My experience with 7.4 is bittersweet thus far. In a lot of ways, I
> am really impressed with 7.4 but I do have a couple of serious
> problems that give me reason for considerable concern.
>
> First, I discovered that NDMP backups work a little bit differently
> in a three-way environment. The password for the system that hosts
> the tape library robot now needs to be entered into the storage
> node resource, which is new to me.
>
> Second, and of great concern is that there is definitely a bug in how
> the media database is managed. This problem has been escalated to an
> EMC NetWorker PSE and I have it at severity 1, although I initiated
> the case as severity 2. At first, the problem appeared to be that
> NetWorker doesn't handle automated tape cleaning properly.
> Specifically, it seems that with all 14 of my Sony PetaSite's S-AIT
> drives set to not use the CDI interface and with each device set for
> a daily cleaning interval, NetWorker keeps attempting to clean
> devices that are in use (i.e., reading or writing) and flooding me
> with emails that those devices were successfully cleaned even though
> that's impossible. Fortunately, NSR is not decrementing the number of
> cleaning uses on the 11 cleaning tapes I keep in the library and it
> is also cleaning drives that do need to be cleaned.
>
> Enabling the CDI feature on each device causes each drive that needs
> to be cleaned, to be cleaned twice, which will subject the drives to
> unnecessary wear and tear. Unfortunately, enabling our PetaSite to do
> auto-cleaning and turning off NSR's auto-cleaning shows that NSR and
> the PetaSite don't play well with that configuration. This is why I
> use NSR's auto-cleaning feature.
>
> I noticed that problem last Thursday and I opened up a case with EMC
> right away, but this weekend, I also discovered another problem which
> I am sure is connected. The NetWorker Management Console and "nsrjb"
> do not agree on which tapes are recyclable. At this time, the NWMC's
> media window shows four recyclable tapes while "nsrjb -C | grep yes"
> shows no recyclable tapes. When NetWorker attempts to label one of
> those tapes, it gets into a loop and keeps attempting to label one
> until I manually label another tape.
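
One way to pin down a disagreement like this is to diff the two lists of
volume names directly. The following is only a minimal sketch: it assumes
the recyclable volume labels from each view have already been copied into
plain Python lists by hand. The function name and the sample labels are
hypothetical, and parsing the real "nsrjb -C" output is deliberately not
shown.

```python
def recyclable_discrepancy(nwmc_volumes, nsrjb_volumes):
    """Return the volume labels that only one of the two views reports.

    Both arguments are iterables of volume labels (strings) that each
    tool lists as recyclable. How those labels are collected is outside
    the scope of this sketch.
    """
    nwmc = set(nwmc_volumes)
    nsrjb = set(nsrjb_volumes)
    return {
        "only_in_nwmc": sorted(nwmc - nsrjb),
        "only_in_nsrjb": sorted(nsrjb - nwmc),
    }


if __name__ == "__main__":
    # Hypothetical labels: four tapes recyclable in the GUI, none in nsrjb.
    gui_view = ["A00001", "A00002", "A00003", "A00004"]
    cli_view = []
    for side, labels in recyclable_discrepancy(gui_view, cli_view).items():
        print(side, labels)
```

The volumes that show up on only one side are the ones worth quoting in
the support case.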
>
> I think this media database discrepancy was brought about by a
> problem where I tried to mark a tape as recyclable using the
> NetWorker Management Console's GUI and the marking process never
> finished. I had to reboot my workstation in order to free up NWMC.
> Now, whenever I attempt to mark a tape as recyclable or remove a tape
> from the media database, I get an error that says
>
> "39078:nsrmm: RAP error: Mark volume operation already in progress"
>
> I get this error if I use "nsrmm" at the command line and when I try
> it via the NWMC GUI. The only way to resolve this issue is to restart
> NetWorker's daemons, which I suspect clears out its jobs database. I
> also noticed that if I try to load a cleaning tape manually by
> issuing an "nsrjb -l " command, the nsrmmgd process core dumps, then
> restarts a minute or two later. Resetting the tape library and doing
> an inventory of it doesn't help.
>
> I spent several hours with an EMC engineer on the phone and on WebEx
> yesterday, so I am hopeful that, between the information that was
> gleaned from that session and all the support files I sent, a
> solution will be forthcoming soon ... I hope!
>
> Third, I just discovered a few minutes ago a tiny bug that's of
> minuscule consequence. The bug is that in the monitoring window in
> the sessions section, the start time for at least one of my save
> streams is reported as 5:15 AM while the actual start time, as
> reported in the groups section is 5:15 PM. I just noticed this issue,
> so I have not reported it to EMC.
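
A quick check of whether two displayed times differ only by the AM/PM
marker can help when writing up a report like this. This is a minimal
sketch, not a diagnosis of the actual bug: the function name is
hypothetical, and it assumes both times are typed in as "H:MM AM/PM"
strings in an English locale.

```python
from datetime import datetime


def differs_only_by_meridiem(time_a, time_b):
    """True if the two 12-hour clock strings are exactly 12 hours apart."""
    fmt = "%I:%M %p"  # 12-hour clock with an AM/PM marker
    a = datetime.strptime(time_a, fmt)
    b = datetime.strptime(time_b, fmt)
    return abs((a - b).total_seconds()) == 12 * 3600


if __name__ == "__main__":
    # Sessions section shows 5:15 AM; groups section shows 5:15 PM.
    print(differs_only_by_meridiem("5:15 AM", "5:15 PM"))
```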
>
> I also am having that problem with truncated savegroup reports;
> however, for me, it's a minor issue because I find the NWMC GUI to
> offer enough information to allow me to see what went on with each
> savegroup's backups. I do intend to apply the fix for that, but it's
> not among my top priorities. I also had some reporting scripts that
> no longer work; however, in my case, they are all obsoleted by the
> NWMC's GUI and I knew they would fail prior to upgrading to 7.4.
>
> Believe it or not, I like NetWorker 7.4 and I feel that EMC is being
> responsive to my requests for assistance. I started to think about
> ditching 7.4 and going with the latest 7.3.3, but I am going to give
> EMC a chance to resolve the issues I cited in this message. I have
> experienced problems with earlier versions that were much more
> difficult to troubleshoot than this media management problem. I also
> really like the new NWMC GUI. I have demonstrated the new GUI to
> several colleagues and they are all impressed with it. After I
> migrate to a new Sun T2000 with more disk capacity, I intend to
> enable all the report tracking features. Right now, my Sun Fire V480
> doesn't have enough disk capacity to support the reporting I want to
> do, but that problem will be solved in another month or so when I
> upgrade hardware.
>
> --
> Stan Horwitz
> Temple University
> Enterprise Systems Group
> stan AT temple DOT edu
>
> CONFIDENTIALITY STATEMENT: The information contained in this e-mail,
> including attachments, is the confidential information of, and/or is
> the property of, Temple University. The information is intended for
> use solely by the individual or entity named in the e-mail. If you
> are not an intended recipient or you received this in error, then any
> review, printing, copying, or distribution of any such information is
> prohibited. Please notify the sender immediately by reply e-mail and
> then delete this e-mail from your system.
>
> To sign off this list, send email to listserv AT listserv.temple DOT edu and 
> type "signoff networker" in the body of the email. Please write to 
> networker-request AT listserv.temple DOT edu if you have any problems with 
> this list. You can access the archives at 
> http://listserv.temple.edu/archives/networker.html or
> via RSS at http://listserv.temple.edu/cgi-bin/wa?RSS&L=NETWORKER
>


-- 
Regards
Michael McGearty

