Well, it's been a few hours short of a week since I upgraded our
NetWorker server from 7.2.1 to 7.4. This is on a Solaris 9 box with a
mixture of Solaris, Linux, Windows (various versions), and Mac OS X
clients, about 300 clients in all. My data zone also handles MS SQL,
NDMP, and Microsoft Cluster Exchange backups.
My experience with 7.4 is bittersweet thus far. In a lot of ways, I
am really impressed with 7.4, but I do have a couple of serious
problems that give me considerable concern.
First, I discovered that NDMP backups work a little bit differently
in a three-way environment. The password for the system that hosts
the tape library robot now needs to be entered into the storage node
resource, which is new to me.
Second, and of great concern, there is definitely a bug in how the
media database is managed. This problem has been escalated to an
EMC NetWorker PSE and I have it at severity 1, although I initiated
the case as severity 2. At first, the problem appeared to be that
NetWorker doesn't handle automated tape cleaning properly.
Specifically, it seems that with all 14 of my Sony PetaSite's S-AIT
drives set to not use the CDI interface and with each device set for
a daily cleaning interval, NetWorker keeps attempting to clean
devices that are in use (i.e., reading or writing) and flooding me
with emails that those devices were successfully cleaned even though
that's impossible. Fortunately, NSR is not decrementing the number of
cleaning uses on the 11 cleaning tapes I keep in the library for
these phantom cleanings, and it is still cleaning drives that do need
to be cleaned.
Enabling the CDI feature on each device causes each drive that needs
cleaning to be cleaned twice, which subjects the drives to
unnecessary wear and tear. Unfortunately, when I enabled the
PetaSite's own auto-cleaning and turned off NSR's auto-cleaning, NSR
and the PetaSite didn't play well together in that configuration.
This is why I use NSR's auto-cleaning feature.
I noticed that problem last Thursday and I opened up a case with EMC
right away, but this weekend, I also discovered another problem which
I am sure is connected. The NetWorker Management Console and "nsrjb"
do not agree on which tapes are recyclable. At this time, the NWMC's
media window shows four recyclable tapes while "nsrjb -C | grep yes"
shows no recyclable tapes. When NetWorker attempts to label one of
those tapes, it gets into a loop and keeps attempting to label one
until I manually label another tape.
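For anyone who wants to check for the same disagreement, here is a
minimal sketch of how the two views could be compared once the volume
labels have been copied out of each one by hand. The function name and
the volume labels are hypothetical, and the script does not parse real
"nsrjb" or NWMC output; it only diffs two lists of labels.

```python
def compare_recyclable(gui_volumes, cli_volumes):
    """Compare two lists of volume labels reported as recyclable.

    gui_volumes: labels copied from the NWMC media window (assumption).
    cli_volumes: labels copied from the command-line listing (assumption).
    Returns a dict of sorted label lists showing where the views differ.
    """
    # Normalize whitespace and drop empty entries before comparing.
    gui = {v.strip() for v in gui_volumes if v.strip()}
    cli = {v.strip() for v in cli_volumes if v.strip()}
    return {
        "only_in_gui": sorted(gui - cli),
        "only_in_cli": sorted(cli - gui),
        "in_both": sorted(gui & cli),
    }


if __name__ == "__main__":
    # Hypothetical labels: four recyclable tapes in the GUI, none on the CLI.
    gui_list = ["VOL001", "VOL002", "VOL003", "VOL004"]
    cli_list = []
    for key, vols in compare_recyclable(gui_list, cli_list).items():
        print(f"{key}: {', '.join(vols) or '(none)'}")
```

Any label that shows up under "only_in_gui" or "only_in_cli" is a tape
the two views disagree about, which is the kind of detail worth
attaching to a support case.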
I think this media database discrepancy was brought about by a
problem where I tried to mark a tape as recyclable using the
NetWorker Management Console's GUI and the marking process never
finished. I had to reboot my workstation in order to free up NWMC.
Now, whenever I attempt to mark a tape as recyclable or remove a tape
from the media database, I get an error that says
"39078:nsrmm: RAP error: Mark volume operation already in progress"
I get this error both when I use "nsrmm" at the command line and when I try
it via the NWMC GUI. The only way to resolve this issue is to restart
NetWorker's daemons, which I suspect clears out its jobs database. I
also noticed that if I try to load a cleaning tape manually by
issuing an "nsrjb -l " command, the nsrmmgd process core dumps, then
restarts a minute or two later. Resetting the tape library and doing
an inventory of it doesn't help.
I spent several hours yesterday with an EMC engineer on the phone and
on WebEx, so I am hopeful that, between the information gleaned from
that session and all the support files I sent, a solution will be
forthcoming soon ... I hope!
Third, I just discovered a few minutes ago a tiny bug that's of
minuscule consequence. The bug is that in the monitoring window, in
the sessions section, the start time for at least one of my save
streams is reported as 5:15 AM while the actual start time, as
reported in the groups section, is 5:15 PM. I just noticed this issue,
so I have not reported it to EMC.
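I don't know what causes the AM/PM mix-up, but the symptom itself is
easy to describe precisely: two 12-hour labels that agree on the hour
and minute and differ only in the AM/PM marker. The sketch below is
just an illustration of that check. The function names are made up
for this example and nothing here reflects how NetWorker formats its
times internally.

```python
from datetime import datetime


def twelve_hour_label(ts):
    """Format a datetime as a 12-hour label such as '5:15 PM'."""
    hour = ts.hour % 12 or 12          # 0 and 12 both display as 12
    meridiem = "AM" if ts.hour < 12 else "PM"
    return f"{hour}:{ts.minute:02d} {meridiem}"


def is_meridiem_mismatch(label_a, label_b):
    """True if two labels share hour:minute but differ in AM/PM."""
    time_a, mer_a = label_a.split()
    time_b, mer_b = label_b.split()
    return time_a == time_b and mer_a != mer_b


if __name__ == "__main__":
    # Hypothetical example: a save stream that really started at 17:15.
    actual = twelve_hour_label(datetime(2000, 1, 1, 17, 15))
    shown = "5:15 AM"  # what the sessions section displayed
    print(actual, shown, is_meridiem_mismatch(actual, shown))
```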
I am also having that problem with truncated savegroup reports;
however, for me, it's a minor issue because I find that the NWMC GUI
offers enough information to let me see what went on with each
savegroup's backups. I do intend to apply the fix for that, but it's
not among my top priorities. I also had some reporting scripts that
no longer work; however, in my case, they are all obsoleted by the
NWMC's GUI and I knew they would fail prior to upgrading to 7.4.
Believe it or not, I like NetWorker 7.4 and I feel that EMC is being
responsive to my requests for assistance. I started to think about
ditching 7.4 and going with the latest 7.3.3, but I am going to give
EMC a chance to resolve the issues I cited in this message. I have
experienced problems with earlier versions that were much more
difficult to troubleshoot than this media management problem. I also
really like the new NWMC GUI. I have demonstrated the new GUI to
several colleagues and they are all impressed with it. After I
migrate to a new Sun T2000 with more disk capacity, I intend to
enable all the report tracking features. Right now, my Sun Fire V480
doesn't have enough disk capacity to support the reporting I want to
do, but that problem will be solved in another month or so when I
upgrade hardware.
--
Stan Horwitz
Temple University
Enterprise Systems Group
stan AT temple DOT edu