Subject: Re: [Networker] W2k Client seems to be running but does nothing
From: Conrad Macina <conrad.macina AT PFIZER DOT COM>
To: NETWORKER AT LISTMAIL.TEMPLE DOT EDU
Date: Mon, 22 Nov 2004 15:54:06 -0500
Active sessions that are not backing anything up appear to be a common
problem, present in 7.1.1 as well as 6.1.3. The "fix" is -- as you
have discovered -- to kill the nsrexec process. Sometimes that will make
the savegroup finish successfully. Sometimes, particularly if the client
has stopped pinging during the backup, another nsrexec will spawn, hang and
have to be killed until you've exceeded the retry count. And sometimes
another nsrexec will spawn and start backing up happily.
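
For finding the process to kill, here is a minimal sketch. It is an
illustration only and has not been run against a live NetWorker server. It
assumes the client name shows up on the nsrexec command line in "ps -ef"
output (the same assumption the grep in the script further down makes), and
the function name nsrexec_pids is made up for this example.

```shell
#!/bin/sh
# Sketch only: print the PIDs of nsrexec processes whose command line
# mentions a given client. Reads "ps -ef" style output on stdin.
# Assumption: the client name appears after "nsrexec" on the command line.
nsrexec_pids() {
  client=$1
  awk -v cli="$client" '
    {
      pos = index($0, "nsrexec")          # where "nsrexec" starts, 0 if absent
      # Fixed-string match of the client name in the rest of the line
      if (pos > 0 && index(substr($0, pos), cli) > 0) print $2
    }'
}

# Example: list the candidates first, and kill only after checking them.
# ps -ef | nsrexec_pids client1
```

The client name is matched as a fixed string, so a dot in a host name is not
treated as a wildcard, but client1 will still match client10. Look over the
list before killing anything.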

I have a script that compares the groups that are listed as active in their
respective savegroup resources with those that are actively backing up in
the nsr resource. You can run it once a day (or at other intervals) and
kill the PIDs it reports.

Good luck!

Usual disclaimers: use at your own risk, no representations, not guaranteed
to work in your environment, no support, not a great example of the
scriptwriter's art, and apologies for the sparse comments -- it was written
in the usual hurry.

Plus one UNusual disclaimer: It's perfectly normal for this script to
report a discrepancy shortly after a savegroup starts, and also when there
are outstanding tape mount requests. I'll update it to check for
outstanding mounts ... someday.
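
Until then, one crude way to cut down on those false alarms is to act only
on client/group pairs that show up in two consecutive runs. This is not a
mount check, just a persistence filter, and it is a sketch rather than
something the script does. It assumes each run's report (the "Client: ...,
Group: ..., PID(s): ..." lines the script prints) has been saved to a file;
the function name persistent_pairs and the file names are made up.

```shell
#!/bin/sh
# Sketch only: print the "Client: X, Group: Y" pairs present in both the
# previous and the current report. PIDs are ignored because a respawned
# nsrexec gets a new PID between runs.
persistent_pairs() {
  prev=$1
  cur=$2
  [ -s "$prev" ] || return 0              # no earlier report: nothing persists
  # Strip the PID field and sort, since comm needs sorted input
  sed 's/, PID(s):.*$//' "$prev" | sort -u > "$prev.key"
  sed 's/, PID(s):.*$//' "$cur"  | sort -u > "$cur.key"
  comm -12 "$prev.key" "$cur.key"         # lines common to both files
  rm -f "$prev.key" "$cur.key"
}

# Example (hypothetical script and file names):
# ./stuck_groups.ksh > /tmp/stuck.cur
# persistent_pairs /tmp/stuck.prev /tmp/stuck.cur
# mv /tmp/stuck.cur /tmp/stuck.prev
```

How far apart the two runs should be depends on how long a savegroup
normally takes to start its sessions and on how long tape mounts usually
stay outstanding at your site.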

Conrad Macina
Pfizer, Inc.

#!/bin/ksh

# List running savegroups & reconcile pending save sets against active
# sessions

GRPTMP=/tmp/`basename $0`.grp
SESTMP=/tmp/`basename $0`.ses1          ;# Temp files

for F in $GRPTMP $SESTMP
do
  if [ -f $F ] ;then
     rm $F                              ;# Make sure files don't exist to start
  fi
done

# Get NetWorker's Session List
echo "show session\nprint type:nsr" | nsradmin -i - | nawk '
{
  while (substr($0,length($0),1) == "\\") {
    getline PART2 ;$0 = substr($0,1,length($0)-1) PART2
    }
  print
  }' | sed 's!^ *session: *!!' | sed 's![,;] *$!!' | tr -d '"' |
  cut -d ":" -f 1 > $SESTMP

# echo "Session list:"
# cat $SESTMP

# Get the list of active groups and clients
for GRP in `echo "show name\nprint type:nsr group;status:running" |
  /usr/sbin/nsradmin -i - | grep -v '^$' | awk '{print $NF}' | tr -d ';'`
do
  echo $GRP >> $GRPTMP
  GRPP=`echo "$GRP" | sed 's!:!\\\:!g'`   ;# Escape colons in the group name
  echo "show work list\nprint type:nsr group;name:$GRPP" |
    /usr/sbin/nsradmin -i - | egrep -v '^$|work list: ;' >> $GRPTMP
done
# echo "\nGroup list:"
# cat $GRPTMP

cat $GRPTMP | nawk '
/^[a-zA-Z-_:\"]/ {GRP=$1}
/^ +work list: / {
   X=index($0,":")
   Y=index($0,",")
   CLI=substr($0, X+2, Y-X-2)
   CMD="grep -c "CLI" '$SESTMP'"
   CMD | getline FLAG
   close (CMD)
   if (FLAG == 0) {
      printf "%s~%s\n", CLI, GRP
      }
} ' | while read CLIGRP
do
  CLI=`echo $CLIGRP | cut -d "~" -f 1`
  GRP=`echo $CLIGRP | cut -d "~" -f 2`
  PID=`ps -ef | grep -v grep | grep "nsrexec.*$CLI" | awk '{printf "%s ", $2}'`
  echo "Client: $CLI, Group: $GRP, PID(s): $PID"
done
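
To act on the report, the PIDs can be pulled back out of the "PID(s):"
field. The following is a minimal sketch that assumes the output format of
the echo line above is unchanged; the function name report_pids and the
script name in the usage comment are made up for illustration.

```shell
#!/bin/sh
# Sketch only: read report lines of the form
#   Client: <client>, Group: <group>, PID(s): <pid> <pid> ...
# on stdin and print one PID per line. Lines with no PIDs print nothing.
report_pids() {
  awk '/^Client: / {
    n = split($0, part, /PID\(s\): */)    # text after the label holds the PIDs
    if (n < 2) next
    m = split(part[n], pid, " ")          # split on runs of blanks
    for (i = 1; i <= m; i++)
      if (pid[i] ~ /^[0-9]+$/) print pid[i]
  }'
}

# Example (hypothetical script name); echo first, kill once you trust it:
# ./stuck_groups.ksh | report_pids | while read P
# do
#   echo "would kill $P"
# done
```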





On Fri, 19 Nov 2004 10:44:45 -0500, Rafael Bolivar
<rafael.bolivar AT BULL DOT ES> wrote:

>The information I provided was erroneous. Instead of nsrmmd, there are
>nsrexec processes.
>
>Sorry...
>
>   Rafa.
>
>On Thu, 18 Nov 2004 08:22:38 -0500, Rafael Bolivar
><rafael.bolivar AT BULL DOT ES> wrote:
>
>>Hello,
>>
>>we are experiencing some trouble with a W2k SP4 client running Legato
>>NetWorker 7.1.1. It runs a scheduled backup every day at 02:00 AM, and 2
>>days ago it wasn't able to finish the backup. It started saving at 02:00
>>as usual, but when we arrived in the morning it hadn't finished the backup
>>(it usually finishes at about 05:00). It wasn't saving, and there were no
>>errors or timeouts. We also noticed that there were some nsrmmd processes
>>running on the server, so we killed them and the backup finished. Is there
>>any way to solve it?
>>
>>Some useful data:
>>
>>Server:
>>NetWorker 7.1.1 Power Edition
>>AIX 5.2
>>
>>Client:
>>
>>NetWorker client 7.1.1
>>Windows 2000 SP4
>>
>>Thanks in advance,
>>
>>      Rafa.
>>
>>--
>>Note: To sign off this list, send a "signoff networker" command via email
>>to listserv AT listmail.temple DOT edu or visit the list's Web site at
>>http://listmail.temple.edu/archives/networker.html where you can
>>also view and post messages to the list. Questions regarding this list
>>should be sent to stan AT temple DOT edu
>>=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
