[Veritas-bu] Jobs stuck in the queue?

2005-09-30 14:04:09
Subject: [Veritas-bu] Jobs stuck in the queue?
From: Mark.Donaldson AT cexp DOT com (Mark.Donaldson AT cexp DOT com)
Date: Fri, 30 Sep 2005 12:04:09 -0600
I had a request to post my fix to auto-cancel these System_State jobs, so
here it is below.  Note that it works off the elapsed time of the job, which
starts counting when the job queues, so time spent waiting in the queue
counts toward the limit.  I'd rather work off the "attempt elapsed" time,
but bpdbjobs doesn't seem to kick out that number in any useful way.

As written, it cancels any active /System_State/ job with an elapsed time
over 24 hours.  Change the LOG file path to fit your environment.  I stuck
mine in cron on a six-hour cycle.
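
For reference, a six-hour cycle in crontab could look like the entry below.
This is only a sketch: the script path is a placeholder, not the location
used in the original setup.  The hours are listed explicitly because the
stock Solaris cron does not accept step syntax such as */6.

```text
# min hour        dom mon dow  command (path is hypothetical)
0     0,6,12,18   *   *   *    /usr/local/scripts/kill_system_state.ksh
```

No output redirection is needed in the entry, since the script sends its
own stdout and stderr to $LOG.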

==== Script start ====

#!/bin/ksh

# Kills stuck NT System_State backups.
# Mark Donaldson - 09/30/2005

# Max elapsed time for System_State backup (seconds)
maxtime=86400

PATH=$PATH:/usr/openv/netbackup/bin/admincmd
PROG=`basename $0`
LOG=/usr/openv/netbackup/logs/scripts/$PROG.log
TMP=/tmp/$PROG.tmp

# Logfile Management
exec >>$LOG 2>&1
if [ `wc -l $LOG | awk '{print $1}'` -gt 2000 ]
then
  cp $LOG $TMP
  echo "Logfile Truncated: `date`" >$LOG
  tail -1000 $TMP >>$LOG
  rm -f $TMP
fi

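## Main
# Fields tested in the comma-separated "bpdbjobs -most_columns" output:
#   $2==0   job type is backup        $3==1   job state is active
#   $10     elapsed time in seconds   $17     path last written
#   $22==13 policy type is MS-Windows-NT; $1 is the job ID that gets printed.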
for jobid in ` bpdbjobs -most_columns | \
    awk -F, '$2==0 && $3==1 && $10>'$maxtime' \
    && $17~/^\/System_State\// && $22==13 {print $1}'`
do
  echo "Cancelling $jobid :: `date`"
  bpdbjobs -cancel $jobid
done
exit

==== Script end ====

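Before letting the script cancel anything, the selection filter can be
checked on its own.  The sketch below runs the same awk test over a few
hand-made records.  The records are invented for illustration and fill in
only the fields the filter reads; they are not real bpdbjobs output, and
the helper names mkrec and select_stuck are made up for this example.

```shell
#!/bin/sh
# Dry run of the job-selection filter. The records are invented test data,
# NOT real bpdbjobs output; only the fields the filter reads are filled in.
maxtime=86400

# mkrec JOBID TYPE STATE ELAPSED PATH POLICYTYPE
# Emits a 22-field comma-separated record with the values placed in
# fields 1, 2, 3, 10, 17 and 22.
mkrec() {
  echo "$1,$2,$3,,,,,,,$4,,,,,,,$5,,,,,$6"
}

# Same test as the script's awk filter, reading records from stdin.
select_stuck() {
  awk -F, '$2==0 && $3==1 && $10>'"$maxtime"' && $17~/^\/System_State\// && $22==13 {print $1}'
}

stuck=$(
  {
    mkrec 101 0 1 90000 /System_State/ 13   # should be selected
    mkrec 102 0 1 3600  /System_State/ 13   # elapsed time under the limit
    mkrec 103 0 1 90000 /C/            13   # not a System_State path
    mkrec 104 0 3 90000 /System_State/ 13   # state is not 1 (active)
    mkrec 105 0 1 90000 /System_State/ 0    # policy type is not 13
  } | select_stuck
)

echo "Would cancel: $stuck"
```

Only job 101 passes every condition, so the last line prints
"Would cancel: 101".  Against a live master server, the equivalent check
would be to run the loop with the "bpdbjobs -cancel" line commented out
and read the log.
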
-----Original Message-----
From: veritas-bu-admin AT mailman.eng.auburn DOT edu
[mailto:veritas-bu-admin AT mailman.eng.auburn DOT edu]On Behalf Of
Mark.Donaldson AT cexp DOT com
Sent: Thursday, September 29, 2005 9:08 AM
To: jpiszcz AT servervault DOT com; veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] Jobs stuck in the queue?


I'm getting the \system state\ ones stuck frequently, but a "cancel" from the
GUI clears them.  If I check the details screen in the GUI, it shows the tape
moving from fragment to fragment but no data moving.

I'm getting ready to write something that looks for these and kills them if
they've been running more than 12 hours or so.

I'm waiting for another one so I can grab the bpdbjobs entry.

-M
-----Original Message-----
From: veritas-bu-admin AT mailman.eng.auburn DOT edu
[mailto:veritas-bu-admin AT mailman.eng.auburn DOT edu] On Behalf Of Piszcz, Justin
Sent: Thursday, September 29, 2005 8:41 AM
To: veritas-bu AT mailman.eng.auburn DOT edu
Subject: [Veritas-bu] Jobs stuck in the queue?


When using ALL_LOCAL_DRIVES (w/ NEW_STREAM) under it, sometimes I get some
jobs (mainly System_State:\ or Shadow_Copy_Components:\) that get stuck in
the queue.
 
There is no way to delete them except to stop and restart the NB server
processes.
 
Has anyone experienced anything like that before?
 
Using 5.1mp3a on Solaris 8.