Veritas-bu

[Veritas-bu] Jobs stuck in the queue?

2005-09-30 14:22:20
Subject: [Veritas-bu] Jobs stuck in the queue?
From: Greg.Hindle AT constellation DOT com (Hindle, Greg)
Date: Fri, 30 Sep 2005 14:22:20 -0400
Ohh ok so this will cancel ANY job that is running over 24 hours? Not
just those that are not pushing data? 


 
Greg
 

-----Original Message-----
From: Mark.Donaldson AT cexp DOT com [mailto:Mark.Donaldson AT cexp DOT com] 
Sent: Friday, September 30, 2005 2:04 PM
To: veritas-bu AT mailman.eng.auburn DOT edu
Cc: Hindle, Greg
Subject: RE: [Veritas-bu] Jobs stuck in the queue?

I had a request to post my fix to auto-cancel these System_State jobs so
here it is below.  Note, this works off elapsed time of the job, which
starts counting when the job queues.  I'd rather work off the "attempt
elapsed" time but bpdbjobs doesn't seem to kick out that number in any
useful way.

As written, it cancels any active /System_State/ job with an elapsed
time over 24 hours.  Change the LOG file path to fit your environment.
I stuck mine in cron on a six-hour cycle.
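
For the six-hour cron cycle mentioned above, a crontab entry might look
like the sketch below.  The script path is a hypothetical placeholder,
not the location actually used.  The hours are listed explicitly because
older Solaris cron does not accept the */6 step shorthand.

```
# min hour        dom mon dow  command
# (/usr/local/bin/kill_sysstate.ksh is an assumed install path)
0     0,6,12,18   *   *   *    /usr/local/bin/kill_sysstate.ksh
```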

==== Script start ====

#!/bin/ksh

# Kills stuck NT System_State backups.
# Mark Donaldson - 09/30/2005

# Max elapsed time for System_State backup (seconds)
maxtime=86400

PATH=$PATH:/usr/openv/netbackup/bin/admincmd
PROG=`basename $0`
LOG=/usr/openv/netbackup/logs/scripts/$PROG.log
TMP=/tmp/$PROG.tmp

# Logfile Management
exec >>$LOG 2>&1
if [ `wc -l $LOG | awk '{print $1}'` -gt 2000 ]
then
  cp $LOG $TMP
  echo "Logfile Truncated: `date`" >$LOG
  tail -1000 $TMP >>$LOG
  rm -f $TMP
fi

## Main
for jobid in ` bpdbjobs -most_columns | \
    awk -F, '$2==0 && $3==1 && $10>'$maxtime' \
    && $17~/^\/System_State\// && $22==13 {print $1}'`
do
  echo "Cancelling $jobid :: `date`"
  bpdbjobs -cancel $jobid
done
exit
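
Before letting the loop cancel anything, it may help to dry-run the
selection filter on its own.  The sketch below applies the same awk
conditions to a few synthetic, made-up rows instead of live bpdbjobs
output.  The sample_jobs and select_stuck_jobs names, the "x" filler
fields and the state value 3 are assumptions for illustration only; the
field positions (2, 3, 10, 17, 22) are taken from the script as posted.

```shell
#!/bin/sh
# Dry run of the selection filter: reads comma-separated job rows on
# stdin and prints the job IDs the cancel loop would pick up.

maxtime=86400   # seconds, same threshold as the script above

# Same conditions as the awk filter in the script, with the threshold
# passed in via -v instead of being spliced into the program text.
select_stuck_jobs() {
  awk -F, -v max="$maxtime" '
    $2 == 0 && $3 == 1 && $10 + 0 > max + 0 &&
    $17 ~ /^\/System_State\// && $22 == 13 { print $1 }'
}

# Synthetic 22-field rows (hypothetical data, not real bpdbjobs output).
# 101: System_State, over the limit      -> selected
# 102: ordinary path, over the limit     -> skipped
# 103: System_State, under the limit     -> skipped
# 104: System_State, field 3 is not 1    -> skipped
sample_jobs() {
  cat <<'EOF'
101,0,1,x,polA,sched,clientA,srv,0,90000,x,x,x,x,x,x,/System_State/,x,x,x,x,13
102,0,1,x,polA,sched,clientA,srv,0,90000,x,x,x,x,x,x,/C/,x,x,x,x,13
103,0,1,x,polA,sched,clientB,srv,0,3600,x,x,x,x,x,x,/System_State/,x,x,x,x,13
104,0,3,x,polA,sched,clientB,srv,0,90000,x,x,x,x,x,x,/System_State/,x,x,x,x,13
EOF
}

sample_jobs | select_stuck_jobs
```

Only job 101 is printed, so a long-running job on any other path is
left alone.  On a live master server, replacing sample_jobs with
"bpdbjobs -most_columns" lists the candidates without cancelling them.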





-----Original Message-----
From: veritas-bu-admin AT mailman.eng.auburn DOT edu
[mailto:veritas-bu-admin AT mailman.eng.auburn DOT edu]On Behalf Of
Mark.Donaldson AT cexp DOT com
Sent: Thursday, September 29, 2005 9:08 AM
To: jpiszcz AT servervault DOT com; veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] Jobs stuck in the queue?


I'm getting the \system state\ ones stuck frequently but a "cancel" from
the GUI clears them.  If I check the details screen on the GUI, it shows
the tape moving from fragment to fragment but no data moving.

I'm getting ready to write something that looks for these & kills them if
they've been running more than 12 hours or so.

I'm waiting for another one so I can grab the bpdbjobs entry.

-M
-----Original Message-----
From: veritas-bu-admin AT mailman.eng.auburn DOT edu
[mailto:veritas-bu-admin AT mailman.eng.auburn DOT edu]On Behalf Of Piszcz,
Justin
Sent: Thursday, September 29, 2005 8:41 AM
To: veritas-bu AT mailman.eng.auburn DOT edu
Subject: [Veritas-bu] Jobs stuck in the queue?


When using ALL_LOCAL_DRIVES (w/ NEW_STREAM) under it, sometimes I get
some jobs (mainly System_State:\ or Shadow_Copy_Components:\) that get
stuck in the queue.
 
There is no way to delete them except stop and start the NB server
processes.
 
Anyone ever experience anything like that before?
 
Using 5.1mp3a on Solaris 8.


