Veritas-bu

[Veritas-bu] Jobs stuck in the queue?

2005-09-30 15:42:53
Subject: [Veritas-bu] Jobs stuck in the queue?
From: Mark.Donaldson AT cexp DOT com (Mark.Donaldson AT cexp DOT com)
Date: Fri, 30 Sep 2005 13:42:53 -0600
No - it doesn't kill just ANY job.  It only kills active NT backup jobs
whose current backup path begins with "/System_State/".  What I don't
check is KB transferred.  KB transferred is field 15 (kbytes), and I don't
have good data on what the "stuck" System State jobs look like there.
Are they at zero?  I haven't had an actual stuck job since the beginning
of the week.

It'd be easy to add a check for $15==0 to the awk script if necessary.  My
assumption is that since /System_State/ is active and is, hopefully, a
small amount of data, a 24-hour elapsed time is a sufficient test.

-M
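
A minimal sketch of what the $15==0 check could look like, using the same
field positions as the posted script.  The function name and the sample
records are made up for illustration (they are not real bpdbjobs output),
and whether a stuck job really reports 0 in field 15 is still an open
question in this thread.

```shell
#!/bin/sh
# Sketch only: the posted awk filter plus a kbytes ($15) == 0 test.
maxtime=86400   # max elapsed time in seconds (24 hours)

# Hypothetical helper: reads "bpdbjobs -most_columns" style CSV on stdin
# and prints the job ID (field 1) of every record that matches.
stuck_system_state_jobs() {
  awk -F, -v max="$maxtime" '
    $2==0 && $3==1 && $10>max && $15==0 &&
    $17 ~ /^\/System_State\// && $22==13 { print $1 }'
}

# Made-up sample records with 22 fields; only fields 1, 2, 3, 10, 15, 17
# and 22 carry meaningful values here.
sample='101,0,1,f4,f5,f6,f7,f8,f9,90000,f11,f12,f13,f14,0,f16,/System_State/,f18,f19,f20,f21,13
102,0,1,f4,f5,f6,f7,f8,f9,90000,f11,f12,f13,f14,5120,f16,/System_State/,f18,f19,f20,f21,13
103,0,1,f4,f5,f6,f7,f8,f9,3600,f11,f12,f13,f14,0,f16,/System_State/,f18,f19,f20,f21,13'

# Job 102 has moved data and job 103 is under 24 hours, so neither matches.
result=`printf '%s\n' "$sample" | stuck_system_state_jobs`
echo "$result"
```

In the real script the function's input would come from
"bpdbjobs -most_columns" rather than the sample text.  If a stuck job
leaves field 15 empty instead of 0, the $15==0 test would not match, so
it is worth checking an actual stuck entry first.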

-----Original Message-----
From: Hindle, Greg [mailto:Greg.Hindle AT constellation DOT com]
Sent: Friday, September 30, 2005 12:22 PM
To: Mark.Donaldson AT cexp DOT com; veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] Jobs stuck in the queue?


Ohh ok so this will cancel ANY job that is running over 24 hours? Not
just those that are not pushing data? 


 
Greg
 

-----Original Message-----
From: Mark.Donaldson AT cexp DOT com [mailto:Mark.Donaldson AT cexp DOT com] 
Sent: Friday, September 30, 2005 2:04 PM
To: veritas-bu AT mailman.eng.auburn DOT edu
Cc: Hindle, Greg
Subject: RE: [Veritas-bu] Jobs stuck in the queue?

I had a request to post my fix to auto-cancel these System_State jobs so
here it is below.  Note, this works off elapsed time of the job, which
starts counting when the job queues.  I'd rather work off the "attempt
elapsed" time but bpdbjobs doesn't seem to kick out that number in any
useful way.

As written, it cancels any active /System_State/ job with an elapsed
time over 24 hours.  Change the LOG file path to fit your environment.
I stuck mine in cron on a six-hour cycle.

==== Script start ====

#!/bin/ksh

# Kills stuck NT System_State backups.
# Mark Donaldson - 09/30/2005

# Max elapsed time for System_State backup (seconds)
maxtime=86400

PATH=$PATH:/usr/openv/netbackup/bin/admincmd
PROG=`basename $0`
LOG=/usr/openv/netbackup/logs/scripts/$PROG.log
TMP=/tmp/$PROG.tmp

# Logfile Management
exec >>$LOG 2>&1
if [ `wc -l $LOG | awk '{print $1}'` -gt 2000 ]
then
  cp $LOG $TMP
  echo "Logfile Truncated: `date`" >$LOG
  tail -1000 $TMP >>$LOG
  rm -f $TMP
fi

## Main
for jobid in ` bpdbjobs -most_columns | \
    awk -F, '$2==0 && $3==1 && $10>'$maxtime' \
    && $17~/^\/System_State\// && $22==13 {print $1}'`
do
  echo "Cancelling $jobid :: `date`"
  bpdbjobs -cancel $jobid
done
exit





-----Original Message-----
From: veritas-bu-admin AT mailman.eng.auburn DOT edu
[mailto:veritas-bu-admin AT mailman.eng.auburn DOT edu]On Behalf Of
Mark.Donaldson AT cexp DOT com
Sent: Thursday, September 29, 2005 9:08 AM
To: jpiszcz AT servervault DOT com; veritas-bu AT mailman.eng.auburn DOT edu
Subject: RE: [Veritas-bu] Jobs stuck in the queue?


I'm getting the \system state\ ones stuck frequently but a "cancel" from
the GUI clears them.  If I check the details screen on the GUI, it shows
the tape moving from fragment to fragment but no data moving.

I'm getting ready to write something that looks for these & kills them if
they've been running more than 12 hours or so.

I'm waiting for another one so I can grab the bpdbjobs entry.

-M
-----Original Message-----
From: veritas-bu-admin AT mailman.eng.auburn DOT edu
[mailto:veritas-bu-admin AT mailman.eng.auburn DOT edu]On Behalf Of Piszcz,
Justin
Sent: Thursday, September 29, 2005 8:41 AM
To: veritas-bu AT mailman.eng.auburn DOT edu
Subject: [Veritas-bu] Jobs stuck in the queue?


When using ALL_LOCAL_DRIVES (w/ NEW_STREAM) under it, sometimes I get
some jobs (mainly System_State:\ or Shadow_Copy_Components:\) that get
stuck in the queue.
 
There is no way to delete them except stop and start the NB server
processes.
 
Anyone ever experience anything like that before?
 
Using 5.1mp3a on Solaris 8.

