Veritas-bu

Re: [Veritas-bu] Real time failure notification

2009-02-05 09:01:43
Subject: Re: [Veritas-bu] Real time failure notification
From: Travis Kelley <rhatguy AT gmail DOT com>
To: "Donaldson, Mark" <Mark.Donaldson AT staples DOT com>, veritas-bu AT mailman.eng.auburn DOT edu
Date: Thu, 5 Feb 2009 08:37:26 -0500
Not really.  We have NetBackup configured to retry a backup if it
fails, so there may be multiple "attempts" under the same jobid.  I
don't want an alert if NetBackup is already running the backup
again as another attempt.  I only want an alert after
NetBackup has tried the configured number of "attempts" and is failing
the jobid.  The way I see it, if something is really broken it usually
won't take long to run through the 5 attempts, fail the job and alert.
But if the box just got too busy and timed out, or if
the backup process was killed unintentionally, I'd rather NetBackup
handle the retry on its own and not alert me.
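
To make the rule concrete, here is a minimal Python sketch.  It assumes
the attempt records have already been pulled out of NetBackup somehow
and parsed into (jobid, client, attempt, status) tuples.  The record
layout, MAX_ATTEMPTS and clients_to_alert are hypothetical names for
illustration, not anything NetBackup provides, and treating every
nonzero status as a failure is a simplification.

```python
from collections import namedtuple

# Hypothetical record layout for illustration; not a NetBackup format.
Attempt = namedtuple("Attempt", "jobid client attempt status")

MAX_ATTEMPTS = 5  # assumed to match the retry count configured in NetBackup


def clients_to_alert(attempts, max_attempts=MAX_ATTEMPTS):
    """Return the sorted clients that have at least one job whose final
    configured attempt ended with a nonzero status."""
    latest = {}
    for rec in attempts:
        # Keep only the highest-numbered attempt seen for each jobid.
        current = latest.get(rec.jobid)
        if current is None or rec.attempt > current.attempt:
            latest[rec.jobid] = rec

    failed_clients = set()
    for rec in latest.values():
        # Alert only once the retries are exhausted and the last try failed.
        if rec.attempt >= max_attempts and rec.status != 0:
            failed_clients.add(rec.client)  # a set yields one alert per client
    return sorted(failed_clients)
```

Collecting clients in a set is what collapses several failed streams
(c: and d: on the same box) into a single alert, and a job that is
still short of its last attempt never shows up at all.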

On 2/4/09, Donaldson, Mark <Mark.Donaldson AT staples DOT com> wrote:
> Your "don't alert if retry was successful" automatically excludes the
> idea of a real-time monitor.
>
> It's a bit like saying "Don't alert if you're going to succeed in the
> future".
>
> We "solved" this by creating an after-the-fact monitor for our backups -
> it searches the bpdbjobs output daily and parses that down to return
> code, policy, client, & fileset.  If a fileset fails more than X days in
> a row (without a success in there somewhere) then it's reported on as an
> "endangered fileset".
>
> It's been decently effective.
>
> -M
>
> -----Original Message-----
> From: veritas-bu-bounces AT mailman.eng.auburn DOT edu
> [mailto:veritas-bu-bounces AT mailman.eng.auburn DOT edu] On Behalf Of Travis
> Kelley
> Sent: Wednesday, February 04, 2009 8:36 AM
> To: veritas-bu AT mailman.eng.auburn DOT edu
> Subject: [Veritas-bu] Real time failure notification
>
> Hi all.  I'm trying to find a solution to a monitoring problem we
> have.  I would like to create a mechanism to alert when a backup fails
> but to only send one alert if multiple streams from a backup fail.
> For instance, if c: and d: both fail for a particular box, I only want
> 1 alert.  Also, if a job fails twice but is successful on the third
> attempt, I don't want an alert at all.  I only want to be alerted once,
> when NetBackup "gives up" on retrying a backup and fails the job.
> I've looked at backup_exit_notify but haven't been able to find a good
> way to implement this here.  Any ideas?
>
> --
> Sent from my mobile device
> _______________________________________________
> Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
> http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
>
>
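
For reference, the after-the-fact "endangered fileset" check described
in the quoted message could look roughly like the Python sketch below.
It assumes the daily bpdbjobs output has already been reduced to rows of
(day, policy, client, fileset, status), with the day as an ISO date
string.  DailyResult and endangered_filesets are hypothetical names, and
"days in a row" here means consecutive days that appear in the data, not
calendar days.

```python
from collections import defaultdict, namedtuple

# Hypothetical daily summary row; not the raw bpdbjobs output format.
DailyResult = namedtuple("DailyResult", "day policy client fileset status")


def endangered_filesets(results, max_days):
    """Return sorted (policy, client, fileset) keys whose most recent run
    of failing days, with no success in between, exceeds max_days."""
    by_key = defaultdict(dict)
    for r in results:
        key = (r.policy, r.client, r.fileset)
        # A day counts as a success if any job for the fileset returned 0.
        by_key[key][r.day] = by_key[key].get(r.day, False) or r.status == 0

    endangered = []
    for key, days in by_key.items():
        streak = 0
        # Walk back from the newest day until the first success.
        for day in sorted(days, reverse=True):
            if days[day]:
                break
            streak += 1
        if streak > max_days:
            endangered.append(key)
    return sorted(endangered)
```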
