[BackupPC-users] Programatically identifying errors in the log file

2009-06-08 17:35:18
From: John Rouillard <rouilj-backuppc AT renesys DOT com>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Mon, 8 Jun 2009 21:27:11 +0000
Hi all:

I am using BackupPC 3.1.0 and 3.2.0 beta 0 at the moment.

I am modifying a Nagios plugin to scan the backup logs and verify
that the backups completed successfully. Ideally I would just look for
the backup completion time, verify that it's recent enough, and then
verify that there were 0 errors during the backup.

However, we use daemontools, and as a result log files get rotated out
from under the BackupPC run, which the backup reports as errors. So I
can't just look for 0 errors; I need to count the errors while
filtering out certain known (and acceptable) classes of errors.
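
A minimal sketch of that counting step, assuming the candidate error
lines arrive one per line on stdin. The function name
count_unexpected_errors and the ACCEPTABLE_RE pattern are illustrative
only; which error classes are actually acceptable is a site-specific
decision.

```shell
#!/bin/sh
# Hypothetical extended regex describing the error classes we tolerate.
# (Assumption: vanished-file errors caused by log rotation are acceptable.)
ACCEPTABLE_RE='file has vanished'

# Read candidate error lines on stdin and print how many do NOT match
# ACCEPTABLE_RE. Prints 0 when every line is acceptable or stdin is empty.
count_unexpected_errors() {
    awk -v ok="$ACCEPTABLE_RE" '$0 !~ ok { n++ } END { print n + 0 }'
}
```

A Nagios check could then treat any count above 0 as a failure.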

Are there rules for identifying true errors in the XferLOG.XXX.z
files?

So far I have the sed script:

  sed -n -e '/^Executing/,/^Xfer PIDs are now [0-9]*,[0-9]*$/d' \
         -e '/^Done: /,/^Xfer PIDs are now [0-9]*,[0-9]*$/d' \
         -e '/^[^[:space:]]/p'

This skips the startup sections in the log file:

   everything from the predump command (the line starts with
      Executing) to the point where there are 2 xfer
      processes/PIDs running

   everything from the Done: line (of the prior backup) to the next
      xfer startup.

For example, it skips:

  Done: 144 files, 1109632572 bytes
  incr backup started back to 2009-06-02 15:37:43 (backup #233) for
    directory /var
  Running: /usr/bin/ssh -e none -q -x -l backup -C -o CompressionLevel=9
    -c blowfish-cbc -o ServerAliveInterval=30 example.com /usr/bin/rsync
    --server --sender --numeric-ids --perms --owner --group -D --links
    --hard-links --times --block-size=2048 --recursive --one-file-system
    --bwlimit=62 --checksum-seed=32761 . /var
  Xfer PIDs are now 2868
  Got remote protocol 29
  Negotiated protocol version 28
  Checksum caching enabled (checksumSeed = 32761)
  Xfer PIDs are now 2868,2869

etc. Between these startup sections, I print everything that doesn't
start with whitespace. It does identify/report errors like:

  Remote[2]: file has vanished: "/etc/example/example-service/log/main/@400000004a26d70d325235b4.s"
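
As a sketch, the filter can be wrapped in a shell function so that a
plugin can call it on a decompressed log. The function name
xfer_candidates is made up for illustration. The second range here ends
at the two-PID line so that the protocol negotiation lines are skipped
as well, and [[:space:]] stands in for a literal space/tab.

```shell
#!/bin/sh
# Read an uncompressed XferLOG on stdin and print candidate error lines:
# non-indented, non-empty lines that fall outside the startup sections.
xfer_candidates() {
    sed -n \
        -e '/^Executing/,/^Xfer PIDs are now [0-9]*,[0-9]*$/d' \
        -e '/^Done: /,/^Xfer PIDs are now [0-9]*,[0-9]*$/d' \
        -e '/^[^[:space:]]/p'
}

# Hypothetical usage (the file name is a placeholder; BackupPC_zcat is the
# decompressor shipped with BackupPC, and its install path varies):
#   BackupPC_zcat XferLOG.233.z | xfer_candidates
```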

However, it will miss errors that occur within the startup sections. I
suppose I could generate a list of all the valid startup lines, but I
am not sure whether some of the phrases I would use to match valid
lines, e.g.:

  incr backup ...

could also start error lines.
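
One way to reduce that risk, sketched here under assumptions, is to
match the whole shape of each known startup line rather than just its
leading phrase, and to report anything inside a startup section that
fits none of them. The patterns below are derived only from the sample
lines in this message; whether BackupPC emits other legitimate startup
lines is not known here, and the function name startup_unrecognized is
made up.

```shell
#!/bin/sh
# Read an uncompressed XferLOG on stdin. Print lines that sit inside a
# startup section but match none of the known-good shapes listed below.
startup_unrecognized() {
    awk '
        /^(Executing|Done: )/ { in_startup = 1 }   # section begins
        in_startup            { print }            # emit section lines
        /^Xfer PIDs are now [0-9]+,[0-9]+$/ { in_startup = 0 }  # section ends
    ' |
    grep -v -E \
        -e '^$' \
        -e '^[[:space:]]' \
        -e '^Executing' \
        -e '^Done: [0-9]+ files, [0-9]+ bytes$' \
        -e '^incr backup started back to [0-9-]+ [0-9:]+ \(backup #[0-9]+\) for directory ' \
        -e '^Running: ' \
        -e '^Xfer PIDs are now [0-9]+(,[0-9]+)?$' \
        -e '^Got remote protocol [0-9]+$' \
        -e '^Negotiated protocol version [0-9]+$' \
        -e '^Checksum caching enabled \(checksumSeed = [0-9]+\)$' \
        || [ "$?" -eq 1 ]   # grep exits 1 when nothing is left: not an error
}
```

With this, a line that merely begins with "incr backup" but does not
have the full "started back to ... for directory" shape is reported
instead of being silently accepted.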

Is there a better way of identifying true errors in the log files?  Am
I missing an ERROR: prefix/suffix that can be used to identify the
true errors?

Thanks.

-- 
                                -- rouilj

John Rouillard       System Administrator
Renesys Corporation  603-244-9084 (cell)  603-643-9300 x 111
