[BackupPC-users] Programmatically identifying errors in the log file
2009-06-08 17:35:18
Hi all:
I am using BackupPC 3.1.0 and 3.2.0 beta 0 at the moment.
I am modifying a plugin for nagios to scan the backup logs and verify
that the backups completed successfully. Ideally I would just look for
the backup completion time, verify it's recent enough and then verify
that there were 0 errors during the backup.
However, we use daemontools, and as a result log files get rotated out
from under the BackupPC run. So I can't just look for 0 errors; I need
to count the errors while filtering out certain known (and acceptable)
classes of errors.
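A rough sketch of that counting step is shown here. It assumes the
candidate error lines arrive on stdin (for example, an XferLOG already
decompressed with BackupPC_zcat and filtered) and that the acceptable
classes are kept as one extended regex per line in a separate file. The
function name count_unexpected and the pattern file are hypothetical,
not anything BackupPC provides.

```shell
#!/bin/sh
# Hypothetical helper: count the lines on stdin that do NOT match any
# "acceptable" pattern. $1 names a file holding one extended regex per
# line; blank lines in that file are ignored.
count_unexpected() {
    awk -v pats="$1" '
        # Load the acceptable patterns before any input is read.
        BEGIN {
            while ((getline line < pats) > 0)
                if (line != "") ok[++n] = line
        }
        # Skip a line as soon as one acceptable pattern matches it;
        # otherwise count it as a true error.
        {
            for (i = 1; i <= n; i++)
                if ($0 ~ ok[i]) next
            count++
        }
        # Always print a number, even when nothing was counted.
        END { print count + 0 }
    '
}
```

With a pattern file containing the single line "file has vanished", the
vanished-file warnings would be excluded and the printed number could be
compared against 0 in the nagios check.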
Are there rules for identifying true errors in the XferLOG.XXX.z
files?
So far I have the sed script:
    sed -ne '/^Executing/,/^Xfer PIDs are now [0-9]*,[0-9]*$/d' \
        -e '/^Done: /,/^Xfer/d' \
        -e '/^[^ ]/p'
which skips the startup sections in the log file:
  - everything from the predump command (the line that starts with
    Executing) to the point where there are two xfer processes/PIDs
    running
  - everything from the Done: line (of the prior backup) to the next
    xfer startup.
For example, it skips:
Done: 144 files, 1109632572 bytes
incr backup started back to 2009-06-02 15:37:43 (backup #233) for
directory /var
Running: /usr/bin/ssh -e none -q -x -l backup -C -o CompressionLevel=9
-c blowfish-cbc -o ServerAliveInterval=30 example.com /usr/bin/rsync
--server --sender --numeric-ids --perms --owner --group -D --links
--hard-links --times --block-size=2048 --recursive --one-file-system
--bwlimit=62 --checksum-seed=32761 . /var
Xfer PIDs are now 2868
Got remote protocol 29
Negotiated protocol version 28
Checksum caching enabled (checksumSeed = 32761)
Xfer PIDs are now 2868,2869
and so on. Between these startup sections, I print every line that
doesn't start with a space. This does identify and report errors like:
Remote[2]: file has vanished:
"/etc/example/example-service/log/main/@400000004a26d70d325235b4.s"
However, it will miss errors that occur within the startup sections. I
suppose I could generate a list of all the valid startup lines, but I
am not sure whether some of the phrases I would use to match valid
lines, e.g.:
    incr backup ...
could also start error lines.
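One way to reduce that ambiguity would be to match each known startup
line in full rather than by its leading phrase. The sketch below is
only an illustration under that assumption: the function name
unexpected_lines is hypothetical, and the patterns are derived solely
from the sample output above (plus the Executing prefix mentioned
earlier), not from BackupPC's source, so the list is certainly
incomplete.

```shell
#!/bin/sh
# Hypothetical allow-list filter (name and patterns are illustrative).
# Reads a decompressed XferLOG on stdin and prints every non-indented
# line that is not recognised as a normal startup line.
unexpected_lines() {
    awk '
        # Indented or empty lines are skipped, as in the sed script.
        /^[ \t]/ || /^$/ { next }

        # Startup lines, anchored at both ends where the sample allows,
        # so a line that merely begins with the same phrase is still shown.
        /^Done: [0-9]+ files, [0-9]+ bytes$/ { next }
        /^incr backup started back to [0-9-]+ [0-9:]+ \(backup [#][0-9]+\) for directory / { next }
        /^Xfer PIDs are now [0-9]+(,[0-9]+)*$/ { next }
        /^Got remote protocol [0-9]+$/ { next }
        /^Negotiated protocol version [0-9]+$/ { next }
        /^Checksum caching enabled \(checksumSeed = [0-9]+\)$/ { next }

        # Free-form command lines can only be matched by prefix.
        /^Executing/ { next }
        /^Running: / { next }

        # Anything else is a candidate error.
        { print }
    '
}
```

Because nothing is skipped by range, a line such as "incr backup ..."
that does not have the exact startup shape would be printed instead of
silently dropped. The cost is that every new legitimate startup line
has to be added to the list by hand.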
Is there a better way of identifying true errors in the log files? Am
I missing an ERROR: prefix/suffix that can be used to identify the
true errors?
Thanks.
--
-- rouilj
John Rouillard System Administrator
Renesys Corporation 603-244-9084 (cell) 603-643-9300 x 111