BackupPC-users

Re: [BackupPC-users] I think I found a bug with 'backups' state files

2014-03-13 10:40:08
From: <backuppc AT kosowsky DOT org>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Thu, 13 Mar 2014 10:37:51 -0400
Markus wrote at about 12:21:55 +0100 on Thursday, March 13, 2014:
 > Hi list,
 > 
 > I think I found a bug with the pc/backups "state" files, or with file 
 > locking or something like that, not sure what the correct terminology 
 > is. I'm running BackupPC-3.2.1-7.el6.x86_64 on CentOS release 6.2 
 > (Final) 64bit.
 > 
 > Here's what happened:
 > 
 > 1. Every now and then, like every 4 or 5 months, my backup server 
 > crashes, out of memory (not enough RAM) or something like that. When 
 > it's not responding anymore, like it was the case today, when no login 
 > on the console is possible anymore and CTRL+ALT+DEL won't work, I have 
 > to hard reboot it.

Sounds like you found a bug in your CentOS setup (or "something like
that", to use your lay terminology).
 > 
 > 2. My storage is on a QNAP NAS via iSCSI and it's mounted as 
 > /var/lib/BackupPC. When the server comes back online after a reboot, 
 > everything is fine, all the files are there... almost... because pretty 
 > much the most important files when it comes to BackupPC are *sometimes* 
 > zeroed, namely the 'backups' files in each host's directory 
 > (/var/lib/BackupPC/pc/hostname/backups).

So your server crashes suddenly with the filesystem mounted
read-write, several BackupPC-related files open, and active parts of
the BackupPC process cached in memory... and you are SURPRISED that a
file may be 'zeroed'? And somehow your server running out of memory
and crashing is a BackupPC bug?

 > 
 > 3. So in this particular case, after the reboot, BackupPC reported under 
 > Host Summary: 6 hosts that have been backed up, 21 hosts with no 
 > backups. Wait... it should say 27 hosts have been backed up. So I had a 
 > look at the pc/ directories for all the hosts and the 'backups' file had 
 > 0 bytes for those 21 hosts.

File corruption after an unclean shutdown of an actively mounted
read-write partition is to be expected... be thankful you didn't lose
backup data.
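
If you want to check for this after a crash without eyeballing every
directory, something along these lines would do it. This is only a
rough sketch in Python, not anything shipped with BackupPC; the
function name find_empty_backups is made up for illustration, and the
pc directory is the one from your message.

```python
from pathlib import Path


def find_empty_backups(pc_dir):
    """Return sorted host names whose 'backups' file is missing or 0 bytes."""
    empty = []
    for host_dir in sorted(Path(pc_dir).iterdir()):
        if not host_dir.is_dir():
            continue  # ignore stray files directly under pc/
        state = host_dir / "backups"
        if not state.is_file() or state.stat().st_size == 0:
            empty.append(host_dir.name)
    return empty
```

Calling find_empty_backups("/var/lib/BackupPC/pc") after a reboot
would list the hosts that need attention before BackupPC starts
treating them as having no backups.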

 > 4. Fortunately, for most hosts, a backups.old file existed which was 
 > just a few hours older than the backups file, so I copied backups.old to 
 > backups for each host and the backups for 23 hosts now show up again in 
 > the web frontend. Not sure about any inconsistencies but I'm guessing it 
 > doesn't matter or it will figure it out by itself. :)

That's why there are backups... If your backup server is that
unstable, either fix it, reboot it cleanly before it runs out of
memory, or keep copies of the key files (including the BackupPC log
files) so that you have an extra copy when the inevitable filesystem
corruption occurs.
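
A minimal sketch of the "keep copies of the key files" option, again
in Python and purely illustrative: save_state_files is a hypothetical
helper, and the destination directory is whatever you choose
(preferably on a different filesystem than the iSCSI volume). It
skips empty files so that a zeroed 'backups' file never replaces a
good saved copy.

```python
import shutil
from pathlib import Path


def save_state_files(pc_dir, dest_dir):
    """Copy each non-empty pc/<host>/backups file to dest_dir/<host>/backups.

    Returns the sorted list of host names that were copied.
    """
    saved = []
    for host_dir in sorted(Path(pc_dir).iterdir()):
        src = host_dir / "backups"
        if not src.is_file() or src.stat().st_size == 0:
            continue  # never overwrite a good copy with an empty one
        target_dir = Path(dest_dir) / host_dir.name
        target_dir.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, target_dir / "backups")  # keeps timestamps
        saved.append(host_dir.name)
    return saved
```

Run from cron every few hours, this would give you a third copy
besides backups and backups.old.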
 > 
 > 5. Except for 4 hosts where the backups.old file was also 0 bytes. So 
 > these backups are lost, although the actual backed up data is still 
 > there of course, but I won't bother with recreating the 'backups' file 
 > by hand or something like that, it's not that important, and in 1-2 days 
 > I'll have a good, useable backup again. It will just start from scratch 
 > for those 4 hosts.

Why does backups.old have 0 bytes? I highly doubt that a closed,
inactive file like backups.old is randomly lost unless your unclean
shutdown has caused a lot more filesystem corruption than you
realize. Perhaps there is a greater underlying issue with your backup
setup.

 > 6. I "lost" these 4 hosts because I'm not backing up the 'backups' 
 > files. :P   Well, this is the second time now this has happened. I would 
 > say it happens in about 33% of the cases where I have to hard reboot the 
 > server. Please note: all other files are there! Just the 'backups' state 
 > files are 0 bytes, well, for most hosts.

Sounds like you "lost" stuff because you have an unstable system that
randomly shuts down completely.

 > 7. My mount options are (mount -v output): /dev/mapper/data-rz--nas01 on 
 > /var/lib/BackupPC type xfs (rw,_netdev)
 > 
 > I'm speculating BackupPC needs a better "locking" feature of some sorts? 
 > Or another mechanism that this can't happen again. Or maybe I need to 
 > upgrade to the latest version?

I'm not even speculating: you need to stop your server from shutting
down uncleanly. BackupPC is the victim of an unclean filesystem
shutdown. It is not a design principle of BackupPC (or, for that
matter, of a standard CentOS setup or filesystem implementation) to
ensure that active and open files survive an unclean shutdown. Indeed,
for the sake of performance, much pending information is often cached
in memory and not written immediately to disk.

Presumably, the current backups file is read into memory and copied
over to backups.old (as a backup); the old file is then cleared in
preparation for writing out the updated information; and finally the
updated/new data is written out to disk when the backup is finished.

Perhaps one could have written the routine to append to the old
information instead, to avoid the issue you are having (though this
has some downsides). However, surviving regular, unclean server
crashes is not one of the design considerations for BackupPC. Plus,
for the unlikely and infrequent event that a server does crash this
way, the developer has kindly created a backup copy before
overwriting the file...
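
For what it's worth, the usual way to make this kind of update
survive a crash is neither appending nor clear-and-rewrite, but
writing a temporary file, forcing it to disk, and renaming it over
the original. The sketch below shows that general technique in
Python. It is NOT how BackupPC (which is Perl) does it, and
atomic_write is a hypothetical name. Note also that the temporary
file gets default restrictive permissions, which real code would
need to adjust.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace path with data (bytes) so a crash leaves old or new content."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must live in the same directory (same filesystem)
    # for the rename to be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # push file data out of the page cache
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # fsync the directory so the rename itself is on disk (POSIX only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The cost is an fsync on every update, which is exactly the sort of
performance trade-off mentioned above.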

BTW, I have a 12-year-old P4 server with 2 GB of memory that has
crashed only once in all that time (and that was due to a bug I found
in NTP, triggered by the insertion of a leap second one year on
Dec 31). I have had uptimes of up to 3 1/2 years...

 > Thank you!
 > Markus
 > 
 > _______________________________________________
 > BackupPC-users mailing list
 > BackupPC-users AT lists.sourceforge DOT net
 > List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
 > Wiki:    http://backuppc.wiki.sourceforge.net
 > Project: http://backuppc.sourceforge.net/
