Re: [BackupPC-users] Problems with hardlink-based backups...

2009-08-18 10:29:20
Subject: Re: [BackupPC-users] Problems with hardlink-based backups...
From: David <wizzardx AT gmail DOT com>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Tue, 18 Aug 2009 16:25:43 +0200
Thanks for the replies.


On Mon, Aug 17, 2009 at 3:05 PM, Les Mikesell <lesmikesell AT gmail DOT com> wrote:
> You can exclude directories from the updatedb runs

That only works if the data you want to exclude (such as older snapshots)
is kept in a relatively small number of directories. Otherwise you need to
make a lot of exclude rules, like one for each backup. In my case,
each backed-up server, user PC, etc. is independent and has its own
directory structure with snapshots, etc.

And actually BackupPC also has a problematic layout for locate exclude rules:

__TOPDIR__/pc/$host/nnn <- One of those directories for each backup version.

So basically, if you have a large number of files on a server, it
seems like you need to entirely exclude the server from updatedb,
otherwise the snapshot directories are going to cause a huge updatedb
database.

Which kind of defeats the point of having updatedb running on the
backup server. Which is why I've disabled it here :-(.
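
For illustration, the per-backup exclude rules could be generated rather
than written by hand. This is only a sketch under assumptions: it assumes
the pc/$host/nnn layout shown above, an mlocate-style updatedb.conf with a
whitespace-separated PRUNEPATHS setting, and that the newest backup of each
host should stay indexed. The function names and the `keep` parameter are
made up for this example. The list would also have to be regenerated after
every backup run, so it does not remove the underlying problem.

```python
import os
import re


def prune_paths(topdir, keep=1):
    """List numbered backup dirs under topdir/pc/<host>/ to exclude,
    leaving the `keep` highest-numbered backups of each host indexable."""
    pc_dir = os.path.join(topdir, "pc")
    paths = []
    for host in sorted(os.listdir(pc_dir)):
        host_dir = os.path.join(pc_dir, host)
        if not os.path.isdir(host_dir):
            continue
        # Backup directories are the purely numeric entries (the "nnn" above).
        names = [
            n for n in os.listdir(host_dir)
            if re.fullmatch(r"[0-9]+", n)
            and os.path.isdir(os.path.join(host_dir, n))
        ]
        names.sort(key=int)
        old = names[:-keep] if keep > 0 else names
        paths.extend(os.path.join(host_dir, n) for n in old)
    return paths


def prunepaths_line(paths):
    # Assumption: PRUNEPATHS is whitespace-separated, so paths that
    # contain spaces are not handled by this sketch.
    return 'PRUNEPATHS="%s"' % " ".join(paths)
```

With many hosts and many generations the resulting line gets very long,
which is the "lot of exclude rules" problem in another form.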

> Du doesn't make any files unless you redirect its output

Usually I make du files on servers, so I can copy the files back to my
workstation, and use a graphical tool like xdiskusage to get a better
idea of where space is used.

> - and it can be constrained to the relevant top
> level directories with the -s option.

Yep, but it is still going to take days :-(. And then afterwards you
often still need to run 'du' on those lower levels to see where the
space is actually going.

> Backuppc maintains its own status showing how much space the pool uses and how
> much is left on the filesystem. So you just look at that page often enough to
> not run out of space.

Sounds like a 'df'-like display on the web page, but for the backuppc
pool rather than a partition.

Please correct me if I'm mistaken, but that doesn't really help people
who want to find which files and dirs are taking up the most space, so
they can address it (like, tweak the number of backed up generations,
or exclude additional directories/file patterns, etc).

Normally people use a tool like 'du' for that, but 'du' itself is next
to unusable when you have a massive filesystem, which can easily be
created by hardlink snapshot-based backup systems :-(

>
> Backuppc won't start a backup run if the disk is more than 95%
> (configurable) full.
>

Sounds useful, but it doesn't really address my problem of 'du' (and
locatedb, and others) having major problems with this kind of backup
layout.

>
> It is best done pro-actively, avoiding the problem instead of trying to fix it
> afterwards because with everything linked, it doesn't help to remove old
> generations of files that still exist.  So generating the stats daily and
> observing them (both human and your program) before starting the next run is
> the way to go.
>

1. Removing old generations does help. The idea is to remove the old
"churn" that took place in that version. In other words, files which
no longer have any references once that generation is removed
(because all previous generations that referred to those files via hard
links are also gone by that point).

2. Proactive is good, but again, with a massive directory structure,
it's hard to use tools like du to check which backups you need to
fine-tune, prune, etc.
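
Point 1 can be sketched in code. The idea is that a file's space is only
freed when its last hard link goes away, so a generation's "churn" is the
set of inodes whose links all live inside that generation's directory.
This is a rough sketch, not how any particular backup tool does it: the
function name and the `extra_links` parameter are made up here, it uses
apparent file sizes rather than allocated blocks, and it still has to stat
every file in the snapshot. `extra_links` is meant for pool-based layouts
where one more link to each file is assumed to live outside the snapshot
tree and to be cleaned up separately.

```python
import os
import stat


def reclaimable_bytes(snapshot_dir, extra_links=0):
    """Sum sizes of regular files whose hard links all live in snapshot_dir.

    extra_links: number of links per file assumed to exist elsewhere
    (e.g. in a pool) that will also go away once the snapshot is removed.
    """
    links_seen = {}  # (device, inode) -> links found under snapshot_dir
    info = {}        # (device, inode) -> (total link count, size in bytes)
    for root, _dirs, files in os.walk(snapshot_dir):
        for name in files:
            st = os.lstat(os.path.join(root, name))
            if not stat.S_ISREG(st.st_mode):
                continue
            key = (st.st_dev, st.st_ino)
            links_seen[key] = links_seen.get(key, 0) + 1
            info[key] = (st.st_nlink, st.st_size)
    # An inode is freed only if no link survives outside the snapshot.
    return sum(
        size
        for key, (nlink, size) in info.items()
        if links_seen[key] + extra_links >= nlink
    )
```

A file that is still linked from a newer generation contributes nothing,
which matches the point that removing a generation only frees the files
no other generation refers to.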

>
> Also, you really want your backup archive on its own mounted filesystem so it
> doesn't compete with anything else for space and to give you the possibility
> of doing an image copy if you need a backup since other methods will be too
> slow to be practical.  And 'df' will tell you what you need to know about a
> filesystem fairly quickly.
>

Our backups are stored on an LVM volume which is used only for backups. But
again, the problem is not disk usage causing issues for other
processes. The problem is, once the allocated area is running out of
space, how to check *where* that space is going, so you can take
informed action. 'df' is only going to tell you that you're low on
space, not where the space is going.
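
To make the "where is the space going" question concrete, here is one
possible way to attribute space across hardlinked generations: charge each
inode to the first snapshot in which it appears. This is only a sketch.
The function name is made up, the snapshot directories are assumed to be
passed oldest first, sizes are apparent sizes rather than allocated
blocks, and it still walks every directory entry, so it does nothing about
the run time on a massive tree.

```python
import os
import stat


def new_bytes_per_snapshot(snapshot_dirs):
    """Map each snapshot dir to the bytes of inodes first seen in it.

    snapshot_dirs must be ordered oldest first.
    """
    seen = set()  # (device, inode) pairs already charged to a snapshot
    result = {}
    for snap in snapshot_dirs:
        total = 0
        for root, _dirs, files in os.walk(snap):
            for name in files:
                st = os.lstat(os.path.join(root, name))
                key = (st.st_dev, st.st_ino)
                if stat.S_ISREG(st.st_mode) and key not in seen:
                    seen.add(key)
                    total += st.st_size
        result[snap] = total
    return result
```

A generation with an unusually large figure is a hint about when a big
change arrived, and so where an extra exclude pattern might pay off.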

- David.
