BackupPC-users

Re: [BackupPC-users] How to manage disk space?

From: Dave Sill <de5-backuppc AT sws5.ornl DOT gov>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Wed, 15 Apr 2015 15:00:07 -0400
Thanks for replies. Don't know how I missed the Host Summary page, but
that's useful.

Holger Parplies <wbppc AT parplies DOT de> wrote:
> 
> Les Mikesell wrote on 2015-04-14 09:34:35 -0500 [Re: [BackupPC-users] How to 
> manage disk space?]:
> > On Mon, Apr 13, 2015 at 4:57 PM,  <backuppc AT kosowsky DOT org> wrote:
> > > Dave Sill wrote at about 15:28:49 -0400 on Monday, April 13, 2015:
> > >  > We've been using BackupPC for a couple years and have just encountered
> > >  > the problem of insufficient disk space on the server. [...]
> > >  >
> > >  > What I'd like to know is (1) where is the disk space going,
> > > To store your backups
> > >
> > >  > and (2) how can I adjust BackupPC to use less space?
> > > Save fewer backups or back up fewer machines
> 
> Jeffrey has a point here. You don't give us much detail to guess on. "A couple
> dozen Linux servers" can mean just about anything.

Well, yeah, but rather than spend hours collecting all of the
information that might potentially help, as a newbie who doesn't know
which details really matter, I thought I'd let people request further
info if it was needed. :-)

> > But more specifically, a likely problem is that you have some very
> > large files like databases, log files, virtual machine images or
> > mailboxes that change daily and thus are not pooled.
> 
> That is one possibility. Another would be keeping several years worth of daily
> history of large mail servers. Either your history is too long (for the disk
> space available), or your backups are too large, or most likely a combination
> of both. Backups may be too large either by design (you need to backup too
> much data) or by malfunction (you are backing up something you don't mean to
> backup).

I suspect they're too large by design. The user is the ORNL DAAC, a
NASA data archive. Pooling helps a lot on system files, I'm sure, but
the bulk of our holdings are data files that probably aren't stored
many times.

My immediate problem was that the disk was full and I needed to figure
out how to get backups running again without adding more space because
none was available. I could take systems/filesystems out of BackupPC
or adjust retention, but I had no idea how much space that would free
up or how quickly that would happen.

> Yet other possibilities would be that BackupPC_nightly is not running, or that
> linking is not working.
> 
> Then again, you might have meant to ask, "how do I find out where the disk
> space is going?".

I thought that's what I asked.

A corollary would be: how do I know that the space BackupPC is using
doesn't include a bunch of cruft like files from systems that have been
removed from BackupPC, or file systems that have been removed, ...
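
For what it's worth, one rough way to check for that kind of cruft is to
compare the directories under the pc/ tree against the hosts BackupPC
currently knows about. A minimal sketch, run here against a throwaway
directory (on a real server the paths would be something like
/var/lib/backuppc and /etc/BackupPC/hosts, depending on your install):

```shell
set -e
# Hypothetical layout: simulate a pc/ tree and a hosts file to show the idea.
topdir=$(mktemp -d)                 # stand-in for /var/lib/backuppc
mkdir -p "$topdir/pc/alpha" "$topdir/pc/beta" "$topdir/pc/old-host"
hosts_file="$topdir/hosts"          # stand-in for /etc/BackupPC/hosts
printf '# host dhcp user\nalpha 0 admin\nbeta 0 admin\n' > "$hosts_file"

# Configured host names: first column, skipping comments and blank lines.
awk '!/^#/ && NF {print $1}' "$hosts_file" | sort > /tmp/configured.txt
ls "$topdir/pc" | sort > /tmp/on-disk.txt

# Directories on disk with no matching configured host are cruft candidates.
cruft=$(comm -13 /tmp/configured.txt /tmp/on-disk.txt)
echo "cruft candidates: $cruft"
rm -r "$topdir"
```

Remember the pooling caveat below, though: a leftover pc/ tree only frees
space for content that isn't still linked from other backups.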

> I can't think of a good answer to that. BackupPC's pooling
> mechanism means that if you have 100 "copies" of one file content (all linked
> to one pool file by BackupPC), deleting 99 of them won't save you anything, as
> long as one remains. Put differently, one host *might* seem very large in
> terms of total backup size, yet share all files with other seemingly smaller
> hosts. You really have to look at your source data: what are you backing up,
> how often does it change, how unique is it? And you have to know your
> constraints. If you *need* to keep a long history of a large amount of data,
> there is nothing much you can do (except for getting more disk space). If you
> don't, the easiest option is to expire old backups and see what happens - just
> keep in mind that you don't get back any disk space for content still present
> in more recent backups.
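
Holger's point about pooling is easy to demonstrate with plain hard
links. A minimal sketch (file names made up; the link-count behavior is
exactly what BackupPC's pool relies on):

```shell
set -e
dir=$(mktemp -d)
# One "pool" file plus two backup "copies", all hard links to one inode,
# the way BackupPC links identical file contents into its pool.
printf 'file content\n' > "$dir/pool_file"
ln "$dir/pool_file" "$dir/backup1"
ln "$dir/pool_file" "$dir/backup2"
links_before=$(stat -c '%h' "$dir/pool_file")   # 3 names, one copy on disk

# "Deleting" one backup copy removes a name, not the data.
rm "$dir/backup1"
links_after=$(stat -c '%h' "$dir/pool_file")    # still linked; no space freed

echo "links before: $links_before, after: $links_after"
rm -r "$dir"
```

Disk space only comes back when the last link to a given content is
removed, which is why expiring backups can free much less than the
per-host totals suggest.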
> Reducing the size of existing backups is somewhat tricky, and reducing the
> size of future backups won't gain you anything until the old backups expire.
> 
> Actually, there might be a way to shed some light. I'd probably look for large
> files with a low link count (-links 2 or 3) in the pc/ tree. You need to be
> aware that 'find' will take a *long* time to traverse such a large pool. It
> just might be worthwhile to run a rather general 'find' command with output
> redirected to a file and then filter that repeatedly to narrow down your
> search, rather than running several different 'find' invocations. Or even
> looking in the {c,}pool/ rather than the pc/ tree (faster, but you don't get
> any file paths, just file content).
> 
> Running 'find $topdir/pc/$host/$num -type f -links -3 -ls' should give you an
> approximate list of files that would actually be deleted by deleting [only]
> backup $num of host $host ('-links -3' takes into account files for some
> reason not linked into the pool; in theory, these *should* all be zero length,
> but in case of some malfunction, they might not).
> 
> Much of that might not make any sense for your particular case, but I hope
> some of it helps.

Thanks, Holger, that does help.
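
For the archives, here is Holger's find approach as a runnable sketch.
It builds a throwaway directory standing in for $topdir/pc/$host/$num
(on a real server $topdir is typically /var/lib/backuppc), then sums the
sizes of low-link-count files to estimate how many bytes deleting that
one backup would actually reclaim:

```shell
set -e
backup_dir=$(mktemp -d)    # stand-in for $topdir/pc/$host/$num
pool=$(mktemp -d)          # stand-in for the {c,}pool tree

# A pooled file (one pool link, link count 2) and an unpooled one (count 1).
printf 'shared content\n' > "$backup_dir/pooled.dat"
ln "$backup_dir/pooled.dat" "$pool/abc123"
printf 'unique content\n' > "$backup_dir/unpooled.dat"

# Files with link count < 3 exist only in this backup (plus at most one
# pool link): deleting the backup would genuinely free their space.
# One slow traversal, saved to a file for repeated cheap filtering.
find "$backup_dir" -type f -links -3 -ls > /tmp/low-link-files.txt
count=$(wc -l < /tmp/low-link-files.txt)

# Size is field 7 of GNU 'find -ls' output; sum it for reclaimable bytes.
bytes=$(awk '{sum += $7} END {print sum+0}' /tmp/low-link-files.txt)
echo "$count files, $bytes bytes reclaimable"
rm -r "$backup_dir" "$pool"
```

On a real pool, `sort -k7,7nr /tmp/low-link-files.txt | head -20` on the
saved output is a quick way to see the largest candidates first.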

-Dave

_______________________________________________
BackupPC-users mailing list
BackupPC-users AT lists.sourceforge DOT net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/