On Tue, Dec 01, 2009 at 09:28:50AM -0500, Jeffrey J. Kosowsky wrote:
> Pieter Wuille wrote at about 13:18:33 +0100 on Tuesday, December 1, 2009:
> > What you can do is count the allocated space for each directory and
> > file, but divide the numbers for files by (nHardlinks+1). This way
> > you end up distributing the size each file takes on disk over the
> > different backups it belongs to.
> >
> > I have a script that does this; if there's interest I'll attach it. It does
> > take a day (wild guess, never accurately measured) to go over all pc/*
> > directories (Pool is 370.65GB comprising 4237093 files and 4369
> > directories)
>
> I am surprised that it would take a day.
The server is quite busy making backups, and rsync'ing to an offsite backup
server at the same time -- especially the latter puts some serious load on
I/O, I assume.
> The only real cost should be that of doing a 'find' and a 'stat' on
> the pc tree - which I would do in Perl so that I could do the
> arithmetic in place (rather than having to use a *nix find -printf to
> pass it off to another program).
Yes, it is a Perl script.
> Unless you have a huge number of pc's and backups, I can't imagine
> this would take more than a couple of hours since your total number of
> unique files is only about 4 million.
We have 4 million unique inodes. We do however have some 20-25 million
directory entries, which is what the script needs to read through.
> Given that you only have 4 million unique files, you could even avoid
> the multiple stats at the cost of that much memory by caching the
> nlinks and size by inode number.
Except that the script already needs to do a stat per directory entry in order
to know the inode number itself...
> Can you post your script?
See attachment. You can run, e.g.:
./diffsize.pl /var/lib/backuppc/pc/*
to see values per host, and a total.
PS: it actually (correctly) divides by (nHardLinks-1) instead of the +1 I
claimed earlier: every pooled file carries one extra hard link from the pool
directory itself, so a file present in N backups has N+1 links.
kind regards,
--
Pieter
[Attachment: diffsize.pl]