Pieter,
Thank you for kindly providing this script. I also have an accounting need: to
get some sort of reasonable estimate of how much space the backups are
occupying, and I don't want to run a du.
Forgive my ignorance, but can you give human-readable explanations of the
abbreviations in the output?
total: alloc=x dalloc=x dentries=x dsize=x falloc=x fcount=x fsize=x
I have guesses, but I would prefer to hear it from the guy who wrote it.
Thanks!
Kyle Anderson
Tummy.com
Pieter Wuille wrote:
> On Tue, Dec 01, 2009 at 09:28:50AM -0500, Jeffrey J. Kosowsky wrote:
>> Pieter Wuille wrote at about 13:18:33 +0100 on Tuesday, December 1, 2009:
>> > What you can do is count the allocated space for each directory and file, but
>> > divide the numbers for files by (nHardlinks+1). This way you end up
>> > distributing the size each file takes on disk over the different backups it
>> > belongs to.
>> >
>> > I have a script that does this; if there's interest I'll attach it. It does
>> > take a day (wild guess, never accurately measured) to go over all pc/*
>> > directories (the pool is 370.65GB, comprising 4237093 files and 4369
>> > directories).
>>
>> I am surprised that it would take a day.
> The server is quite busy making backups, and rsync'ing to an offsite backup
> server at the same time -- especially the latter puts some serious load on
> I/O, I assume.
>
>> The only real cost should be that of doing a 'find' and a 'stat' on
>> the pc tree - which I would do in perl so that I could do the
>> arithmetic in place (rather than having to use a *nix find -printf to
>> pass it off to another program).
> Yes, it is a perl script.
>
>> Unless you have a huge number of pc's and backups, I can't imagine
>> this would take more than a couple of hours since your total number of
>> unique files is only about 4 million.
> We have 4 million unique inodes. We do however have some 20-25 million
> directory entries, which is what the script needs to read through.
>
>> Given that you only have 4 million unique files, you could even avoid
>> the multiple stats at the cost of that much memory by caching the
>> nlinks and size by inode number.
> Except that the script already needs to do a stat per directory entry in order
> to know the inode number itself...
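
The trade-off being discussed can be illustrated with a short sketch. This is a hypothetical Python illustration (the actual script is Perl), not code from the thread: one lstat() per directory entry is unavoidable, because the inode number comes from the stat itself -- but that same stat already returns the link count and size, so a cache keyed on the inode number only saves repeated bookkeeping, not I/O.

```python
import os

# Hypothetical sketch of the inode-cache idea (Python for illustration;
# the real script is Perl). Each directory entry still costs one lstat()
# to learn its inode number, but caching (nlink, size) by st_ino means
# an inode seen in many backups is only accounted for once.
def walk_with_cache(top):
    cache = {}   # st_ino -> (st_nlink, st_size)
    entries = 0  # total directory entries visited
    for dirpath, _, filenames in os.walk(top):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))  # needed for st_ino
            entries += 1
            cache.setdefault(st.st_ino, (st.st_nlink, st.st_size))
    return entries, cache
```

With 20-25 million directory entries but only ~4 million unique inodes, such a cache trades a few hundred MB of memory for skipping the per-inode arithmetic -- but, as noted above, not the stats themselves.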
>
>> Can you post your script?
>
> See attachment. You can run, e.g.:
>
> ./diffsize.pl /var/lib/backuppc/pc/*
>
> to see values per host, and a total.
>
> PS: it actually (correctly) divides by (nHardLinks-1) instead of +1 (as I
> claimed earlier).
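
The accounting rule described in this thread -- directories charged in full, file sizes split over their hard links minus the one link belonging to the pool -- can be sketched roughly as follows. This is a hypothetical Python illustration, not Pieter's actual diffsize.pl (which is Perl), and it uses st_size for simplicity where the real script counts allocated space:

```python
import os

# Rough sketch (hypothetical, for illustration only) of per-backup space
# accounting: directories count in full; each file contributes
# size / (nlink - 1), since one of its hard links is the pool entry
# rather than any backup. Summed over all links in the pc tree, a file
# in N backups contributes its full size exactly once.
def estimated_size(top):
    total = 0.0
    for dirpath, _, filenames in os.walk(top):
        total += os.lstat(dirpath).st_size  # directories charged in full
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            links = max(st.st_nlink - 1, 1)  # guard against unpooled files
            total += st.st_size / links
    return total
```

Run over each host directory under pc/, this gives per-host estimates that sum (roughly) to the pool's true usage, which is the point of the divide-by-links trick.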
>
> kind regards,
>
>
>
> _______________________________________________
> BackupPC-users mailing list
> BackupPC-users AT lists.sourceforge DOT net
> List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
> Wiki: http://backuppc.wiki.sourceforge.net
> Project: http://backuppc.sourceforge.net/