BackupPC-users

Re: [BackupPC-users] Which FS? (was: Keeping servers in sync)

2009-09-06 22:38:20
From: higuita <higuita AT GMX DOT net>
To: backuppc-users AT lists.sourceforge DOT net
Date: Mon, 7 Sep 2009 03:35:37 +0100
hi

> Now the typical
> circumstance is that if you copy 1000 files during a backup, those files
> will likely be accessed somewhat in sequence when you want to retrieve the
> files or whatever.  The problem is that some of the files are physically
> located somewhere else on the disk due to the hardlinks.

        So you are talking about filesystem fragmentation and scattered
        HD usage... this is not a problem specific to backuppc, it affects
        all apps: every time a file is updated, there is a high chance of
        increasing the "fragmentation" of related data.

        there is little you can do against it; smart defrags and smarter
        filesystems are the only solutions

> de-duplicated.  performance wise, it would be better to have backed up
> those files again and have their data and inodes close together
> clustered with the rest of the files that were backed up from that host
> so that the disk head wouldn't have to continuously go to the beginning
> of the disk.

        unless you have thousands of very small files, it is always
        faster to just look up where the file is on the HD than to
        transfer it again, random access or not
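
        a rough way to put numbers on this (my own sketch, not anything
        backuppc computes): the seek time, the seek counts and the link
        speed below are all assumed values, and the function names are
        made up for illustration. the model charges a pool lookup only
        for its seeks, and a re-transfer for the network time plus the
        seeks needed to write the new copy.

```python
def lookup_cost_s(lookup_seeks, seek_ms):
    """Seconds spent seeking to find an already-pooled file."""
    return lookup_seeks * seek_ms / 1000.0


def retransfer_cost_s(size_bytes, bytes_per_s, write_seeks, seek_ms):
    """Seconds to pull the file over the network and write a new copy."""
    return size_bytes / bytes_per_s + write_seeks * seek_ms / 1000.0


def break_even_bytes(lookup_seeks, write_seeks, seek_ms, bytes_per_s):
    """File size above which the lookup is cheaper than a re-transfer.

    Returns 0.0 when the lookup needs no more seeks than the write,
    i.e. when the lookup wins for every file size in this model.
    """
    extra_s = (lookup_seeks - write_seeks) * seek_ms / 1000.0
    return max(0.0, extra_s * bytes_per_s)


# Assumed numbers: 10 ms per seek, 100 Mbit/s link (12.5e6 bytes/s),
# 3 seeks to find the pooled file versus 1 seek to write a new one.
size = break_even_bytes(lookup_seeks=3, write_seeks=1,
                        seek_ms=10.0, bytes_per_s=12.5e6)
print(f"lookup wins above roughly {size / 1000:.0f} kB")
```

        with those assumed numbers the lookup only loses for files
        smaller than a few hundred kB, and only if finding the pooled
        file really costs more seeks than writing a fresh copy.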

        even if it's the same, there is no way to be sure that a new
        copy is stored near the next copied file; the filesystem
        decides where to allocate the inode, and over time you end up
        with a lot of scattered holes
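
        for reference, a hardlink is just a second name for an existing
        inode, so its data stays wherever the filesystem put it the
        first time. a small sketch on a POSIX filesystem (the file names
        are made up, this is not the real backuppc pool layout):

```python
import os
import tempfile


def link_demo():
    """Create a file, hardlink it, and report whether both names share
    one inode plus the resulting link count."""
    with tempfile.TemporaryDirectory() as d:
        pool = os.path.join(d, "pool_file")      # hypothetical pooled file
        backup = os.path.join(d, "backup_file")  # hypothetical backup entry
        with open(pool, "wb") as f:
            f.write(b"same content")
        os.link(pool, backup)  # adds a directory entry, copies no data
        a, b = os.stat(pool), os.stat(backup)
        return a.st_ino == b.st_ino, a.st_nlink


same_inode, links = link_demo()
print(same_inode, links)
```

        both names report the same inode number and the link count goes
        up by one; nothing is written near the "new" backup entry.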

> more hardlinks = worse seek performance.  this is not because of some
> technical limit, simply logistics of platter size and seek latency when
> data is spread around the disk.

        of course this has a performance hit, but not as much as you
        think...

        first, you already have concurrent backup processes, so the
        HD heads might not even be near the last copied/checked
        file

        second, using rsync with checksum caching, you really do
        few reads, especially compared with the time spent writing
        new files and waiting for the remote client to send the
        file. checking the cached checksum is just a btree lookup
        (at least in the recommended backuppc filesystems) and so
        is very fast

        you can check with iostat: the number of reads is a lot
        lower than the number of writes (unless, of course, your
        clients didn't change any files)
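
        if you want the raw counters that iostat works from, a minimal
        sketch for Linux follows. it assumes the usual /proc/diskstats
        layout (device name in field 3, reads completed in field 4,
        writes completed in field 8); the function names and the device
        name "sda" are only examples.

```python
import time


def parse_diskstats(text):
    """Map device name -> (reads completed, writes completed)."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 11:
            continue  # skip blank or truncated lines
        stats[fields[2]] = (int(fields[3]), int(fields[7]))
    return stats


def read_write_delta(before, after, device):
    """Reads and writes completed on `device` between two snapshots."""
    r0, w0 = before[device]
    r1, w1 = after[device]
    return r1 - r0, w1 - w0


def sample(device="sda", interval_s=10.0, path="/proc/diskstats"):
    """Take two snapshots `interval_s` apart and return the deltas."""
    with open(path) as f:
        before = parse_diskstats(f.read())
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_diskstats(f.read())
    return read_write_delta(before, after, device)
```

        run sample() on the pool disk while a backup is in progress and
        compare the two numbers it returns.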

cya
higuita
-- 
Naturally the common people don't want war... but after all it is the
leaders of a country who determine the policy, and it is always a 
simple matter to drag the people along, whether it is a democracy, or
a fascist dictatorship, or a parliament, or a communist dictatorship.
Voice or no voice, the people can always be brought to the bidding of
the leaders. That is easy. All you have to do is tell them they are 
being attacked, and denounce the pacifists for lack of patriotism and
exposing the country to danger.  It works the same in every country.
           -- Hermann Goering, Nazi and war criminal, 1883-1946
