BackupPC-users

Re: [BackupPC-users] Which FS? (was: Keeping servers in sync)

2009-09-05 21:06:25
Subject: Re: [BackupPC-users] Which FS? (was: Keeping servers in sync)
From: dan <dandenson AT gmail DOT com>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Sat, 5 Sep 2009 19:03:34 -0600
>> into a new directory, that obviously can't be done so you end up with a
>> lot of long seeks when you try to traverse directories picking up the
>> inode info.

I believe you are mistaken in this.  You're confusing directory entries
with inode entries.  When you hard link a file from one directory to
another you have two directory entries pointing to the same inode.
You can do a simple test by touching a file, then making a hard link
to a new file, and listing both with "ls -li".  You will see that both
files share the same inode number.
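
The same test can be scripted. Below is a minimal sketch in Python, assuming a filesystem that supports hard links; the function name and file names are made up for illustration and are not part of BackupPC.

```python
import os
import tempfile

def hard_link_demo():
    """Create a file plus a hard link to it and return (inode_a, inode_b, link_count)."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "original.txt")  # hypothetical file name
        linked = os.path.join(tmp, "linked.txt")      # hypothetical file name
        with open(original, "w") as fh:
            fh.write("backup data\n")
        # os.link adds a second directory entry that points at the same inode.
        os.link(original, linked)
        a = os.stat(original)
        b = os.stat(linked)
        return a.st_ino, b.st_ino, a.st_nlink

if __name__ == "__main__":
    ino_a, ino_b, nlink = hard_link_demo()
    print("inode of original:", ino_a)
    print("inode of link:    ", ino_b)
    print("link count:       ", nlink)
```

Both names should report the same inode number, and the link count should be 2, which matches what "ls -li" shows.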

I think that some people are agreeing on this but are not aware that they are saying the same thing.

Think of it like this.  You make one backup.  You write data, inode, and directory entry.  Now, 6 months later, you do another backup.  When this one goes to write data, it puts the file data on disk pretty far away from the inode and writes an entry to point to the inode.  Now the typical circumstance is that if you copy 1000 files during a backup, those files will likely be accessed somewhat in sequence when you want to retrieve the files or whatever.  The problem is that some of the files are physically located somewhere else on the disk due to the hardlinks.  This causes a long seek, because the data is not clustered together the way it would be if you didn't use hardlinks as is done with BackupPC.

Over time this gets more intense, as the inode and the data are located near the beginning of the disk for old files that are being heavily de-duplicated.  Performance-wise, it would be better to have backed up those files again and have their data and inodes clustered close together with the rest of the files that were backed up from that host, so that the disk head wouldn't have to continuously go back to the beginning of the disk.

More hardlinks = worse seek performance.  This is not because of some technical limit, simply the logistics of platter size and seek latency when data is spread around the disk.
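
To put rough numbers on that, here is a toy model in Python.  It is only a sketch under heavy assumptions: the disk is a line of numbered blocks, "seek cost" is just the distance the head travels between consecutive reads, old de-duplicated data sits in a region near the start of the disk, and new data is written contiguously further out.  All names and parameters (simulate, dedup_fraction, new_start, old_region) are invented for the illustration and say nothing about how any real filesystem allocates blocks.

```python
import random

def total_seek_distance(positions):
    """Sum of head movements when blocks are read in the given order."""
    return sum(abs(b - a) for a, b in zip(positions, positions[1:]))

def simulate(n_files=1000, dedup_fraction=0.5,
             new_start=900_000, old_region=100_000, seed=0):
    """Toy model: read n_files in backup order and return the total seek distance.

    A file is treated as hard-linked to old data with probability
    dedup_fraction; its data then lives somewhere in the old region near
    the start of the disk.  Every other file sits in a contiguous run
    beginning at new_start.
    """
    rng = random.Random(seed)
    positions = []
    for i in range(n_files):
        if rng.random() < dedup_fraction:
            # Hard-linked file: data was written by an old backup.
            positions.append(rng.randrange(old_region))
        else:
            # Freshly written file: clustered with the rest of this backup.
            positions.append(new_start + i)
    return total_seek_distance(positions)

if __name__ == "__main__":
    for fraction in (0.0, 0.1, 0.5):
        print(fraction, simulate(dedup_fraction=fraction))
```

With no hard-linked files the head only steps from one block to the next.  As soon as part of the file set points back at old data, the total distance in this model jumps by several orders of magnitude, because almost every switch between old and new data crosses most of the disk.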
_______________________________________________
BackupPC-users mailing list
BackupPC-users AT lists.sourceforge DOT net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/