On 11/03/2011, at 14:59, Jeffrey J. Kosowsky wrote:
> Cesar Kawar wrote at about 10:08:10 +0100 on Friday, March 11, 2011:
>>
>>
>> On 11/03/2011, at 08:04, hansbkk AT gmail DOT com wrote:
>>
>>> On Fri, Mar 11, 2011 at 10:56 AM, Rob Poe <rob AT poeweb DOT com> wrote:
>>>> I'm using rsync to do backups of 2 BPC servers. It works swimmingly: you
>>>> plug the USB drive into the BPC server, and it auto-mounts, emails that it's
>>>> starting, does an rsync dump (with delete), flushes the buffers, dismounts,
>>>> and emails.
>>>
>>> Sounds great Rob, would you be willing to post the script?
>>>
>>> Rsync'ing is all fine and good until your hardlinked filesystem (I
>>> don't know the proper term for it, as opposed to the "pool") gets "too
>>> big". It's a RAM issue, and an unavoidable consequence of rsync's
>>> architecture. I'm not faulting rsync, mind you; the kind of filesystem
>>> that BPC (and rdiff/rsnapshot etc.) builds over time is a pretty
>>> extreme outlier case.
>>>
>> That is not a problem anymore with the latest versions of rsync. I've been
>> using this technique for a year now with a cpool of almost 1 TB with no
>> problems.
>>
>> Don't expect it to run on a Celeron machine, as it requires a big processor.
>> Rsyncing 1 TB of compressed, hardlinked data to a new filesystem is a very
>> CPU-intensive task. But it does not leak memory as before. You can rely on
>> rsync to maintain a USB disk for off-site backups.
>
> I think rsync uses little if any CPU -- after all, it doesn't do much
> other than delta file comparisons and some md4/md5 checksums. All of
> that is much more rate-limited by network bandwidth and disk I/O.
Not at all. Essentially, rsync was designed for exactly the opposite goals of
the ones you mention: rsync is bandwidth-friendly, but it is very CPU
expensive. The amount of memory needed matters much less than the CPU
needed. Again, from the rsync FAQ page:
"Rsync needs about 100 bytes to store all the relevant information for
one file, so (for example) a run with 800,000 files would consume about 80M of
memory. -H and --delete increase the memory usage further."
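The FAQ's arithmetic is easy to verify; a quick sanity check of the 100-bytes-per-file figure:

```shell
# 100 bytes of per-file metadata, for an 800,000-file run (rsync FAQ figures)
bytes=$((100 * 800000))
mb=$((bytes / 1000000))
echo "${mb} MB"        # prints "80 MB"
```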
My Firefox needs about double that memory just to open www.google.com.
I know that figure is "only" for processing 800,000 files, but with version
3.0.0 and later, rsync doesn't load the whole file list at once, so a 512 MB
machine will be fine. In the particular installation I described before,
however -- 1 TB of data comprising one year of history, which means a really
big number of hardlinks per file -- the syncing process runs at almost 100%
CPU on an Intel Xeon quad core for about 2 hours.
rsync is a really CPU-expensive process. You can always use caching for the
md5 checksum step, but I wouldn't recommend that on an off-site replicated
backup. Caching introduces a small probability of losing data, and that
technique is already used when doing a normal BackupPC backup with the rsync
transfer method; so if you then resync that data to another drive, disk, or
filesystem of any kind, your probability of losing data compounds on top of
the original one. Not recommended, I think. I prefer to spend a little more
money on the machine once and not have surprises later on when the big boss
asks you to recover his files....
I don't have graphs, but the amount of memory available in any recent
computer is more than enough for rsync. Disk I/O is somewhat important, and
disk bandwidth is a constraint, but CPU speed is the most important factor in
my tests.
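For concreteness, the expensive hardlink preservation being discussed is rsync's -H option. A minimal demonstration (assuming rsync and GNU stat are installed) that two hardlinked files come out of the copy still sharing one inode, the way pool files must:

```shell
# Demonstrates -H: without it, hardlinked pool files would be duplicated
# in the copy instead of staying linked. Assumes rsync and GNU stat exist.
src=$(mktemp -d); dst=$(mktemp -d)
echo data > "$src/a"
ln "$src/a" "$src/b"              # a and b now share one inode, like pool files
rsync -aH "$src/" "$dst/"         # -a preserve attributes, -H preserve hardlinks
# a and b in the copy should report the same inode number
stat -c '%i' "$dst/a" "$dst/b"
```

Tracking which source inodes map to which destination files is exactly the bookkeeping that makes -H costly on a pool with millions of links.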
>
> I was under the impression that the slowdown is due to the need to
> build (and check) lists of hardlinks, which is memory
> constrained. Maybe when the list gets really long, CPU power is needed
> to build/sort/look up the list, but I would think that if rsync were
> written well, this again would not be the rate-limiting issue.
>
> Would be interesting for someone to graph performance vs. amount of
> memory and vs. cpu power/speed.
>
> ------------------------------------------------------------------------------
> Colocation vs. Managed Hosting
> A question and answer guide to determining the best fit
> for your organization - today and in the future.
> http://p.sf.net/sfu/internap-sfd2d
> _______________________________________________
> BackupPC-users mailing list
> BackupPC-users AT lists.sourceforge DOT net
> List: https://lists.sourceforge.net/lists/listinfo/backuppc-users
> Wiki: http://backuppc.wiki.sourceforge.net
> Project: http://backuppc.sourceforge.net/