BackupPC-users

[BackupPC-users] Noted Observations & Complaints Using BackupPC for 5 mon

2010-04-22 19:58:30
Subject: [BackupPC-users] Noted Observations & Complaints Using BackupPC for 5 mon
From: Saturn2888 <backuppc-forum AT backupcentral DOT com>
To: backuppc-users AT lists.sourceforge DOT net
Date: Thu, 22 Apr 2010 19:56:42 -0400
@Tyler J. Wagner
If it's confirmed that Rsync works better than Rsyncd, I'll switch to it. In my
experience, transferring over SSH is only a bit slower than FTP, so it should
work just as well. I've also tried adding the -z option to Rsync transfers
outside of BackupPC and noticed that, with -z enabled, throughput was capped at
about 11MB/s for most of the machines here, on top of the higher load on the
machines doing the transfers. I've gotten close to 50MB/s on my network doing
Rsync transfers with --progress so I could watch. I understand Rsync isn't
that fast, but I've had a lot of luck with it being speedy so long as I'm not
transferring hundreds of gigabytes of files at a time.

Rsync is definitely slowest when --progress is on and you're going through the 
X:\Windows folder. In Windows 7, I used the Resource Monitor to watch each file
rsync.exe touched, and it spent the majority of its time in X:\Windows. I'm
thinking this is more of an issue with Ethernet and packet headers than Rsync 
though.

One of the things I forgot to note earlier is the Rsyncd timeout. I could turn
off a machine and come back 2 hours later to see BackupPC still trying to back
it up. I don't understand why the timeout is so long that BackupPC keeps
thinking the backup is working properly, wasting time I could've used to back
up another host which actually was online.

The problem with SSH Rsync configurations is the public/private keys. I have
always seemed to have problems getting those set up. I haven't had those
problems recently, so I can try that method, but I've never done it on Windows
machines and will probably have great difficulty doing so. DeltaCopy has an
ssh.exe, but I believe that is only a client, not a server. I do have PuTTY on
some of the machines if that's helpful at all.


@Sorin Srbu
I tested each and every new DLL and EXE file one by one until I found
combinations that worked and didn't crash DeltaCopy. Rsync 3.0.7-1 doesn't
work; you need 3.0.6-1. The only files I was successful in updating were
chmod.exe, rsync.exe, ssh.exe, and cygwin1.dll. The other files would cause
crashes, and I don't know enough about Cygwin to figure out anything else. I
did these DLL/EXE tests using Windows Vista 64-bit and Windows 7 32-bit.

I'm assuming the only reason you get over 14MB/s is because of your processor 
as Rsync, from all of my observations, is limited by the processor.


@Ryan Manikowski
Heh, Ryan, I know :P. I love BackupPC; if I didn't, I wouldn't be here talking 
about it trying to get more information and publicizing my experiences with it. 
Really, I'd just find something else if I didn't care to use it anymore or 
didn't understand the constraints it was under. Still, I could not call what I 
posted anything more than observations and complaints. If you experienced those 
problems and spent all that time testing and trying out so many different 
configurations, I'm sure you'd be complaining as well. Maybe these aren't so
much complaints about the software as complaints stemming from my lack of
understanding of how all these separate parts fit together. Still, I'm trying
really hard to learn everything I can; higher-level software is intended to
alleviate that stress, not heighten it.

I cannot code, therefore I cannot give back in that way. I have proposed
changes and asked about others before, and even e-mailed Chris praise for the
project he headed. The only thing more I can do is write this post and answer
everyone's questions for me. Please do not misunderstand my purpose.


@Les Mikesell
I thought this was the mailing list. I've seen things frequently posted in both 
areas with replies in both areas. I'm assuming that you mean the mailing list 
posts to the forums whereas it's not the forum that posts to the mailing list.

I apologize, but I do not know how to join or use a mailing list. This is one
of those cases where it's not so easy to find instructions on doing so. To me,
the mailing list is like an elite corps I'm unable to join because I have no
information on it, what it is, or how to gain information on it.

RAID3 came with the card and was the only configuration for 3 drives. It's a
version of RAID5 that performs better under a fault: if one drive dies, you're
running a RAID0, whereas in RAID5 your performance becomes sluggish from all
the mixed parity bits.

RAID5 /is/ slow for write speed, but it can't be less than the speed of 1 drive,
right? When I dd'd the data from the RAID5 to the single drive, the first time
it went at ~12MB/s, but I had initiated the transfer incorrectly. The second
time, I partitioned the new drive first and then did the dd. From there I was
getting 75MB/s or more. I'm assuming this was the continuous write speed of the
new drive limiting the read speed of the RAID5. If the RAID5 could read at
75MB/s or better, Rsync should've been very fast, as it only needed to read and
compare files since almost everything I have was already backed up and in the
pool.

I have 2GB of memory in this machine and enabled commit=60 which may have 
actually been a cause for it slowing down, I do not know. I've been meaning to 
upgrade or change over this rig to an Atom D510 or N330, but I cannot find one 
with more than 2 slots for memory leading me to believe it might be best to 
keep it as-is. Would HyperThreading actually make it work faster at all even 
though most of it is I/O? Or is Rsync that big of a culprit? Should I look
around for deals on memory and get my system up to at least 4GB of RAM? Here is 
/etc/fstab:

# <file system> <mount point>   <type>  <options>       <dump>  <pass>
proc            /proc           proc    defaults        0       0
# /
/dev/sda1 / ext3 noatime,errors=remount-ro,usrquota,grpquota,data=writeback,commit=60 0 1
# swap
/dev/sda5 none swap sw 0 0

It used to be that my swap was even on another drive entirely which I can do 
again if you guys suggest that for speed reasons.

I'm unable to read iostat output and would need instructions to do so. This is
why I use iotop, as it makes sense to me. I don't understand why it's faster to
transfer all the content again than to use Rsync to see if files need to be
redownloaded. Elapsed time is what I'm having the most problems with, not the
download speed itself; it could show 0.0000001MB/s and I'd be happy, albeit a
little concerned, if it only took 5 minutes.

Windows permission locking? I'm sure that's wrong. I could hardly assume that 
C:/Program Files (x86)/Pidgin would be locked by the OS. I can't see any reason 
not to back that folder up as I believe I'm able to read it regardless of the 
files being in use or not. The XferLOG files are also very large and
painstaking to navigate. I hope you have better tools to recommend than
Notepad++ or another text editor with search functionality. It might help to
understand regular expressions to assist in my search, but I don't think it
should really be that complicated to find this issue. What I mean is, I had a
C: and G: drive. The G: drive was almost never in use, and it was months before
I ever changed data on it, yet part-way through backing it up, Samba would just
report "error" for all the files in the XferLOG and none of them would
transfer. I tried a few things and was unable to get it to back them up fully,
in either the fulls or the incrementals. Were it not specifically for this, I
would not have moved to Rsyncd.

I figured incrementals might take longer because of that. Are you suggesting I
do full backups each night instead of incrementals, or should I use incremental
levels 0 to 6 instead of, for example, 0 to 60?

The logs are showing me all folder creations along with files. It's difficult
to tell "create d" apart from "create  ". The error log also doesn't show which
files were newly created, which could assist me in figuring out which files are
actually taking up a gigabyte's worth of backups that night.
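
A small script can do the splitting that a text editor makes tedious. The
sketch below is only a rough illustration. It assumes the XferLOG has already
been decompressed to plain text, and that directory entries begin with
"create d" while other newly created entries begin with "create" followed by
spaces, as described above. The function names and everything else about the
line layout are assumptions, not BackupPC's documented format.

```python
import re
from collections import Counter

# Assumed layout: a directory line begins with "create d"; any other newly
# created entry begins with "create" followed by whitespace.
DIR_LINE = re.compile(r"^\s*create d\s")
NEW_LINE = re.compile(r"^\s*create\s")


def split_creates(lines):
    """Return (directories, files) lists of 'create' lines from a log."""
    directories, files = [], []
    for line in lines:
        if DIR_LINE.match(line):
            directories.append(line.rstrip("\n"))
        elif NEW_LINE.match(line):
            files.append(line.rstrip("\n"))
    return directories, files


def count_actions(lines):
    """Tally non-blank lines by their first word to see what dominates."""
    return Counter(line.split()[0] for line in lines if line.strip())
```

Opening the decompressed log with open(path, errors="replace") and passing the
file object to split_creates would leave only the newly created files to look
through.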

For a while, I saw certain, but not all, excluded files appearing in backups.
After much time and deliberation, I went to #rsync and man rsync to find out
everything that was required for the exclude list. From this, I made this
fairly organized list of global excludes for both Windows and Linux machines.

$Conf{BackupFilesExclude} = {
  '*' => [
    '*.tmp',
    'tmp/',
    'temp/',
    'Temp/',
    'Temporary Internet Files/',
    '/dev',
    '/media',
    '/mnt',
    '/proc',
    '/sys',
    '/var/lib/backuppc',
    'pagefile.sys',
    'hiberfil.sys',
    'RECYCLER',
    '$RECYCLE.BIN',
    '$Recycle.Bin',
    'desktop.ini',
    'Thumbs.db',
    'thumbcache_*.db',
    'IconCache.db',
    '*.edb',
    '/Windows/Prefetch',
    '/home/dan/excludes',
    '/home/ddr/logs/',
    '/home/sokg/logs/',
    '/home/saturn2888/logs/',
    '/umkcddr.com/extg',
    '/Program Files (x86)/Electronic Arts/Burnout(TM) Paradise The Ultimate Box/*',
    '/Folders/Learn Japanese',
    '/Folders/zhid-e'
  ]
};

I've already factored in growing databases, and that would explain issues on
Linux servers, but definitely not on my Windows systems, especially if indexing
is disabled. Only one computer is running an e-mail program and those files are
not part of BackupPC's backups.

I don't think my directory trees are getting that much larger. In fact, I'm 
someone who tries to stay organized so I often get rid of and simplify things 
constantly. The only directory that ever changes on my laptop or netbook is the 
Downloads folder as it may be the case I add another directory there. I even 
copy-back my Pidgin IM logs to my main rig when they begin to grow to increase 
the speed of backups but that doesn't seem to have worked at all.

What is --checksum-seed=32761?

Yes, running more than 2 seems to be slower, but not back when I was using 
Samba. Since I'm not downloading many files, wouldn't you believe then that 
it'd be okay to run more at the same time? I have had the best experiences 
running 4 at the same time because if I limit it to 2, it might be an entire 
day before even finishing the previous day's backups whereas that somehow does 
not happen with 4 concurrent. I'd also like to think my RAID5 was well-tuned, 
but I cannot be sure.


@Jeffrey J. Kosowsky
Where may I submit such bug reports? I remember reading somewhere that
SourceForge was, in fact, not the place to do that, or was it that the
SourceForge copy of the mailing list is usually old? How do I know these are
really bugs at all? I'm here to find answers, and if my observations are
showing bugs, I will gladly report them.

Was my whole introduction before the first 01. enough of an explanation of
all my trials and tribulations of troubleshooting? Look at all I have tested.
Entire disk arrays, modifying fstab, upgrading, changing switches and Ethernet 
cables, enabling/disabling HyperThreading, modifying many configuration options 
in all sorts of different capacities, testing wireless and wired machines, and 
quite a bit more. What more do you want? I have tested naked speeds; this is
why I believe there was something worth posting about. The one suggestion you 
noted I do not have is to try smaller backups instead of larger ones, but I'd 
say, by backing up my websites on an extremely constrained connection in 
comparison to my LAN, I have sufficiently done that as well, only not on my 
local machines. I may try backing up only one folder of a host I'm already
backing up to compare the speeds. I appreciate your suggestions, but this is
the only one I don't seem to have done already.

I have plenty of ideas for how to fix the logs. Color-coding changes like SVN, 
sectioning, and a bunch of other things I cannot remember at the moment. I 
would gladly contribute these if I knew where to do so. Please do not 
misunderstand my purpose. I believe I've tried to do my part in paying back 
with my support (as in this post), help of others, and constant troubleshooting 
to make sure I'm not the one at fault so I don't waste anyone's time.


@Carl Wilhelm Soderstrom
I've actually experimented a lot with the incremental levels. I once even had
a strange 1, 2, 3, 2, 3, 4, 5, 3, 4, 5, 6, 4, 5, 6, 7, 6, 7, 6, 7, 8 arrangement
for the sake of testing. That arrangement got replaced by my most recent 1 to
61, which may or may not actually be slower than 1 to 6. It seems incredibly
fast for machines WAN-side, but not at all for machines LAN-side. I even set up
a VPN at a friend's house and have been backing up his Windows Vista 64-bit
system using Rsyncd over VPN without speed issues. His upload speed closely
matches my download speed, so that may also affect it some, as well as the fact
that what I'm backing up is a small group of folders on his computer rather
than the entire operating system.
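
To make the effect of such an arrangement concrete, here is a minimal sketch
of how the levels chain together. It assumes the rule given in the BackupPC
documentation for incremental levels: a level N incremental backs up everything
changed since the most recent backup of a lower level, with the full counted as
level 0. The helpers reference_backups and chain_length are hypothetical names
for illustration only and are not part of BackupPC.

```python
def reference_backups(levels):
    """Map each incremental to the backup it is compared against.

    Position 0 is the full backup (level 0); the incrementals in `levels`
    (each level must be 1 or higher) occupy positions 1, 2, 3, ... in the
    order they run.
    """
    history = [0] + list(levels)
    refs = []
    for pos in range(1, len(history)):
        # Walk backwards to the most recent backup with a strictly lower level.
        ref = next(p for p in range(pos - 1, -1, -1) if history[p] < history[pos])
        refs.append(ref)
    return refs


def chain_length(levels):
    """Number of earlier backups the last incremental depends on."""
    refs = reference_backups(levels)
    pos, hops = len(levels), 0
    while pos > 0:
        pos = refs[pos - 1]
        hops += 1
    return hops
```

With levels 1 to 6, each incremental depends on the one before it, so
chain_length(range(1, 7)) is 6, while a flat run of level 1 incrementals always
refers straight back to the full.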

Scroll up to my reply to Les Mikesell, where I have posted my fstab. Maybe you
can help me configure it correctly.

What were you referring to with your post on the ssh compression ratio? I don't 
understand this part at all. I do use Rsync over SSH but only for some websites 
and with those, I don't have speed issues. Can you clarify where these settings 
are and for what reason I would want to change them? I might find these 
settings very helpful.

+----------------------------------------------------------------------
|This was sent by Saturn2888 AT gmail DOT com via Backup Central.
|Forward SPAM to abuse AT backupcentral DOT com.
+----------------------------------------------------------------------



------------------------------------------------------------------------------
_______________________________________________
BackupPC-users mailing list
BackupPC-users AT lists.sourceforge DOT net
List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
Wiki:    http://backuppc.wiki.sourceforge.net
Project: http://backuppc.sourceforge.net/
