Subject: Re: [BackupPC-users] Help before giving up on BackupPC
From: Adam Goryachev <mailinglists AT websitemanagers.com DOT au>
To: backuppc-users AT lists.sourceforge DOT net
Date: Fri, 16 May 2014 11:29:13 +1000
On 16/05/14 10:40, Marco Nicolayevsky wrote:

Tim,

 

First off, thank you very much for your insightful and helpful approach, despite the limited information I provided. And thank you as well for being the "unpaid" mechanic (good analogy). I will try to present a detailed account. Please see my responses to your questions below:

 

Environment:

- Server (gigabit wired Ethernet on a managed switch)
  - running under VMware ESX 5.5
  - VM allocated 4 GB RAM, all procs/cores, and max CPU/memory bursts
  - Intel i5-3330 3 GHz
  - 24 GB RAM
  - backup location: 5-disk hardware RAID 5 under CentOS 6.5
    - backup available space: 6 TB
- Windows workstation client (gigabit wired Ethernet on a managed switch)
  - Intel i7-4770K CPU 3.5 GHz
  - 32 GB RAM
  - 2 internal logical drives
    - C: HW RAID 0 comprising two 120 GB SSDs (238 GB capacity)
      - 89.5 GB used (162,337 files)
    - D: HW RAID 5 consisting of 4 x 2 TB drives (5589 GB capacity)
      - 1.5 TB used (213,594 files)


I'm assuming both machines are running under VMware, but are they both on the same physical box? If so, then you will have contention for CPU and HDDs. In any case, bandwidth is unlikely to be the issue.

 

I have not completed a full backup of this workstation yet. I have done many incrementals, removing exclusions as I go, in an attempt to incrementally back up the entire computer (drives C & D) in steps.

I will disable compression and re-attempt, although I will not be able to report the results for xx days until it's complete.


Personally, I'd suggest running a full backup and just keeping an eye on the log files to ensure it is making progress. As long as it makes progress, wait as long as needed for completion. Once completed, run a 2nd, and then a 3rd. Now look at the time the 3rd backup took and decide whether that will work for you. (Make sure you have enabled checksum-seed/caching.)
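In case it helps, checksum caching in BackupPC 3.x with the rsync/rsyncd XferMethod is enabled by adding --checksum-seed=32761 (the seed value the BackupPC docs use) to the rsync argument lists. A minimal sketch for config.pl; the path may differ on your install:

    # in /etc/BackupPC/config.pl (or a per-host override), after the
    # stock definitions: append the seed to both transfer directions
    push(@{$Conf{RsyncArgs}},        '--checksum-seed=32761');
    push(@{$Conf{RsyncRestoreArgs}}, '--checksum-seed=32761');

The cache only pays off fully from the 3rd full backup onwards, which is why the advice above is to time that one.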

If it won't work for you, then you will need to find and fix the bottleneck. At its core, BackupPC is using rsync and maybe compression (CPU), building a list of all files and doing some work in memory (RAM), and obviously reading all your files (yes, a full backup will read every byte of every file) and writing the changed files (disk I/O). In addition, while all those files are being read, you might want to determine whether your anti-virus solution is slowing things down, and configure it as desired (i.e. do you want to re-scan the entire dataset during every backup, or can you exclude the backup process (rsyncd) from the antivirus?).
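A quick way to narrow down which resource it is, is to watch the server live while a backup runs. A rough sketch, assuming the CentOS box has the sysstat package installed (iostat comes from it):

    iostat -x 5    # %util near 100 on the pool device => disk-bound
    free -m        # little free/cached memory plus swap in use => RAM-bound
    vmstat 5       # a high 'wa' column => CPUs mostly waiting on I/O

Run the same checks on the client side if the server looks idle.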

So I guess compression is out. I will disable. Thank you for the info.

Compression should only be an issue after the initial backup when:
1) there are a lot of changed files, or
2) there are changes to large files (e.g. pst files or disk images).

Unchanged files will not require de-compression or re-compression (provided you have enabled checksum caching and have made more than 2 full backups).
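For what it's worth, if you do disable compression to test, that is a one-line change in config.pl, and it only affects files backed up from then on (the existing compressed pool is left alone):

    # 0 disables compression; 3 is the usual default when
    # Compress::Zlib is available
    $Conf{CompressLevel} = 0;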

> Also, those are fulls.  Incrementals take 4 hours and 1 hour, respectively.

> Is rsync “really” that slow?

> 

> No:  rsync is only going to be marginally slower than a non-rsync copy, even on the first time, assuming you're not limited by something else (CPU or RAM) that would not be a limit for a normal copy.

> 

> That could be related to the number of files: that's an area where rsync can get tripped up. As you can see, I've got >1 million files, so the definition of "too many" is pretty big. But if you had, say, 10M files, maybe that's an issue to consider.

 

I don't have anywhere near a million files, so I guess my workstation should be quicker. I suspect VMware imposes *some* penalty, but it should not be that great. Regardless, it's what I have, so it has to work.

Actually, rsync will be slower than other copy methods, but usually not by much. rsync tries to be clever, so it uses more CPU; sometimes the network is faster than the CPU can keep up with, so rsync ends up slower overall. In any case, it is a reminder to ensure you have sufficient CPU resources for the job you are doing. Also, remember that rsync is single-threaded, so if you have one core at 100% and another 100 cores at 0%, you are still CPU-limited.
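That situation is easy to spot while a backup runs, assuming a reasonably stock top:

    top    # press '1' to toggle the per-core display; one core pinned
           # at 100% while the rest sit idle means you are CPU-bound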

> Another option would be to expand your testing from a very small section to something larger: say, 100GB. That is big enough to be somewhat representative of the whole, but should be able to complete quickly enough even with compression, encryption, etc. to get some baseline numbers to work with, including both the *first* and *additional* full backups. That way, you might find that the initial backup will take a week, but each additional backup after that will only take 12 hours and you're OK with that. Or, you might find that things are still broken, but now it won't cost you a week of your life every time you want to test.

GREAT idea. Before I embark on a full backup, I'll isolate ONE folder with some large files (around 100 GB) and focus on running some tests on that with/without compression. I will report my findings.


Yes, or even 100MB. Just change rsyncd.conf so the "share" points to a smaller folder. Once you have that working, you can alter rsyncd.conf as required, gradually moving up to higher-level folders (larger data sets). Remember to only look at the time for the 3rd full backup after every change.
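For example, the rsyncd.conf on the Windows client might look something like this; the module name "testshare" and the paths are made up for illustration (cwRsync-style cygdrive paths):

    use chroot = false
    strict modes = false

    [testshare]
        path = /cygdrive/c/TestFolder
        read only = true
        auth users = backuppc
        secrets file = /cygdrive/c/rsyncd/rsyncd.secrets

Then set $Conf{RsyncShareName} = 'testshare'; for that host in BackupPC, run three fulls, and compare timings before widening the path.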

Hope this is of some assistance.

Regards,
Adam

--
Adam Goryachev Website Managers www.websitemanagers.com.au