Re: [Bacula-users] Job transfer rate

2014-10-30 11:55:53
From: Bryn Hughes <linux AT nashira DOT ca>
To: bacula-users AT lists.sourceforge DOT net
Date: Thu, 30 Oct 2014 08:53:07 -0700
On 14-10-30 08:27 AM, Jeff MacDonald wrote:
>> On Oct 30, 2014, at 12:17 PM, Bryn Hughes <linux AT nashira DOT ca> wrote:
>> The job report rate will be the final average rate of the job, it doesn't 
>> know/specify the difference between the 'input' rate and the 'output' rate.
>>
>> Yep, you're going to need to do some investigation on the storage side of 
>> the VM machine you are backing up, the director itself, the storage daemon 
>> itself (though I'm guessing it is on the same system as the director for 
>> you) and the final storage.
>>
>> Also, it's not quite clear from your description: is the final storage on a 
>> different NAS altogether from your VMs? (hoping so!)  What virtualization 
>> platform are you running?
>>
>> Finally the question about attribute spooling is a big one - if you are 
>> backing up a lot of small files and you do not have attribute spooling 
>> turned on, you will have abysmal performance especially if the director is 
>> running on the same disks that you are backing up.
>>
>> Database writes are (almost) always synchronous writes, meaning the system 
>> will stop and wait for the storage layer to say "yes the data is ACTUALLY 
>> committed to disk" before proceeding.  If you are seeking all over the disk 
>> backing up a bunch of small files, then trying to do a whole ton of tiny DB 
>> writes at the same time to the same spindles, your hard drive heads are 
>> going to be flying around like crazy.  An array of 7200 RPM disks in any 
>> sort of parity RAID configuration will not be able to handle more than 
>> 50-90 random IOPS (I/O operations per second) at best in real life, with a 
>> DB write or a file read each counting as one I/O.  If you are backing up 
>> lots of small files randomly distributed around the storage you are quite 
>> likely hitting an IOPS wall - one I/O to read the file and one I/O to write 
>> the DB record means not more than 25-45 files per second.  With 4 KB files 
>> that is 100-180 KB/sec and a completely maxed out storage layer.
>>
>> Even WITH attribute spooling enabled you are still going to be in a 
>> less-than-ideal position since the spooled attributes still need to be 
>> written to the same spindles with the hardware configuration you've 
>> described.
>>
>> Bryn
>>
> This was really helpful and basically just answered all of my questions 
> without having to investigate the actual setup very much.
>
> I’m using VMware for my virt platform. Bacula and its Postgres database live 
> on the same disks that they are backing up (which is local storage) and data 
> is sent off to a remote NAS via GigE.
>
> My guess is that it's an IOPS wall like you mentioned. It's running a bunch of 
> VMs that are under heavy usage by the staff.
>
> Making a stronger and stronger argument for me to recommend a dedicated Bacula 
> appliance. 16 gigs of RAM, 4 cores, 1 TB of 7200 RPM disk for Postgres and a 
> tape drive :)
>
> jeff.
>
Just be aware that you might not see a dramatic increase in speed just 
by moving Bacula itself!

If you are using VMware with VMDK files on a VMFS volume you need to be 
aware that VMFS metadata operations (creating or growing a snapshot, 
growing a thin-provisioned disk, powering a guest on or off) take a 
reservation of the entire VMFS volume unless the storage supports 
hardware-assisted locking.  That locking happens at the SCSI layer, so 
other hosts sharing the volume have to wait and retry until the 
reservation is released.  On top of that, every guest on the volume is 
sharing the same spindles.  Remembering that you are probably only going 
to get around 75 IOPS, you can see how a VMFS volume with more than a 
handful of virtual machines on it can very quickly end up performing 
very poorly, especially with spinning rust underneath it.  A good RAID 
card with a LOT of cache memory can help with overall system 
performance, but backups by definition are going to be touching lots of 
areas of data that aren't likely to be in cache.
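
To put the roughly 75 IOPS figure in perspective, here is a 
back-of-the-envelope sketch of the arithmetic from my earlier message.  
It assumes two I/Os per file (one read plus one catalog write) and 4 KB 
files; the function names are just for illustration and the numbers are 
rough estimates, not measurements.

```python
def files_per_second(iops, ios_per_file=2):
    """Files backed up per second if every file costs `ios_per_file` random I/Os."""
    return iops / ios_per_file


def throughput_kb_per_second(iops, file_size_kb=4, ios_per_file=2):
    """Approximate backup rate in KB/s for small files of `file_size_kb` each."""
    return files_per_second(iops, ios_per_file) * file_size_kb


# Low, middle and high estimates for a 7200 RPM parity RAID array.
for iops in (50, 75, 90):
    print(f"{iops} IOPS -> {files_per_second(iops):.1f} files/s, "
          f"{throughput_kb_per_second(iops):.0f} KB/s")
```

With attribute spooling the catalog write no longer competes with each 
file read during the backup itself, which is roughly what passing 
ios_per_file=1 would model.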

What I'm getting at is you might actually need to focus your efforts and 
dollars on the storage underneath your VMs before you do too much with 
your backup system.  A great big nice happy dedicated Bacula server 
would be nice, but if the VMs are still IOPS constrained, ESPECIALLY if 
they are actively in use while being backed up, you probably won't see 
that much of an improvement.

An easy way to validate this would be to ensure you have attribute 
spooling turned on and to set up the attribute spooling to write to your 
NAS rather than to local storage.  That will get the VM storage 
infrastructure out of your backup pathway.
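
A rough sketch of what that test could look like in the config files.  
As far as I know the storage daemon writes spooled attributes into its 
Working Directory, so pointing that at a NAS mount is one way to do it; 
note that it moves the SD's other working files too.  The resource names 
and the mount path below are made up for illustration, and you should 
check the directives against the manual for your Bacula version.

```
# bacula-dir.conf (fragment): add to the Job or JobDefs you already use
Job {
  # hypothetical job name
  Name = "vm-backup"
  Spool Attributes = yes
  # ... the rest of your existing Job directives stay as they are
}

# bacula-sd.conf (fragment)
Storage {
  # hypothetical daemon name
  Name = "backup-sd"
  # hypothetical NAS mount point; must be writable by the storage daemon
  Working Directory = "/mnt/nas/bacula-working"
  # ... the rest of your existing Storage directives stay as they are
}
```

Restart the storage daemon and reload the director after the change, 
then compare the rate reported for the next job against the old one.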

Bryn

