Bacula-users

Re: [Bacula-users] Multiple CPU core support, 100% usage while restore

2011-01-04 09:19:19
From: "Dan Langille" <dan AT langille DOT org>
To: "Tom Sommer" <mail AT tomsommer DOT dk>
Date: Tue, 4 Jan 2011 09:14:08 -0500
On Tue, January 4, 2011 8:47 am, Tom Sommer wrote:
> On Tue, January 4, 2011 13:55, Tom Sommer wrote:
>> On Tue, January 4, 2011 12:59, Dan Langille wrote:
>>
>>> On 1/4/2011 5:00 AM, Tom Sommer wrote:
>>>
>>>
>>>> On Tue, January 4, 2011 03:15, Dan Langille wrote:
>>>>
>>>>
>>>>> On 1/3/2011 12:57 PM, Tom Sommer wrote:
>>>>>
>>>>>
>>>>>
>>>>>> I'm currently restoring 1.5 million files and it's taking forever.
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> bacula-sd is using 100% CPU, disk IO is apparently low, so I
>>>>>> assume it's a CPU issue.
>>>>>>
>>>>>> My machine has 16GB RAM and 2 CPUs:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> top - 18:54:53 up 75 days, 10:16,  3 users,  load average: 1.00, 1.00, 1.00
>>>>>> Tasks: 172 total,   1 running, 171 sleeping,   0 stopped,   0 zombie
>>>>>> Cpu0  : 85.3%us,  1.0%sy,  0.0%ni, 12.7%id,  0.0%wa,  0.0%hi,  1.0%si,  0.0%st
>>>>>> Cpu1  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
>>>>>> Cpu2  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
>>>>>> Cpu3  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
>>>>>> Cpu4  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
>>>>>> Cpu5  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
>>>>>> Cpu6  :  0.0%us,  0.0%sy,  0.0%ni,100.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
>>>>>> Cpu7  : 11.9%us,  0.0%sy,  0.0%ni, 88.1%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
>>>>>> Mem:  16429812k total, 16345744k used,    84068k free,     1272k buffers
>>>>>> Swap:  3124632k total,      184k used,  3124448k free,  5690688k cached
>>>>>>
>>>>>>  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>>>>> 4014 root      18   0  109m  19m 1264 S 100.3  0.1 135:07.23 bacula-sd
>>>>>>
>>>>>>
>>>>>>
>>>>>> As you can see, the process is using 100% CPU on one core. Is
>>>>>> there any way to make Bacula use all cores, or any other way to
>>>>>> speed up the restore? By the looks of it, it could take days to
>>>>>> restore the data.
>>>>>
>>>>> What stage is the restore at?  Is it building the file
>>>>> tree? Has the restore started?
>>>>>
>>>>
>>>> It's running - sending the files to the server.
>>>>
>>>>
>>>>
>>>> 2 days in, it's completed 50GB of a total of ~220GB.
>>>>
>>>>
>>>>
>>>> Not really that impressive.
>>>>
>>>>
>>>>
>>>> My files are stored in blocks of 10GB.
>>>>
>>>>
>>>
>>> Sounds like spooling or database index issues.  I'm guessing you are
>>> using MySQL and there are no spool options on your job / bacula-sd.
>>>
>>> The database index issues have been discussed previously on this list.
>>> You should be able to find them.  Look at that first before thinking
>>> about spooling.
>>
>> I'm thinking it might be due to compression? Does that make sense?
>
> I ran strace -f -p [PID] on the process, resulting in a flood of:
>
> [pid  4185] write(5, "\0\0\0%rechdr 8742 1287561532 32497"..., 41) = 41
> [pid  4185] write(5, "\0\0u`x\234\214\275ko\344\332\262$\366]\277\202\200\321\200\rB}vK\352\327G\333sa"..., 30052) = 30052
> [pid  4185] write(5, "\0\0\0%rechdr 8742 1287561532 32497"..., 41) = 41
> [pid  4185] write(5, "\0\0u\334x\234l\275\333\226\333\310\316$|\357\247\320\\\370Rn\273N\266\237\346_I2E\246"..., 30176) = 30176
> [pid  4185] write(5, "\0\0\0%rechdr 8742 1287561532 32497"..., 41) = 41
>
> [pid  4185] read(6, "\202\33\311\312\0\0\374\0\0\3\272bBB02\0\0\"\217L\276\241<\0\t\256\307\377\377\377\374"..., 64512) = 64512
> [pid  4185] read(6, "\32\16\7l\0\0\374\0\0\t\222\276BB02\0\0\"qL\276\241<\0!RJ\377\377\377\374"..., 64512) = 64512
> [pid  4185] read(6, "e\272\t\211\0\0\374\0\0\4\240\264BB02\0\0\"\211L\276\241<\0\r\301:\377\377\377\374"..., 64512) = 64512
> etc.

I can't help with that. :)
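
For anyone reading the trace quoted above: in each write() the first four bytes look like a big-endian length prefix (for example "\0\0u`" is 0x7560 = 30048, and the call writes 30052 bytes, which is 30048 + 4). In the large writes, the two bytes after that prefix are "x\234" (0x78 0x9c), which fits the two-byte header defined in RFC 1950 for zlib streams. The Python below is only a sketch for checking that pattern on the bytes visible in the trace. It is not a Bacula tool, the function names are made up for illustration, and the length-prefix reading is an inference from the numbers above rather than something taken from Bacula documentation.

```python
import struct


def declared_length(packet: bytes) -> int:
    """Read the first four bytes as a big-endian unsigned length (assumed framing)."""
    (length,) = struct.unpack(">I", packet[:4])
    return length


def looks_like_zlib(packet: bytes) -> bool:
    """Check whether the two bytes after the assumed 4-byte prefix form a zlib header.

    RFC 1950: the low nibble of the first byte (CMF) is 8 for deflate, and
    the 16-bit value CMF*256 + FLG must be a multiple of 31.
    """
    if len(packet) < 6:
        return False
    cmf, flg = packet[4], packet[5]
    return (cmf & 0x0F) == 8 and ((cmf << 8) | flg) % 31 == 0


# Leading bytes copied from the strace lines (strace truncates the rest).
header_write = b"\0\0\0%rechdr 8742 1287561532 32497"
payload_write_1 = b"\0\0u`x\234\214\275ko\344\332\262$\366]"
payload_write_2 = b"\0\0u\334x\234l\275\333\226\333\310\316$|"

if __name__ == "__main__":
    for name, pkt in [
        ("header", header_write),
        ("payload 1", payload_write_1),
        ("payload 2", payload_write_2),
    ]:
        print(name, declared_length(pkt), looks_like_zlib(pkt))
```

A matching header only suggests that the data passing through this descriptor is zlib-compressed. It does not show that compression is where bacula-sd spends its CPU time.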

-- 
Dan Langille -- http://langille.org/

