Bacula-users

Re: [Bacula-users] Extremely Slow Performance on VMware

2012-10-03 13:26:26
Subject: Re: [Bacula-users] Extremely Slow Performance on VMware
From: Rodrigo Abrantes Antunes <rodrigoantunes AT pelotas.ifsul.edu DOT br>
To: bacula-users AT lists.sourceforge DOT net
Date: Wed, 03 Oct 2012 14:22:25 -0300

Quoting Geert Stappers <Geert.Stappers AT vanadgroup DOT com>:

On 20120928 at 20:38, Rodrigo Abrantes Antunes wrote:

Quoting Rodrigo Abrantes Antunes <rodrigoantunes AT pelotas.ifsul.edu DOT br>:
Quoting Geert Stappers <Geert.Stappers AT vanadgroup DOT com>:
Rodrigo Abrantes Antunes:
> > Director: 5.0.1-1ubuntu1
> > Storage: 5.0.1-1ubuntu1
> > FD: 5.0.1-1ubuntu1 (some clients have lower version)
> > Database: MySQL
> > OS: Ubuntu 10.04.4 x64 Server
> > FC Storage 4 GBits/s.
> > All my network is Gigabit Ethernet.
>
> Yes, but what does the rest of the design look like?
>
> In other words: the list provided can be read as
> One physical computer, with fibre channel disk, hosts all the VMs.
> If that is so, then say so. Otherwise elaborate on the setup, the design.
>
> Back to
>
> > During a backup I can see bacula-sd using 100% cpu,
>
> And where did you see the "100%"? (Which tool was used to read that
> performance value?)
>
> I would like to see the output of
>
>     vmstat 2 5
>
> during non-back-up-time and also the output of
>
>     vmstat 2 5
>
> during back-up-time. The thing I'm interested in is the CPU columns,
> especially the columns "system" and "wait".
>
> <screenshot>
> $ vmstat 2 3
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
> r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
> 0  0   6712  11360 192584 159120    0    0     5     4    6   11  5  6 90  0
> 0  0   6712  11344 192584 159120    0    0     0     0   67  342  7 15 79  0
> 0  0   6712  11344 192584 159120    0    0     0     0   65  340  8 14 78  0
> </screenshot>
>
>
> And to avoid an extra e-mail exchange:
> I'm asking for 2 to the power 3, so 8 measurements.
>
> So 2 moments (during backup or outside backup)
> on 2 Bacula components (storage daemon and file daemon) on the VMs
> on 2 physical hosts.
>
> Yes, that means that I assume the VMware hosts have a 'vmstat' command.
> That is because I'm not familiar with VMware, I'm from the Xen world :-)



I have a physical machine that is a VMware ESX node which hosts only one VM,
the one with bacula-director, bacula-sd and bacula-fd (called
bacula-server). This VM has an RDM to the fibre channel storage where the
volumes partition is mounted. Then I have all my clients (some are physical
machines and others are VMs on other ESX nodes) with bacula-fd that are
backed up. When I manually run a job to back up one of these clients, on
bacula-server I can see (with the command htop) that bacula-sd is using
100% of the CPU. I also noted that the backup starts at around 4MB/s and

What I see in 'htop' over here is that there seems to be information
in the color of the CPU usage.

It would be interesting to see how the 100% CPU usage is divided into system,
user and I/O wait.

after some time it drops to around 300KB/s. If I simply send the same files
with scp, for example, the transfer runs at around 100MB/s. The VMs
don't have vmstat; I use Linux's own commands.

AFAIK 'vmstat' is installed by default on every Linux and Unix system.

One thing I noticed just now: in the VM, htop says that 100% CPU is used, and
the machine really is very slow when backing up, so I think this value is
accurate. But the performance chart in the vSphere Client says that the VM is
using only 400MHz of the 5000MHz that were allocated. The node's CPU usage is
low, so I don't know why it isn't allocating more MHz to the bacula-server.

How VMware allocates CPU cycles to VMs is beyond my current knowledge
(and off-topic on the Bacula users mailing list).

I installed vmstat. I can't run vmstat during non-backup time because the
server is currently backing up my mail server, about 200GB, and has been doing
so for almost 15h:

vmstat 2 5 during backing-up on the bacula-server (director, storagedaemon)

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
3  0      0  16576  15592 1763272    0    0    10    55    2    7  0  8 92  0
1  0      0  15916  15592 1763964    0    0     0     0  204   49  1 50 49  0
2  0      0  16812  15588 1762996    0    0     0     2  172   72  1 78 21  0
2  0      0  17820  15604 1767348    0    0     0    18  193  114  3 66 32  0
1  0      0  16296  15604 1769924    0    0     0     0  277   29  0 55 45  0

That is _not_ 100% CPU usage, there was at least 20% idle time.

vmstat 2 5 during backing-up on the physical mail-server (filedaemon)

procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
0  0  16824  38496 143856 5019424    0    0    45    42    3    4  1  0 98  1
0  0  16824  30928 143868 5020004    0    0   336   434 1328  981  1  1 96  2
2  0  16824  35844 143868 5020768    0    0   458   116 1121  635  0  1 99  0
0  0  16824  32180 143876 5024996    0    0  2060   116 1861  686  0  1 97  2
0  0  16824  30192 143912 5026304    0    0   640   521 1505 1032  1  1 86 13

That is even further from 100% CPU usage. The 13% waiting for IO is still
far from the IOwait time I was expecting.
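
As a rough cross-check of the readings above, the CPU columns can be averaged with a small script. This is only a sketch: the names summarize_vmstat and SAMPLE are made up for illustration, and SAMPLE is the storage daemon output quoted earlier in this thread. The first data row is skipped because vmstat's first report shows averages since boot rather than the current interval.

```python
# Sample copied from the "vmstat 2 5" run on bacula-server during the backup.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
3  0      0  16576  15592 1763272    0    0    10    55    2    7  0  8 92  0
1  0      0  15916  15592 1763964    0    0     0     0  204   49  1 50 49  0
2  0      0  16812  15588 1762996    0    0     0     2  172   72  1 78 21  0
2  0      0  17820  15604 1767348    0    0     0    18  193  114  3 66 32  0
1  0      0  16296  15604 1769924    0    0     0     0  277   29  0 55 45  0
"""


def summarize_vmstat(text):
    """Average the us/sy/id/wa columns, skipping the since-boot first row."""
    header = None
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "r":
            # Column-name line: remember it so values can be looked up by name.
            header = fields
            continue
        if header is None or len(fields) != len(header):
            continue  # banner line or anything that is not a data row
        if not all(f.isdigit() for f in fields):
            continue
        rows.append(dict(zip(header, map(int, fields))))
    if not rows:
        raise ValueError("no vmstat data rows found")
    # Drop the first sample when there is more than one row.
    samples = rows[1:] if len(rows) > 1 else rows
    return {c: sum(r[c] for r in samples) / len(samples)
            for c in ("us", "sy", "id", "wa")}


print(summarize_vmstat(SAMPLE))
# {'us': 1.25, 'sy': 62.25, 'id': 36.75, 'wa': 0.0}
```

For this sample the busy time is almost entirely "sy" (system), with roughly a third of the interval idle and no I/O wait, which matches the reading that the machine is not at 100% CPU.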


I think it is an interesting problem;
luckily I already have interesting challenges.


Good luck
Stappers

Hi, I reinstalled the OS; it is now Ubuntu 12.04 with Director, SD and FD version 5.2.5. Backups now go up to 50MB/s, with an average of 20MB/s. That's good enough for me, but now I just want to do some tuning.
Is that JobId_2 index right? At http://wiki.neuralbs.com/index.php/Bacula the output of "show index from File" is different, with only four indexes. And are the other indexes all right?

I heard that spooling can improve performance too; is that true for the file media type? How can I set it up with the file media type, and where can I find good documentation about it?


show index from File; 
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| File  |          0 | PRIMARY  |            1 | FileId      | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
| File  |          1 | JobId    |            1 | JobId       | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
| File  |          1 | JobId_2  |            1 | JobId       | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
| File  |          1 | JobId_2  |            2 | PathId      | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
| File  |          1 | JobId_2  |            3 | FilenameId  | A         |           0 |     NULL | NULL   |      | BTREE      |         |               |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

show index from Filename;
+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table    | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Filename |          0 | PRIMARY  |            1 | FilenameId  | A         |        1296 |     NULL | NULL   |      | BTREE      |         |               |
| Filename |          1 | Name     |            1 | Name        | A         |        1296 |      255 | NULL   |      | BTREE      |         |               |
+----------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+

show index from Path;
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
| Path  |          0 | PRIMARY  |            1 | PathId      | A         |         172 |     NULL | NULL   |      | BTREE      |         |               |
| Path  |          1 | Path     |            1 | Path        | A         |         172 |      255 | NULL   |      | BTREE      |         |               |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users