Bacula-users

Re: [Bacula-users] Extremely Slow Performance on VMware

2012-09-29 07:45:02
Subject: Re: [Bacula-users] Extremely Slow Performance on VMware
From: Geert Stappers <Geert.Stappers AT vanadgroup DOT com>
To: "bacula-users AT lists.sourceforge DOT net" <bacula-users AT lists.sourceforge DOT net>
Date: Sat, 29 Sep 2012 13:44:06 +0200
Op 20120928 om 20:38 schreef Rodrigo Abrantes Antunes:
> Citando Rodrigo Abrantes Antunes <rodrigoantunes AT pelotas.ifsul.edu DOT br>:
> > Citando Geert Stappers <Geert.Stappers AT vanadgroup DOT com>:
> > Rodrigo Abrantes Antunes:
> > > > Director: 5.0.1-1ubuntu1
> > > > Storage: 5.0.1-1ubuntu1
> > > > FD: 5.0.1-1ubuntu1 (some clients have lower version)
> > > > Database: mysql
> > > > OS: Ubuntu 10.04.4 x64 Server
> > > > FC Storage 4 GBits/s.
> > > > All my network is Gigabit Ethernet.
> > > 
> > > Yes, and how is the further design?
> > > 
> > > In other words: the provided list can be read as
> > > One physical computer, with fibre channel disk, hosts all the VMs.
> > > If that is so, then say so. Otherwise elaborate on the setup, the design.
> > > 
> > > Back to
> > > 
> > > > During a backup I can see bacula-sd using 100% cpu,
> > > 
> > > And where did you see the "100%"? ( Which tool was used to read that
> > > performance value? )
> > > 
> > > I would like to see the output of
> > > 
> > >     vmstat 2 5
> > > 
> > > during non-back-up-time and also the output of
> > > 
> > >     vmstat 2 5
> > > 
> > > during back-up-time. What I'm interested in are the CPU columns,
> > > especially the columns "system" and "wait".
> > > 
> > > <screenshot>
> > > $ vmstat 2 3
> > > procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
> > > r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
> > > 0  0   6712  11360 192584 159120    0    0     5     4    6   11  5  6 90  0
> > > 0  0   6712  11344 192584 159120    0    0     0     0   67  342  7 15 79  0
> > > 0  0   6712  11344 192584 159120    0    0     0     0   65  340  8 14 78  0
> > > </screenshot>
> > > 
> > > 
> > > And to avoid an extra e-mail exchange:
> > > I'm asking for 2 to the power 3, so 8 measurements.
> > > 
> > > So 2 moments (during backup or outside backup)
> > > on 2 Bacula components ( storage daemon and file daemon ) on the VMs
> > > on 2 physical hosts.
> > > 
> > > Yes, that means that I assume the VMware hosts have a 'vmstat' command.
> > > That is because I'm not familiar with VMware, I'm from the Xen world :-)
> > 
> > 
> > 
> > I have a physical machine that is a VMware ESX node which hosts only one vm,
> > the one with bacula-director, bacula-sd and bacula-fd (called
> > bacula-server), this vm has an RDM with the fibre channel storage where the
> > volumes partition is mounted. Then I have all my clients (some are physical
> > machines and others are vms in other ESX nodes) with bacula-fd that are
> > backed up. When I manually run a job to backup one of these clients in
> > bacula-server I can see (with the command htop) that bacula-sd is using
> > 100% of the cpu, I also noted that the backup starts at around 4MB/s and

What I see for 'htop' over here is that there seems to be information
in the color of the CPU usage bar.

It would be interesting to see how the 100% CPU usage is divided into system,
user and I/O wait.
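
One way to get that split without reading htop's colours is to sample the
aggregate "cpu" line of /proc/stat twice and compare the deltas. The block
below is a minimal Python sketch and not part of the original exchange: the
function name cpu_breakdown and the two sample lines are made up for
illustration, and the column order (user, nice, system, idle, iowait, irq,
softirq, steal) is assumed to be the one documented in proc(5).

```python
# Column order of the aggregate "cpu" line in /proc/stat (see proc(5)).
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def cpu_breakdown(before: str, after: str) -> dict:
    """Return the percentage of CPU time per state between two 'cpu' lines."""

    def parse(line: str) -> list:
        parts = line.split()
        if not parts or parts[0] != "cpu":
            raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
        values = [int(v) for v in parts[1:1 + len(FIELDS)]]
        # Older kernels report fewer columns; treat the missing ones as zero.
        values += [0] * (len(FIELDS) - len(values))
        return values

    deltas = [b - a for a, b in zip(parse(before), parse(after))]
    total = sum(deltas)
    if total <= 0:
        raise ValueError("samples are identical or in the wrong order")
    return {name: 100.0 * d / total for name, d in zip(FIELDS, deltas)}


# Hypothetical samples, taken a couple of seconds apart.
before = "cpu 1000 0 2000 7000 0 0 0 0 0 0"
after = "cpu 1010 0 2130 7050 10 0 0 0 0 0"
split = cpu_breakdown(before, after)
```

On a live system the two samples would come from reading the first line of
/proc/stat, waiting a few seconds, and reading it again. Inside a VM the
"steal" share is worth a look too, since it counts time the hypervisor gave
to something else.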

> > after some time it is around 300KB/s. If I simply send the same files to be
> > backed up with scp for example the transfer goes around 100MB/s. The vms
> > don't have vmstat, I use Linux's own commands.

AFAIK 'vmstat' is installed by default on every Linux and Unix system.


> One thing I noted now, in the vm htop says that 100% cpu is used and actually
> the machine is very slow when backing up so I think this value is accurate but
> in vSphere Client in the performance chart it says that the vm is using only
> 400 MHz of the 5000 MHz that were allocated, but the node cpu usage is low so I
> don't know why it isn't allocating more MHz to the bacula-server.

How VMware allocates CPU cycles to VMs is beyond my current knowledge
(and off-topic on the bacula-users mailing list).


> I installed vmstat, I can't do vmstat during non backup time because it is
> currently backing up my mail server, about 200 GB, it has been doing this for
> almost 15h:
> 
> vmstat 2 5 during backing-up on the bacula-server (director, storagedaemon)
> 
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
>  3  0      0  16576  15592 1763272    0    0    10    55    2    7  0  8 92  0
>  1  0      0  15916  15592 1763964    0    0     0     0  204   49  1 50 49  0
>  2  0      0  16812  15588 1762996    0    0     0     2  172   72  1 78 21  0
>  2  0      0  17820  15604 1767348    0    0     0    18  193  114  3 66 32  0
>  1  0      0  16296  15604 1769924    0    0     0     0  277   29  0 55 45  0

That is _not_ 100% CPU usage, there was at least 20% idle time.
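
To put numbers on that, the samples above can be averaged. The block below is
a small Python sketch, not something from the original thread: the function
name summarize_vmstat is hypothetical, and it skips the first data row because
the first report of vmstat gives averages since the last reboot rather than
the current interval.

```python
def summarize_vmstat(text: str) -> dict:
    """Average the us/sy/id/wa columns of vmstat output, skipping row one."""
    columns = None
    rows = []
    for line in text.splitlines():
        # Strip mail quoting ("> ") before splitting into fields.
        fields = line.lstrip("> ").split()
        if "us" in fields and "id" in fields:
            columns = fields  # the header row naming each column
        elif columns and len(fields) == len(columns) and all(
            f.isdigit() for f in fields
        ):
            rows.append(dict(zip(columns, map(int, fields))))
    samples = rows[1:]  # first row is the since-boot average
    if not samples:
        raise ValueError("need at least two data rows")
    summary = {
        c: sum(r[c] for r in samples) / len(samples)
        for c in ("us", "sy", "id", "wa")
    }
    summary["min_idle"] = min(r["id"] for r in samples)
    return summary


# The storage daemon samples quoted above.
SD_OUTPUT = """
procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
 3  0      0  16576  15592 1763272    0    0    10    55    2    7  0  8 92  0
 1  0      0  15916  15592 1763964    0    0     0     0  204   49  1 50 49  0
 2  0      0  16812  15588 1762996    0    0     0     2  172   72  1 78 21  0
 2  0      0  17820  15604 1767348    0    0     0    18  193  114  3 66 32  0
 1  0      0  16296  15604 1769924    0    0     0     0  277   29  0 55 45  0
"""
sd = summarize_vmstat(SD_OUTPUT)
```

For these four interval samples the time is almost entirely "system" with
next to no "user" and no I/O wait, and idle never drops below 21%.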


> vmstat 2 5 during backing-up on the physical mail-server (filedaemon)
> 
> procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
>  r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
>  0  0  16824  38496 143856 5019424    0    0    45    42    3    4  1  0 98  1
>  0  0  16824  30928 143868 5020004    0    0   336   434 1328  981  1  1 96  2
>  2  0  16824  35844 143868 5020768    0    0   458   116 1121  635  0  1 99  0
>  0  0  16824  32180 143876 5024996    0    0  2060   116 1861  686  0  1 97  2
>  0  0  16824  30192 143912 5026304    0    0   640   521 1505 1032  1  1 86 13

That is even further from 100% CPU usage. The 13% waiting for IO is still
far from the IOwait time I was expecting.
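
For scale, the transfer rates quoted earlier in this thread can be turned into
run times for a backup of about 200 GB. This is a back-of-the-envelope Python
sketch and not part of the original exchange; the helper name transfer_hours
is hypothetical and decimal units (1 GB = 1000 MB) are assumed.

```python
def transfer_hours(size_gb: float, rate_mb_per_s: float) -> float:
    """Hours needed to move size_gb at rate_mb_per_s (1 GB = 1000 MB)."""
    if rate_mb_per_s <= 0:
        raise ValueError("rate must be positive")
    return size_gb * 1000.0 / rate_mb_per_s / 3600.0


# Rates as reported in the thread: Bacula at the start, Bacula later on, scp.
rates_mb_per_s = {
    "bacula, start of job (4 MB/s)": 4.0,
    "bacula, later (300 KB/s)": 0.3,
    "scp (100 MB/s)": 100.0,
}
hours = {label: transfer_hours(200, r) for label, r in rates_mb_per_s.items()}
for label, h in hours.items():
    print(f"{label}: {h:.1f} h")
```

At a steady 4 MB/s the job would need roughly 14 hours, at 300 KB/s more than
a week, and at the scp rate a bit over half an hour. A job that is still
running after almost 15 hours has therefore averaged less than about 4 MB/s.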


I think it is an interesting problem;
luckily I already have interesting challenges of my own.


Good luck
Stappers