Bacula-users

Re: [Bacula-users] backup slowdown (mysqld) after tape autochange

2010-12-14 12:23:48
Subject: Re: [Bacula-users] backup slowdown (mysqld) after tape autochange
From: "Dan Langille" <dan AT langille DOT org>
To: "Robert Wirth" <Robert.Wirth AT dfki DOT de>
Date: Tue, 14 Dec 2010 12:21:12 -0500
On Tue, December 14, 2010 11:48 am, Robert Wirth wrote:
> Hi,
>
> strange problem.  Here's some hardware where Bacula has been running
> successfully for ca. 5 years.  It was release 1.38.11 under Solaris 10x86.
>
> Last month, we had a system disk crash on the backup system.  No backup
> data was lost.  We just had to reinstall the backup system.
> Since this was our only Solaris x86 system, we decided to migrate
> to Linux and to a newer Bacula release.  Until the repaired hardware
> was available, we started with a virtualized new system, just for the
> daily incremental backups to disk volumes.
>
> Since most of our current systems are Ubuntu Hardy server LTS, we
> chose Bacula 2.2.8 from this distribution as our new version (well,
> it's old, but 1.38.11 was running well, and 2.2.8 was the default).
>
> We upgraded Bacula's mysql database with the corresponding script
> from 1.38.11 to 2.2.8.  We imported the updated DB using mysqldump
> into the new system, which has MySQL 5.1.41 and Linux kernel 2.6.32.
> The virtualized system worked well all the time.
>
> Now, the hardware version of the system is ready, and a yearly full
> backup, which goes directly to tape, is imminent.
>
> And now, the strange things are coming...
>
>
> /* The system is a 2x2 core AMD Opteron system, 4 GB RAM, 6xLSI SCSI U320
> Megaraid with separate channels for external disks, tape readers and
> autochanger.  23 TB disk storage on external RAIDs, autochanger and
> HP-readers for LTO-3 tapes.   System: see above. */
>
>
> NOW BACKING UP...
>
> Starting a bunch of full backup jobs which fit into 1 SINGLE TAPE
> produces NO PROBLEMS:  the jobs start, run and write, and terminate
> within a usual span of time.  This way, I can back up a dozen
> systems totalling 360 GB on one tape in a few hours.
>
>
> FACING THE PROBLEM...
>
> Starting a bunch of full backup jobs that DO NOT FIT into 1 single
> tape proceeds as follows (with a fresh tape forced by setting the
> former one to read-only):
>
> - first, the jobs run well and write their data to the first fresh tape
>   of the corresponding pool.  Speed is similar to what we saw on the old OS.
>
> - when the tape is full with around 600GB of data, it is marked as
>   Full and unloaded, and the next free tape of the pool is loaded.
>
> - from this moment on, writing to the new fresh tape becomes incredibly
>   slow (4 GB/hour) and mysqld sits constantly at 95%-100% CPU.
>   No other process has significant load, and the mysqld load isn't
>   reflected in the system's load values:
>
> Cpu(s):  3.3%us,  2.2%sy,  0.0%ni, 91.6%id,  2.1%wa,  0.1%hi,  0.7%si,  0.0%st
> Mem:   3961616k total,  3850072k used,   111544k free,    17532k buffers
> Swap:  3906552k total,        0k used,  3906552k free,  3579956k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  1356 mysql     20   0  144m  31m 2376 S   98  0.8 163:57.79 mysqld
>     1 root      20   0  2620  948  528 S    0  0.0   0:00.63 init
>     2 root      20   0     0    0    0 S    0  0.0   0:00.00 kthreadd
>  ....
>
> The only further effect I can see is that the table "bacula.JobMedia" is
> growing.   No errors in the system log, no MySQL errors, and none in
> Bacula's log.
>
> What I mainly don't understand is why this happens after a tape change.
> The MaxSpoolSize is 32GB, and I'm backing up 7 systems.  Each of them
> had several spool steps during the first tape.
>
> From the point of view of Bacula and its program logic, what has changed
> when the tape has been changed?  I guess it's all the same:  spooling data,
> writing it to tape and updating the catalog, regardless of first, second
> or later tape...?!?
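
For scale, here is a rough sketch of what the quoted 4 GB/hour works out to.
It assumes decimal units (1 GB = 1000 MB) and takes the roughly 600 GB per
tape mentioned above as the amount to write.  The function names are
illustrative only and are not part of Bacula.

```python
def rate_mb_per_s(gigabytes, hours):
    """Convert an amount written over a time span into MB/s (1 GB = 1000 MB assumed)."""
    return gigabytes * 1000 / (hours * 3600)


def hours_to_write(gigabytes, gb_per_hour):
    """Hours needed to write the given amount at a constant rate."""
    return gigabytes / gb_per_hour


# Figures taken from the report above: 4 GB/hour after the tape change,
# and about 600 GB to fill one tape.
slow_rate = rate_mb_per_s(4, 1)
fill_hours = hours_to_write(600, 4)

print(f"{slow_rate:.2f} MB/s")
print(f"{fill_hours:.0f} hours, i.e. {fill_hours / 24:.2f} days per tape")
```

At that rate a second tape would take days rather than hours to fill, so
the slowdown is well outside normal variation.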

What do you see under Running Jobs in the 'status dir' output before and
after the first tape has filled?

If you only have the 'after' output right now, that alone might be interesting.
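
If it helps, here is a minimal sketch for grabbing that section so the
'before' and 'after' can be compared.  It assumes bconsole is on the PATH
and accepts commands on stdin, and that the Running Jobs section starts at
a "Running Jobs:" line and ends at a "====" line or at the "Terminated
Jobs:" heading; check both against your release.  The function names are
made up for this example.

```python
import subprocess


def capture_status_dir(bconsole="bconsole"):
    """Run 'status dir' through bconsole and return the raw text.

    Assumption: bconsole reads its commands from standard input.
    """
    result = subprocess.run(
        [bconsole],
        input="status dir\nquit\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def running_jobs_section(status_text):
    """Return the lines from 'Running Jobs:' up to the end of that section.

    Assumption: the section ends at a '====' line or at 'Terminated Jobs:'.
    Returns an empty string if no 'Running Jobs:' line is found.
    """
    section = []
    inside = False
    for line in status_text.splitlines():
        if line.startswith("Running Jobs:"):
            inside = True
        elif inside and (line.strip() == "====" or line.startswith("Terminated Jobs:")):
            break
        if inside:
            section.append(line)
    return "\n".join(section)
```

Call running_jobs_section(capture_status_dir()) once while the first tape
is still being written and once after the change, save both results, and
compare them.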
-- 
Dan Langille -- http://langille.org/

