Bacula-users

Re: [Bacula-users] [Bacula-devel] Despooling attrs does not finish

2014-10-21 05:22:32
Subject: Re: [Bacula-users] [Bacula-devel] Despooling attrs does not finish
From: Ulrich Leodolter <ulrich.leodolter AT obvsg DOT at>
To: bacula-users AT lists.sourceforge DOT net
Date: Tue, 21 Oct 2014 11:17:31 +0200
Hi all,

i found the root cause of the problem: it was simply a mysql performance
problem caused by a special filesystem hierarchy on the user's desktop.

there was one directory which was recursively repeated inside itself.

C:/Users/name/Desktop/Exercise Files/CSS Core Concepts
C:/Users/name/Desktop/Exercise Files/CSS Core Concepts/Exercise Files/CSS Core Concepts/
...

there was only one file inside "CSS Core Concepts" and six empty subdirectories,
Chapter_01 to Chapter_06.  this hierarchy was repeated up to a path length of 4834.
very strange; maybe a zip file containing symlinks pointing to "." was unzipped
on the desktop.
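
for illustration only, here is a minimal python sketch of how such a
self-repeating tree could be spotted on a client before the backup runs.
the names repeated_unit and find_long_dirs and the 1000-character threshold
are assumptions made for this sketch; nothing here is part of bacula.

```python
import os


def repeated_unit(path):
    """Return the shortest run of directory names that occurs twice in a row
    in `path`, or None if no run repeats back to back."""
    segs = [s for s in path.replace("\\", "/").split("/") if s]
    n = len(segs)
    for k in range(1, n // 2 + 1):          # length of the candidate run
        for i in range(0, n - 2 * k + 1):   # start of the candidate run
            if segs[i:i + k] == segs[i + k:i + 2 * k]:
                return segs[i:i + k]
    return None


def find_long_dirs(root, max_len=1000):
    """Yield directories under `root` whose full path is longer than
    `max_len` characters. os.walk does not follow symlinks by default."""
    for dirpath, dirnames, _files in os.walk(root):
        if len(dirpath) > max_len:
            yield dirpath
            dirnames[:] = []  # prune: everything below is even longer


if __name__ == "__main__":
    sample = ("C:/Users/name/Desktop/Exercise Files/CSS Core Concepts/"
              "Exercise Files/CSS Core Concepts/")
    print(repeated_unit(sample))
```

repeated_unit works on plain strings, so it could equally be fed path names
taken from a file listing rather than from the live filesystem.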

the join on path in the batch insert seems to perform very badly when
comparing about 27000 long path names like that.

INSERT INTO File (FileIndex, JobId, PathId, FilenameId, LStat, MD5, DeltaSeq)
  SELECT batch.FileIndex, batch.JobId, Path.PathId, Filename.FilenameId,
         batch.LStat, batch.MD5, batch.DeltaSeq
    FROM batch
    JOIN Path ON (batch.Path = Path.Path)
    JOIN Filename ON (batch.Name = Filename.Name)
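
a hedged sketch of how unusually long entries in the Path table could be
listed. it uses python's built-in sqlite3 with an in-memory table as a
stand-in for the real mysql catalog; only the Path.PathId and Path.Path
columns that appear in the query above are assumed, and the helper name
longest_paths and the sample rows are made up for the example.

```python
import sqlite3

# Longest path names first; the same statement shape should also be valid
# on MySQL, where LENGTH() counts bytes rather than characters.
LONGEST_PATHS_SQL = """
SELECT PathId, LENGTH(Path) AS PathLen
  FROM Path
 ORDER BY PathLen DESC
 LIMIT ?
"""


def longest_paths(conn, limit=10):
    """Return (PathId, PathLen) rows for the `limit` longest paths."""
    return conn.execute(LONGEST_PATHS_SQL, (limit,)).fetchall()


# Stand-in catalog: an in-memory table with the two assumed columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Path (PathId INTEGER PRIMARY KEY, Path TEXT)")
unit = "Exercise Files/CSS Core Concepts/"
conn.executemany(
    "INSERT INTO Path (PathId, Path) VALUES (?, ?)",
    [
        (1, "C:/Users/name/Desktop/"),
        (2, "C:/Users/name/Desktop/" + unit),
        (3, "C:/Users/name/Desktop/" + unit * 50),  # self-repeating tree
    ],
)

for path_id, path_len in longest_paths(conn, limit=2):
    print(path_id, path_len)
```

against a real catalog, the same SELECT could be run directly in the mysql
client; a handful of entries several thousand characters long would point at
a tree like the one described above.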


now we removed the almost empty tree "C:/Users/name/Desktop/Exercise Files"
and i am sure tomorrow the backup will finish in time without problems.

Best regards
Ulrich




On Mon, 2014-10-20 at 15:21 +0200, alejandro alfonso fernandez wrote:
> Hi!
> 
> I agree with Martin. It's not a Bacula error, it's a MySQL problem
> 
> I'm pretty sure that your /tmp partition becomes full, especially if you
> share both the Bacula Spool Directory (are you using "SpoolData = yes"?) and
> the MySQL temporary filesystem (both of them in /tmp by default)
> 
> Try changing the "tmpdir" param of your MySQL server (my.cnf) to a bigger
> partition (don't forget to restart the service to apply the change)
> 
> Example:
> # point the following paths to different dedicated disks
> # tmpdir = /tmp/
> tmpdir = /var/tmp/mysql
> 
> Doing a "mysqlrepair" to test database integrity will be a good idea
> 
> Best regards!
> 
> On Mon, Oct 20, 2014 at 12:57 PM, Martin Simmons <martin AT lispworks DOT com>
> wrote:
> 
> > >>>>> On Sun, 19 Oct 2014 19:02:57 +0200, Ulrich Leodolter said:
> > >
> > > Hello Dan,
> > >
> > > On Sat, 2014-10-18 at 13:32 -0400, Dan Langille wrote:
> > > > > On Oct 18, 2014, at 4:03 AM, Ulrich Leodolter <ulrich.leodolter AT obvsg DOT at> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > > we have Win7 backup which does not come to an end within
> > > > > > MaxRunTime=12h.
> > > > > server runs 7.0.5 (28 July 2014),  the client has installed the
> > > > > bacula-enterprise-win64-7.0.5.exe.  but the problem started about 2
> > > > > months ago,  at that time windows client 5.2.10 was installed on the
> > > > > machine.
> > > > >
> > > > > the backup itself is about 100GB compressed and seems to finish
> > > > > on the client after about 6 hours, below are the last messages of
> > > > > the job before it gets stuck.
> > > > >
> > > > > 2014-10-18 03:18:09 troll-sd JobId 635821: Committing spooled data to
> > > > > Volume "Backup-0779". Despooling 1,692,736,419 bytes ...
> > > > > 2014-10-18 03:18:18 troll-sd JobId 635821: Despooling elapsed time =
> > 00:00:09, Transfer rate = 188.0 M Bytes/second
> > > > > 2014-10-18 03:18:19 troll-sd JobId 635821: Elapsed time=06:11:45,
> > Transfer rate=4.691 M Bytes/second
> > > > > 2014-10-18 03:18:22 troll-sd JobId 635821: Sending spooled attrs to
> > the Director. Despooling 603,449,667 bytes .
> > > > >
> > > > > mysql status at the same time:
> > > > >
> > > > > # echo "show full processlist" | mysql
> > > > > > Id        User    Host            db      Command Time    State           Info
> > > > > > 6854      bacula  localhost       bacula  Sleep   522                     NULL
> > > > > > 6873      bacula  localhost       bacula  Query   21143   Sending data    INSERT INTO File (FileIndex, JobId, PathId, FilenameId, LStat, MD5, DeltaSeq) SELECT batch.FileIndex, batch.JobId, Path.PathId, Filename.FilenameId,batch.LStat, batch.MD5, batch.DeltaSeq FROM batch JOIN Path ON (batch.Path = Path.Path) JOIN Filename ON (batch.Name = Filename.Name)
> > > > > > 6899      root    localhost       NULL    Query   0       NULL            show full processlist
> > > > >
> > > > >
> > > > > > we have a bunch of other clients (about 30), a mixture of linux,
> > > > > > win7 and mac powerpc.
> > > > > > all other backups have run without problems for years now.  there are
> > > > > > even larger backups, in size and in number of files.
> > > > >
> > > > >
> > > > > > does anyone have an idea why this single batch insert does not
> > > > > > complete?
> > > > >
> > > > > do i need to analyze the attrs spool file itself ?
> > > > >
> > > > > yesterday i optimized the bacula database, but that doesn't help.
> > > > > there must be something special in the attrs spool file which the
> > > > > > mysql server can't handle. the server runs on standard CentOS 6.5
> > > > > > x86_64.
> > > >
> > > > > This is something which should be asked in the user mailing list, not
> > > > > the devel mailing list.  I am replying to that list instead.
> > > >
> > >
> > > Ok
> > >
> > > > Is this a large number of files?
> > > >
> > >
> > > about 750000,  not very large.
> > >
> > > > > I had something which took a while.  I sped it up by giving PostgreSQL
> > > > > more memory.  Perhaps MySQL can do the same.
> > > > >
> > > > > Here’s what I did:
> > > > > https://plus.google.com/+DanLangille/posts/AKXoRido3U1
> > > >
> > >
> > > > the mysql database is already optimized and has enough memory.
> > > > backups of up to 4000000 files and 700GB run without problems.
> > >
> > > > below are the last job messages of the last failed job; it was canceled
> > > > after the 12 hours max run time.
> > >
> > > 2014-10-19 02:49:15 troll-sd JobId 635915: Elapsed time=05:43:32,
> > Transfer rate=5.095 M Bytes/second
> > > 2014-10-19 02:49:16 troll-sd JobId 635915: Sending spooled attrs to the
> > Director. Despooling 603,418,522 bytes ...
> > > 2014-10-19 09:05:43 troll-dir JobId 635915: Fatal error: Max run time
> > exceeded. Job canceled.
> > > 2014-10-19 09:41:17 troll-dir JobId 635915: Error: Bacula troll-dir
> > 7.0.5 (28Jul14):
> > >
> > > > the database already shows about 600k files for this job.
> > >
> > > > mysql> select count(*) from File where JobId = '635915' and LStat is not null;
> > > +----------+
> > > | count(*) |
> > > +----------+
> > > |   616848 |
> > > +----------+
> > > 1 row in set (0.00 sec)
> > >
> > >
> > > > is it possible that some special file attr gets the mysql database
> > > > into trouble?  i really can't imagine.  the mysql database is almost
> > > > idle after the first 600k have been inserted.  there is no io traffic and
> > > > cpu usage is also low.
> > >
> > > > it seems i have to dump the spooled attrs file tomorrow and compare it to
> > > > the database to see which attrs have been inserted and at which one it
> > > > stops.
> >
> > I think that will be difficult, because the insert is a join of various
> > other temporary tables so the order may be random.
> >
> >
> > > does anyone have a better idea how to debug this problem?
> > > i am a little bit lost :) because in the 6 years that
> > > i have been using bacula i have never run into a problem like this.
> >
> > Maybe you can attach strace or gdb to the mysql process running this insert
> > statement to see what it is doing?  It doesn't look like a Bacula problem,
> > but it might be a bug in mysql.
> >
> > __Martin
> >
> >
> > ------------------------------------------------------------------------------
> > Comprehensive Server Monitoring with Site24x7.
> > Monitor 10 servers for $9/Month.
> > Get alerted through email, SMS, voice calls or mobile push notifications.
> > Take corrective actions from your mobile device.
> > http://p.sf.net/sfu/Zoho
> > _______________________________________________
> > Bacula-users mailing list
> > Bacula-users AT lists.sourceforge DOT net
> > https://lists.sourceforge.net/lists/listinfo/bacula-users
> >


