Bacula-users

Re: [Bacula-users] [Bacula-devel] Despooling attrs does not finish

2014-10-21 11:17:16
Subject: Re: [Bacula-users] [Bacula-devel] Despooling attrs does not finish
From: Blake Dunlap <ikiris AT gmail DOT com>
To: Ulrich Leodolter <ulrich.leodolter AT obvsg DOT at>
Date: Tue, 21 Oct 2014 10:10:55 -0500
This actually sounds like a bug in Bacula. It shouldn't recurse into the
same structure; it should simply store the link and move on.

-Blake
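
For anyone who wants to check a client for a tree like the one Ulrich
describes below, here is a minimal sketch. It is not Bacula code. The
function names, the 1000-character threshold and the block-size limit are
assumptions made up for illustration. It walks a directory without following
symlinks and flags paths that are very long or that contain an immediately
repeated run of directory names.

```python
import os


def find_repeated_block(segments, max_block=8):
    """Return the first run of path segments that is immediately repeated, or None."""
    n = len(segments)
    for size in range(1, max_block + 1):
        for start in range(n - 2 * size + 1):
            if segments[start:start + size] == segments[start + size:start + 2 * size]:
                return segments[start:start + size]
    return None


def split_path(path):
    """Split on both separator styles and drop empty segments."""
    return [s for s in path.replace("\\", "/").split("/") if s]


def suspicious_dirs(root, max_len=1000):
    """Yield (dirpath, repeated_block) for directories that look self-nested or too long.

    Symlinks are not followed, and a flagged directory is not descended into.
    """
    for dirpath, dirnames, _filenames in os.walk(root, followlinks=False):
        block = find_repeated_block(split_path(dirpath))
        if block is not None or len(dirpath) > max_len:
            yield dirpath, block
            dirnames[:] = []  # prune: no need to walk deeper into a flagged tree
```

Iterating over suspicious_dirs("C:/Users/name/Desktop") would list candidate
directories to exclude or clean up. The repeated-block test is only a
heuristic: a legitimate layout such as "src/src" would be flagged too, so the
output needs a human look.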

On Tue, Oct 21, 2014 at 4:17 AM, Ulrich Leodolter
<ulrich.leodolter AT obvsg DOT at> wrote:
> Hi all,
>
> i found the root cause of the problem, it was simply a mysql performance
> problem caused by an unusual filesystem hierarchy on the user's desktop.
>
> there was one directory which was recursively repeated inside itself.
>
> C:/Users/name/Desktop/Exercise Files/CSS Core Concepts
> C:/Users/name/Desktop/Exercise Files/CSS Core Concepts/Exercise Files/CSS Core Concepts/
> ...
>
> there was only one file inside "CSS Core Concepts" and six empty
> subdirectories, Chapter_01 to Chapter_06.  this hierarchy was repeated up
> to a path length of 4834.  very strange, maybe a zip file containing
> symlinks pointing to "." was unzipped on the desktop.
>
> the join on Path in the batch insert seems to perform very badly when
> comparing about 27000 long path names like that.
>
> INSERT INTO File (FileIndex, JobId, PathId, FilenameId, LStat, MD5, DeltaSeq)
>   SELECT batch.FileIndex, batch.JobId, Path.PathId, Filename.FilenameId,
>          batch.LStat, batch.MD5, batch.DeltaSeq
>   FROM batch
>   JOIN Path ON (batch.Path = Path.Path)
>   JOIN Filename ON (batch.Name = Filename.Name)
>
>
> now we removed the almost empty tree "C:/Users/name/Desktop/Exercise Files"
> and i am sure tomorrow the backup will finish in time without problems.
>
> Best regards
> Ulrich
>
>
>
>
> On Mon, 2014-10-20 at 15:21 +0200, alejandro alfonso fernandez wrote:
>> Hi!
>>
>> I agree with Martin. It's not a Bacula error, it's a MySQL problem.
>>
>> I'm pretty sure that your /tmp partition is filling up, especially if you
>> share both the Bacula Spool Directory (are you using "SpoolData = yes"?) and
>> the MySQL temporary filesystem (both of them in /tmp by default).
>>
>> Try changing the "tmpdir" param of your MySQL server (my.cnf) to a bigger
>> partition (don't forget to restart the service to apply the change).
>>
>> Example:
>> # point the following paths to different dedicated disks
>> # tmpdir = /tmp/
>> tmpdir = /var/tmp/mysql
>>
>> Running "mysqlrepair" to check database integrity would also be a good idea.
>>
>> Best regards!
>>
>> On Mon, Oct 20, 2014 at 12:57 PM, Martin Simmons
>> <martin AT lispworks DOT com> wrote:
>>
>> > >>>>> On Sun, 19 Oct 2014 19:02:57 +0200, Ulrich Leodolter said:
>> > >
>> > > Hello Dan,
>> > >
>> > > On Sat, 2014-10-18 at 13:32 -0400, Dan Langille wrote:
>> > > > On Oct 18, 2014, at 4:03 AM, Ulrich Leodolter
>> > > > <ulrich.leodolter AT obvsg DOT at> wrote:
>> > > >
>> > > > > Hello,
>> > > > >
>> > > > > we have a Win7 backup which does not come to an end within
>> > > > > MaxRunTime=12h.
>> > > > > server runs 7.0.5 (28 July 2014),  the client has installed the
>> > > > > bacula-enterprise-win64-7.0.5.exe.  but the problem started about 2
>> > > > > months ago,  at that time windows client 5.2.10 was installed on the
>> > > > > machine.
>> > > > >
>> > > > > the backup itself is about 100GB compressed and seems to finish
>> > > > > on the client after about 6 hours, below are the last messages of
>> > > > > the job before it gets stuck.
>> > > > >
>> > > > > 2014-10-18 03:18:09 troll-sd JobId 635821: Committing spooled data to Volume "Backup-0779". Despooling 1,692,736,419 bytes ...
>> > > > > 2014-10-18 03:18:18 troll-sd JobId 635821: Despooling elapsed time = 00:00:09, Transfer rate = 188.0 M Bytes/second
>> > > > > 2014-10-18 03:18:19 troll-sd JobId 635821: Elapsed time=06:11:45, Transfer rate=4.691 M Bytes/second
>> > > > > 2014-10-18 03:18:22 troll-sd JobId 635821: Sending spooled attrs to the Director. Despooling 603,449,667 bytes .
>> > > > >
>> > > > > mysql status at the same time:
>> > > > >
>> > > > > # echo "show full processlist" | mysql
>> > > > > Id    User    Host       db      Command  Time   State         Info
>> > > > > 6854  bacula  localhost  bacula  Sleep    522                  NULL
>> > > > > 6873  bacula  localhost  bacula  Query    21143  Sending data  INSERT INTO File (FileIndex, JobId, PathId, FilenameId, LStat, MD5, DeltaSeq) SELECT batch.FileIndex, batch.JobId, Path.PathId, Filename.FilenameId,batch.LStat, batch.MD5, batch.DeltaSeq FROM batch JOIN Path ON (batch.Path = Path.Path) JOIN Filename ON (batch.Name = Filename.Name)
>> > > > > 6899  root    localhost  NULL    Query    0      NULL          show full processlist
>> > > > >
>> > > > >
>> > > > > we have a bunch of other clients (about 30), a mixture of linux,
>> > > > > win7 and mac powerpc.  all other backups have run without problems
>> > > > > for years now.  there are even larger backups, in size and in
>> > > > > number of files.
>> > > > >
>> > > > >
>> > > > > does anyone have an idea why this single batch insert does not
>> > > > > complete?
>> > > > >
>> > > > > do i need to analyze the attrs spool file itself ?
>> > > > >
>> > > > > yesterday i optimized the bacula database, but that didn't help.
>> > > > > there must be something special in the attrs spool file which the
>> > > > > mysql server can't handle. the server runs on standard CentOS 6.5
>> > > > > x86_64.
>> > > >
>> > > > This is something which should be asked in the user mailing list,
>> > > > not the devel mailing list.  I am replying to that list instead.
>> > > >
>> > >
>> > > Ok
>> > >
>> > > > Is this a large number of files?
>> > > >
>> > >
>> > > about 750000,  not very large.
>> > >
>> > > > I had something which took a while.  I sped it up by giving
>> > > > PostgreSQL more memory.  Perhaps MySQL can do the same.
>> > > >
>> > > > Here's what I did:
>> > > > https://plus.google.com/+DanLangille/posts/AKXoRido3U1
>> > > >
>> > >
>> > > the mysql database is already optimized and has enough memory.
>> > > backups of up to 4000000 files and 700GB run without problems.
>> > >
>> > > below are the last job messages of the most recent failed job; it was
>> > > canceled after the 12 hour max run time.
>> > >
>> > > 2014-10-19 02:49:15 troll-sd JobId 635915: Elapsed time=05:43:32, Transfer rate=5.095 M Bytes/second
>> > > 2014-10-19 02:49:16 troll-sd JobId 635915: Sending spooled attrs to the Director. Despooling 603,418,522 bytes ...
>> > > 2014-10-19 09:05:43 troll-dir JobId 635915: Fatal error: Max run time exceeded. Job canceled.
>> > > 2014-10-19 09:41:17 troll-dir JobId 635915: Error: Bacula troll-dir 7.0.5 (28Jul14):
>> > >
>> > > the database already shows about 600k files for this job.
>> > >
>> > > mysql> select count(*) from File where JobId = '635915' and LStat is not null;
>> > > +----------+
>> > > | count(*) |
>> > > +----------+
>> > > |   616848 |
>> > > +----------+
>> > > 1 row in set (0.00 sec)
>> > >
>> > >
>> > > is it possible that some special file attr gets the mysql database
>> > > into trouble ?  i really can't imagine.  the mysql database is almost
>> > > idle after the first 600k have been inserted.  there is no io traffic
>> > > and cpu usage is also low.
>> > >
>> > > it seems i have to dump the spooled attrs file tomorrow and compare it
>> > > to the database to see which attrs have been inserted and at which
>> > > one it stops.
>> >
>> > I think that will be difficult, because the insert is a join of various
>> > other temporary tables so the order may be random.
>> >
>> >
>> > > does anyone have a better idea how to debug this problem?
>> > > i am a little bit lost :) because in the 6 years i have been
>> > > using bacula i have never run into a problem like this.
>> >
>> > Maybe you can attach strace or gdb to the mysql process running this insert
>> > statement to see what it is doing?  It doesn't look like a Bacula problem,
>> > but it might be a bug in mysql.
>> >
>> > __Martin
>> >
>> >
>> > ------------------------------------------------------------------------------
>> > Comprehensive Server Monitoring with Site24x7.
>> > Monitor 10 servers for $9/Month.
>> > Get alerted through email, SMS, voice calls or mobile push notifications.
>> > Take corrective actions from your mobile device.
>> > http://p.sf.net/sfu/Zoho
>> > _______________________________________________
>> > Bacula-users mailing list
>> > Bacula-users AT lists.sourceforge DOT net
>> > https://lists.sourceforge.net/lists/listinfo/bacula-users
>> >
>
>
>
