Bacula-users

Re: [Bacula-users] Strange problem with Bacula

2009-10-02 14:20:39
Subject: Re: [Bacula-users] Strange problem with Bacula
From: Bruno Friedmann <bruno AT ioda-net DOT ch>
Date: Fri, 02 Oct 2009 20:14:55 +0200
Reynier Pérez Mira wrote:
> Bruno Friedmann wrote:
>> Cedric Tefft wrote:
>>
>> All of this comes from the use of the batch-insert option (a compile option).
>> To "accelerate" the insertion of records into the db, bacula-dir writes
>> each record during the backup
>> into a batch temp table. After all records are written to the sd, the dir
>> tries to make one big transaction with the db.
> 
> How can I do this? Do I need to reconfigure Bacula again, or just change
> a configuration parameter?

No, unfortunately this is set at compile time (it is a configure option).
I don't know whether you are using a distribution package or your own build.

I would really give it a try.

> 
>> So during the backup this file is written to disk, normally into /tmp,
>> perhaps elsewhere such as /var/lib/bacula;
>> I don't know exactly where.
>> As your postgresql complains of not having sufficient space for the hash-join
>> temporary file, you should also check
>> where postgresql is working (/var/lib/pgsql under openSUSE, for example) ...
> 
> The problem is not the space left on the device, since it has 5 GB free and
> I don't think that this file takes up all that space.
> 
> Today I'm experiencing the same problem again, but note that the same
> client has completed backups before. See the logs:
> 
> 02-Oct 03:25 serverbacula-dir JobId 2732: Job
> SP_ERP-FD.2009-10-02_03.25.15_40 waiting 1800 seconds for scheduled
> start time.
> 02-Oct 03:57 serverbacula-dir JobId 2732: Start Backup JobId 2732,
> Job=SP_ERP-FD.2009-10-02_03.25.15_40
> 02-Oct 03:57 serverbacula-dir JobId 2732: Using Device "FileStorage"

> 01-Oct 21:55 salvasprod_erp-fd JobId 2732: Warning: DIR and FD clocks
> differ by -21697 seconds, FD automatically compensating.

You should correct this (not for Bacula's sake, but I think it's saner to have
every computer set to the correct time; here the two clocks are about six hours apart).

> 02-Oct 03:57 FileSAN JobId 2732: Volume "SP_ERP_Pool-0081" previously
> written, moving to end of data.
> 02-Oct 03:57 FileSAN JobId 2732: Ready to append to end of Volume
> "SP_ERP_Pool-0081" size=157019312317
> 02-Oct 05:18 FileSAN JobId 2732: Job write elapsed time = 01:20:47,
> Transfer rate = 2.733 M bytes/second
> 02-Oct 05:21 serverbacula-dir JobId 2732: Fatal error: sql_create.c:789
> Fill Path table Query failed: INSERT INTO Path (Path) SELECT a.Path FROM
> (SELECT DISTINCT Path FROM batch) AS a WHERE NOT EXISTS (SELECT Path
> FROM Path WHERE Path = a.Path) : ERR=ERROR:  could not write block 30037
> of temporary file: No space left on device
> HINT:  Perhaps out of disk space?
> 
> 
> 02-Oct 05:21 serverbacula-dir JobId 2732: Error: Bacula serverbacula-dir
> 3.0.1 (30Apr09): 02-Oct-2009 05:21:47
>   Build OS:               i686-pc-linux-gnu ubuntu 9.04
>   JobId:                  2732
>   Job:                    SP_ERP-FD.2009-10-02_03.25.15_40
>   Backup Level:           Incremental, since=2009-09-25 02:00:02
>   Client:                 "salvasprod_erp-fd" 2.4.2 (26Jul08)
> i486-pc-linux-gnu,debian,lenny/sid
>   FileSet:                "SP_ERP-FS" 2009-06-30 02:00:00
>   Pool:                   "SP_ERP_Pool" (From Job resource)
>   Catalog:                "MyCatalog" (From Client resource)
>   Storage:                "FileSAN" (From previous Job)
>   Scheduled time:         02-Oct-2009 03:55:15
>   Start time:             02-Oct-2009 03:57:35
>   End time:               02-Oct-2009 05:21:47
>   Elapsed time:           1 hour 24 mins 12 secs
>   Priority:               1
>   FD Files Written:       1,485,967
>   SD Files Written:       1,485,967
>   FD Bytes Written:       12,884,698,790 (12.88 GB)
>   SD Bytes Written:       13,248,509,521 (13.24 GB)
>   Rate:                   2550.4 KB/s
>   Software Compression:   42.9 %
>   VSS:                    no
>   Encryption:             no
>   Accurate:               no
>   Volume name(s):         SP_ERP_Pool-0081
>   Volume Session Id:      76
>   Volume Session Time:    1254340514
>   Last Volume Bytes:      170,315,546,746 (170.3 GB)
>   Non-fatal FD errors:    0
>   SD Errors:              0
>   FD termination status:  OK
>   SD termination status:  OK
>   Termination:            *** Backup Error ***
> 
> 02-Oct 05:21 serverbacula-dir JobId 2732: Begin pruning Jobs.
> 02-Oct 05:21 serverbacula-dir JobId 2732: No Jobs found to prune.
> 02-Oct 05:21 serverbacula-dir JobId 2732: Begin pruning Files.
> 02-Oct 05:21 serverbacula-dir JobId 2732: No Files found to prune.
> 02-Oct 05:21 serverbacula-dir JobId 2732: End auto prune.
> 
> 02-Oct 05:21 serverbacula-dir JobId 2732: Rescheduled Job
> SP_ERP-FD.2009-10-02_03.25.15_40 at 02-Oct-2009 05:21 to re-run in 1800
> seconds (02-Oct-2009 05:51).
> 
> So really I can't find where the problem is. Some days it works fine,
> some days it doesn't. Can anyone help me with this?
> 
> Cheers and thanks in advance
> Reynier

You should collect some data you can plot, from a cron script that runs every
minute and does a df, to see which disk fills up.
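
As a rough illustration of that cron idea, here is a minimal Python sketch that
appends one usage sample per watched path to a CSV file. The list of paths, the
CSV location and the script name are only assumptions for the example; use
whatever mount points hold /tmp, the Bacula working directory and the PostgreSQL
data directory on your machine.

```python
import csv
import shutil
import sys
import time
from pathlib import Path

# Hypothetical list of paths to watch; adjust to your own layout.
WATCHED = ["/", "/tmp", "/var/lib"]


def sample_usage(paths):
    """Return one (timestamp, path, used_bytes, free_bytes) row per existing path."""
    now = time.strftime("%Y-%m-%d %H:%M:%S")
    rows = []
    for p in paths:
        if not Path(p).exists():
            continue  # skip paths that do not exist on this machine
        usage = shutil.disk_usage(p)
        rows.append((now, p, usage.used, usage.free))
    return rows


def append_samples(csv_path, paths=WATCHED):
    """Append the current samples to csv_path and return them."""
    rows = sample_usage(paths)
    with open(csv_path, "a", newline="") as fh:
        csv.writer(fh).writerows(rows)
    return rows


# Only runs when called as: python3 disk_sample.py <csv file>
if __name__ == "__main__" and len(sys.argv) == 2:
    append_samples(sys.argv[1])
```

A crontab line such as `* * * * * python3 /usr/local/bin/disk_sample.py /var/tmp/disk-usage.csv`
(both paths hypothetical) would take one sample per minute; the resulting file
can then be plotted with any tool to see which filesystem runs out of space
around the time the job fails.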

Another thing I don't understand about your system is the following:
   FD Files Written:       1,485,967
   SD Files Written:       1,485,967
   FD Bytes Written:       12,884,698,790 (12.88 GB)
   SD Bytes Written:       13,248,509,521 (13.24 GB)
   Software Compression:   42.9 %
   Last Volume Bytes:      170,315,546,746 (170.3 GB)
Last line: how can this be possible? It's about 13 times the size of this backup!

At a customer's site, running version 3.0.2, also with PostgreSQL, I get more normal results:

  FD Files Written:       290,332
  SD Files Written:       290,332
  FD Bytes Written:       118,522,524,993 (118.5 GB)
  SD Bytes Written:       118,566,289,033 (118.5 GB)
  Rate:                   17468.3 KB/s
  Software Compression:   13.8 %
  Volume name(s):         MDAY-VENDREDI
  Volume Session Id:      95
  Volume Session Time:    1251132153
  Last Volume Bytes:      118,667,465,810 (118.6 GB)

And every job looks like this.
Forget this: I purge and recycle my volumes, whereas you are appending 13.24 GB to a
volume that already held about 157 GB (size=157019312317 in your log), so everything is normal here.
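
The arithmetic can be checked directly from the three figures in the job log.
This is only a quick sketch; the variable names are mine, the numbers are copied
from the log quoted above.

```python
# Figures copied from the job log for JobId 2732.
volume_size_before = 157_019_312_317  # "Ready to append ... size=157019312317"
sd_bytes_written = 13_248_509_521     # "SD Bytes Written"
last_volume_bytes = 170_315_546_746   # "Last Volume Bytes"

# If the job simply appended to the existing volume, the two should add up.
expected = volume_size_before + sd_bytes_written
gap = last_volume_bytes - expected

print(f"expected volume size: {expected:,}")
print(f"reported volume size: {last_volume_bytes:,}")
print(f"difference:           {gap:,} bytes")
```

The sum is 170,267,821,838 bytes, which is within 47,724,908 bytes (well under
0.1 %) of the reported Last Volume Bytes, so the 170.3 GB figure is the size of
the whole volume and not of this job. I have not tried to account for the small
remaining difference.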

Really, I have no better idea than sitting in front of some tools to check live
what is going wrong.
Try raising the PostgreSQL log level (during the backup) to see if you get some
clue from it.
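
One way to watch this live, as a sketch only: by default PostgreSQL writes the
temporary files for sorts and hash joins into a pgsql_tmp directory under the
base/ directory of its data directory, so the size of that directory can be
sampled while the "Fill Path table" query runs. The helper below is
hypothetical and just sums file sizes under whatever directory it is given; the
location of the data directory depends on the distribution, so the path in the
usage note is an assumption. If your PostgreSQL version has the log_temp_files
setting, enabling it should also report temporary file sizes in the server log.

```python
import os
import sys


def dir_size_bytes(root):
    """Sum the sizes of all files under root, skipping files that vanish mid-scan."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                # Temporary files may be deleted between listing and stat.
                pass
    return total


# Only runs when called as: python3 tmpsize.py <directory>
if __name__ == "__main__" and len(sys.argv) == 2:
    print(dir_size_bytes(sys.argv[1]))
```

Run as a user allowed to read the directory, for example
`python3 tmpsize.py /var/lib/pgsql/data/base/pgsql_tmp` (script name and path
are assumptions), a few times near the end of the job. Comparing the numbers
with the free space on that filesystem shows whether the temporary file is what
fills the disk.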

Have a nice weekend in any case.

-- 

     Bruno Friedmann

