While the job is running, keep an eye on the system that hosts
your MySQL database and make sure it isn't filling up a
partition with temporary data. I ran into a similar problem
and had to move MySQL's temporary directory (the tmpdir setting
in /etc/my.cnf) to another location.
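
A minimal sketch of one way to watch a partition while a job runs, using
only the Python standard library. The function names and the 90%
threshold are illustrative assumptions, not anything from Bacula or
MySQL; point it at whatever directory your database uses for temp data.

```python
import shutil
import tempfile


def partition_usage_percent(path):
    """Return the percentage of the partition holding `path` that is in use."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return 100.0 * usage.used / usage.total


def is_nearly_full(path, threshold_percent=90.0):
    """True when the partition holding `path` is at or above the threshold.

    The 90% default is an arbitrary illustrative value.
    """
    return partition_usage_percent(path) >= threshold_percent


if __name__ == "__main__":
    # Assumption: the system temp directory stands in for the database's
    # temp location; substitute the real path on your server.
    watched = tempfile.gettempdir()
    pct = partition_usage_percent(watched)
    state = "NEARLY FULL" if is_nearly_full(watched) else "ok"
    print(f"{watched}: {pct:.1f}% used ({state})")
```

Running this from cron or a shell loop every few minutes during the
backup should show whether the partition fills up in the gap between
the end of the data transfer and the failure.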
-John
On Wed, Aug 12, 2009 at 05:00:30PM +0100, Nick Lock wrote:
> Hello list!
>
> Sorry to trouble you with what's probably a simple problem, but I'm now
> looking at the very real possibility of wiping all our backups clean and
> starting from scratch if I can't fix it... :(
>
> I'm having problems with some Full backups, which run for between 1 and
> 2 hours, appearing to "time out" after the data transfer from the FD to
> the SD. The error message (shown below) shows that the data transfer
> completes, often in about 1hr30min, and then Bacula does nothing until
> the job has been running for 2 hours at which point it gives an FD
> error.
>
> Other Full backups (which don't take as long) run correctly, and for
> most of the time Inc and Diff backups also run correctly. However, a
> small % of backups will fail at random, also with FD errors but at
> random times-elapsed during the job... this I have been ascribing to
> network fluctuations! The difference is that re-running these random
> failures will succeed, whilst this particular Full failure doesn't! ;)
>
> I've already tried setting a heartbeat interval of 20 minutes in the
> FD/SD and DIR conf files (thinking that the FD -> Dir connection was
> timing out) but this doesn't change anything.
>
> In the time between the data transfer finishing and the timeout,
> Postgres has an open connection with a "COPY batch FROM STDIN"
> transaction in progress, which at the timeout produces errors in the
> Postgres log that I have also shown below.
>
> I'm happy to post portions of the conf files if needed, but they're huge
> and might well lead to tl;dr!
>
> Any suggestions as to how I can troubleshoot this further would be most
> appreciated!
>
> Nick Lock.
>
>
> ---------------------------------------------------------------------
> 12-Aug 14:18 exa-bacula-dir JobId 5514: Start Backup JobId 5514,
> Job=backup_scavenger.2009-08-12_14.18.06.04
> 12-Aug 14:18 exa-bacula-dir JobId 5514: There are no more Jobs
> associated with Volume "scavenger-full-1250". Marking it purged.
> 12-Aug 14:18 exa-bacula-dir JobId 5514: All records pruned from Volume
> "scavenger-full-1250"; marking it "Purged"
> 12-Aug 14:18 exa-bacula-dir JobId 5514: Recycled volume
> "scavenger-full-1250"
> 12-Aug 14:18 exa-bacula-dir JobId 5514: Using Device
> "FileStorageScavenger"
> 12-Aug 14:18 exa-bacula-sd JobId 5514: Recycled volume
> "scavenger-full-1250" on device
> "FileStorageScavenger" (/srv/bacula/volume/web-scavenger), all previous
> data lost.
> 12-Aug 14:18 exa-bacula-dir JobId 5514: Max Volume jobs exceeded.
> Marking Volume "scavenger-full-1250" as Used.
> 12-Aug 15:49 exa-bacula-sd JobId 5514: Job write elapsed time =
> 01:31:41, Transfer rate = 401.4 K bytes/second
> 12-Aug 16:18 exa-bacula-dir JobId 5514: Fatal error: Network error with
> FD during Backup: ERR=Connection reset by peer
> 12-Aug 16:18 exa-bacula-dir JobId 5514: Fatal error: No Job status
> returned from FD.
> 12-Aug 16:18 exa-bacula-dir JobId 5514: Error: Bacula exa-bacula-dir
> 2.4.4 (28Dec08): 12-Aug-2009 16:18:09
> Build OS: x86_64-pc-linux-gnu debian lenny/sid
> JobId: 5514
> Job: backup_scavenger.2009-08-12_14.18.06.04
> Backup Level: Full
> Client: "scavenger" 2.4.4 (28Dec08)
> i486-pc-linux-gnu,debian,5.0
> FileSet: "full-scavenger" 2009-04-16 15:58:05
> Pool: "scavenger-full" (From Job FullPool override)
> Storage: "FileScavenger" (From Job resource)
> Scheduled time: 12-Aug-2009 14:18:03
> Start time: 12-Aug-2009 14:18:09
> End time: 12-Aug-2009 16:18:09
> Elapsed time: 2 hours
> Priority: 10
> FD Files Written: 0
> SD Files Written: 81,883
> FD Bytes Written: 0 (0 B)
> SD Bytes Written: 2,208,578,175 (2.208 GB)
> Rate: 0.0 KB/s
> Software Compression: None
> VSS: no
> Storage Encryption: no
> Volume name(s): scavenger-full-1250
> Volume Session Id: 5
> Volume Session Time: 1250080970
> Last Volume Bytes: 2,212,857,316 (2.212 GB)
> Non-fatal FD errors: 0
> SD Errors: 0
> FD termination status: Error
> SD termination status: OK
> Termination: *** Backup Error ***
>
> ---------------------------------------------------------------------
> Postgres Log:
>
> 2009-08-12 16:18:09 BST ERROR: unexpected message type 0x58 during COPY
> from stdin
> 2009-08-12 16:18:09 BST CONTEXT: COPY batch, line 81884: ""
> 2009-08-12 16:18:09 BST STATEMENT: COPY batch FROM STDIN
> 2009-08-12 16:18:09 BST LOG: could not send data to client: Broken pipe
> 2009-08-12 16:18:09 BST LOG: could not receive data from client:
> Connection reset by peer
> 2009-08-12 16:18:09 BST LOG: unexpected EOF on client connection
>
>
>
> ------------------------------------------------------------------------------
> Let Crystal Reports handle the reporting - Free Crystal Reports 2008 30-Day
> trial. Simplify your report design, integration and deployment - and focus on
> what you do best, core application coding. Discover what's new with
> Crystal Reports now. http://p.sf.net/sfu/bobj-july
> _______________________________________________
> Bacula-users mailing list
> Bacula-users AT lists.sourceforge DOT net
> https://lists.sourceforge.net/lists/listinfo/bacula-users
>
>
--
"Without friction there's no heat, without heat there can't
be fire, without fire there's no desire, you're making me
hot-too-hot-too-hot-hot-too-hot-too-hot-OWWwwwww!" - Oingo Boingo
-------------------------------------------------------------------
John M. Lockard | U of Michigan - School of Information
Unix and Security Admin | 1214 SI North - 1075 Beal Ave.
jlockard AT umich DOT edu | Ann Arbor, MI 48109-2112
www.umich.edu/~jlockard | 734-615-8776 | 734-647-8045 FAX
-------------------------------------------------------------------