[Bacula-users] Locating a network connection failure

2009-10-01 10:55:59
Subject: [Bacula-users] Locating a network connection failure
From: baculalist AT encambio DOT com
To: bacula-users AT lists.sourceforge DOT net
Date: Thu, 1 Oct 2009 14:23:47 +0200
Hello list,

Problem 1 (See director log below):
Network error 'Connection reset by peer'.

Solution 1 (as documented):
Set the 'Heartbeat Interval' configuration parameter in the
director, storage daemon, and file daemon config files.
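
For reference, a minimal sketch of what "everywhere" looks like in
practice. The resource names are taken from the log below; the 60-second
value is only an assumption for illustration, and all other directives
in each resource are left out:

```
# bacula-dir.conf (Director resource)
Director {
  Name = host1-dir
  # ... existing directives unchanged ...
  Heartbeat Interval = 60   # assumed value, in seconds
}

# bacula-sd.conf (Storage resource)
Storage {
  Name = host1-sd
  # ... existing directives unchanged ...
  Heartbeat Interval = 60   # assumed value, in seconds
}

# bacula-fd.conf (FileDaemon resource)
FileDaemon {
  Name = host2-fd
  # ... existing directives unchanged ...
  Heartbeat Interval = 60   # assumed value, in seconds
}
```

The Client and Storage resources of bacula-dir.conf also accept the
directive, which is part of why the placement is confusing.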

Problem 2:
Setting keepalives (Heartbeat Interval) everywhere is not a good
solution, and it is hard to tell exactly which block of which
config actually needs them.

Question 2:
Please advise in which block of which config 'Heartbeat Interval'
is really required to solve the network failure. Thanks!

My guess 2:
I'm guessing that the director->filedaemon connection is breaking
off, causing the network failure. If that's true, I assume that
setting 'Heartbeat Interval' only in the 'Director' block of
bacula-dir.conf will solve the problem. Can I really remove
all mention of 'Heartbeat Interval' in the other configs,
including the 'Storage' block of bacula-dir.conf?

Unfortunately, I can't test my guess because of traffic limits.

bacula-dir.log:
  30-Sep 20:06 host1-sd JobId 8: Wrote label to prelabeled Volume "bacvol-01" on device "FileStorage" (/backups/bacula)
  30-Sep 22:06 host1-dir JobId 8: Fatal error: Network error with FD during Backup: ERR=Connection reset by peer
  30-Sep 22:06 host1-dir JobId 8: Fatal error: No Job status returned from FD.
  30-Sep 22:06 host1-dir JobId 8: Error: Bacula host1-dir 3.0.2 (18Jul09):
  30-Sep-2009 22:06:31
    Build OS:               x86_64-unknown-linux-gnu ubuntu 8.04
    JobId:                  8
    Job:                    Host2.2009-09-30_20.06.14_03
    Backup Level:           Full (upgraded from Incremental)
    Client:                 "host2-fd" 3.0.2 (18Jul09) i386-pc-solaris2.11,solaris,5.11
    FileSet:                "ProductionSet" 2009-09-29 17:31:52
    Pool:                   "LongtermPool" (From Job resource)
    Catalog:                "MyCatalog" (From Client resource)
    Storage:                "TunnelStore" (From Job resource)
    Scheduled time:         30-Sep-2009 20:06:12
    Start time:             30-Sep-2009 20:06:24
    End time:               30-Sep-2009 22:06:31
    Elapsed time:           2 hours 7 secs
    Priority:               50
    FD Files Written:       0
    SD Files Written:       0
    FD Bytes Written:       0 (0 B)
    SD Bytes Written:       0 (0 B)
    Rate:                   0.0 KB/s
    Software Compression:   None
    VSS:                    no
    Encryption:             no
    Accurate:               no
    Volume name(s):         bacvol-01
    Volume Session Id:      1
    Volume Session Time:    1254333910
    Last Volume Bytes:      4,999,679,774 (4.999 GB)
    Non-fatal FD errors:    0
    SD Errors:              0
    FD termination status:  Error
    SD termination status:  Running
    Termination:            *** Backup Error ***

Regards,
Eduard
