Amanda-Users

Re: Backup issues with OpenBSD 4.5 machines

2009-08-21 12:39:06
Subject: Re: Backup issues with OpenBSD 4.5 machines
From: John Hein <jhein AT timing DOT com>
To: stan <stanb AT panix DOT com>
Date: Fri, 21 Aug 2009 09:57:36 -0600
stan wrote at 10:56 -0400 on Aug 21, 2009:
 > OK here is the latest on this saga :-)
 > 
 > On one of the OpenBSD 4.5 machines I have built 2.5.0p1, and was able to
 > back this machine up successfully (using classic UDP based authentication)
 > 
 > On another of them, I built 2.5.2p1. The first attempt to back this machine
 > up failed. I checked the log files, and found they were having issues
 > because /etc/amandates was missing. I corrected that, and started a 2nd
 > backup run. (Remember amcheck reports all is well with this machine). I 
 > got the following from amstatus when I attempted to back up this machine.
 > Also remember, one of the tests I ran with a 2.6.1 client was to connect a
 > test machine directly to the client, using a crossover cable to eliminate
 > any firewall, or router type issues.
 > 
 > I am attaching, what I think is, the amandad debug file associated with this
 > failure.
 > 
 > Can anyone suggest what I can do to further troubleshoot this?
 > 
 > pb48:wd0f                     1  dumper: [could not connect DATA stream:
 > can't connect stream to pb48.meadwestvaco.com port 11996: Connection
 > refused] (10:37:27)
 > 
   .
   .
   .
 > amandad: time 30.019: stream_accept: timeout after 30 seconds
 > amandad: time 30.019: security_stream_seterr(0x86b67000, can't accept new 
 > stream connection: No such file or directory)
 > amandad: time 30.019: stream 0 accept failed: unknown protocol error
 > amandad: time 30.019: security_stream_close(0x86b67000)
 > amandad: time 60.027: stream_accept: timeout after 30 seconds
 > amandad: time 60.027: security_stream_seterr(0x81212000, can't accept new 
 > stream connection: No such file or directory)
 > amandad: time 60.027: stream 1 accept failed: unknown protocol error
 > amandad: time 60.027: security_stream_close(0x81212000)
 > amandad: time 90.035: stream_accept: timeout after 30 seconds
 > amandad: time 90.036: security_stream_seterr(0x84877000, can't accept new 
 > stream connection: No such file or directory)
 > amandad: time 90.036: stream 2 accept failed: unknown protocol error
 > amandad: time 90.036: security_stream_close(0x84877000)
 > amandad: time 90.036: security_close(handle=0x81bbf800, driver=0x298a9240 
 > (BSD))
 > amandad: time 120.044: pid 17702 finish time Fri Aug 21 10:39:27 2009

For some reason the listening socket is never getting marked ready for
read, so select(2) times out waiting for an incoming connection.
Firewall setup perhaps?
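
If it helps to see that readiness check in isolation, here is a minimal
sketch. It is not Amanda's code: make_listener(), connect_loopback() and
wait_for_connection() are hypothetical helpers made up for this example,
and it only assumes a POSIX sockets environment. A listening socket is
reported readable by select() once a connection is pending; if nobody
connects before the timeout, select() returns 0.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: listen on 127.0.0.1, port chosen by the kernel. */
static int make_listener(unsigned short *port_out)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int s = socket(AF_INET, SOCK_STREAM, 0);

    if (s < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                      /* 0 = any free port */
    if (bind(s, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(s, 1) < 0 ||
        getsockname(s, (struct sockaddr *)&addr, &len) < 0) {
        close(s);
        return -1;
    }
    *port_out = ntohs(addr.sin_port);
    return s;
}

/* Hypothetical helper: connect a client socket to 127.0.0.1:port. */
static int connect_loopback(unsigned short port)
{
    struct sockaddr_in addr;
    int c = socket(AF_INET, SOCK_STREAM, 0);

    if (c < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (connect(c, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        close(c);
        return -1;
    }
    return c;
}

/*
 * Returns 1 if a connection is pending on server_socket,
 * 0 if the timeout expired first, -1 if select() itself failed.
 */
static int wait_for_connection(int server_socket, int timeout_sec)
{
    fd_set readset;
    struct timeval tv;
    int nfound;

    assert(server_socket >= 0);
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;
    FD_ZERO(&readset);
    FD_SET(server_socket, &readset);
    nfound = select(server_socket + 1, &readset, NULL, NULL, &tv);
    if (nfound < 0)
        return -1;
    if (nfound == 0)
        return 0;                           /* nobody connected in time */
    return FD_ISSET(server_socket, &readset) ? 1 : 0;
}
```

Calling wait_for_connection() on a fresh listener with no client should
give the timeout case (0), which is the path your log shows. Calling it
again after connect_loopback() succeeds should give 1. If a packet filter
between the two hosts drops or rejects the connection attempt, the
listening side never sees it and stays on the timeout path.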

This bit of code in 2.5.2p1's common-src/stream.c is where
the failure is happening for you...

int
stream_accept(
    int server_socket,
    int timeout,
    size_t sendsize,
    size_t recvsize)
{
    SELECT_ARG_TYPE readset;
    struct timeval tv;
    int nfound, connected_socket;
    int save_errno;
    int ntries = 0;
    in_port_t port;

    assert(server_socket >= 0);

    do {
        ntries++;
        memset(&tv, 0, SIZEOF(tv));
        tv.tv_sec = timeout;
        memset(&readset, 0, SIZEOF(readset));
        FD_ZERO(&readset);
        FD_SET(server_socket, &readset);
        nfound = select(server_socket+1, &readset, NULL, NULL, &tv);
        if(nfound <= 0 || !FD_ISSET(server_socket, &readset)) {
            save_errno = errno;
            if(nfound < 0) {
                dbprintf(("%s: stream_accept: select() failed: %s\n",
                      debug_prefix_time(NULL),
                      strerror(save_errno)));
            } else if(nfound == 0) {
                dbprintf(("%s: stream_accept: timeout after %d second%s\n",
                      debug_prefix_time(NULL),
                      timeout,
                      (timeout == 1) ? "" : "s"));
                errno = ENOENT;                 /* ??? */
                return -1;