On Fri, Aug 21, 2009 at 09:57:36AM -0600, John Hein wrote:
> stan wrote at 10:56 -0400 on Aug 21, 2009:
> > OK here is the latest on this saga :-)
> >
> > On one of the OpenBSD 4.5 machines I have built 2.5.0p1, and was able to
> > back this machine up successfully (using classic UDP based authentication)
> >
> > On another of them, I built 2.5.2p1. The first attempt to back this machine
> > up failed. I checked the log files, and found they were having issues
> > because /etc/amdates was missing. I corrected that, and started a 2nd
> > backup run. (Remember amcheck reports all is well with this machine). I
> > got the following from amstatus when I attempted to back up this machine.
> > Also remember, one of the tests I ran with a 2.6.1 client was to connect a
> > test machine directly to the client, using a crossover cable to eliminate
> > any firewall, or router type issues.
> >
> > I am attaching what I think is the amandad debug file associated with this
> > failure.
> >
> > Can anyone suggest what I can do to further troubleshoot this?
> >
> > pb48:wd0f 1 dumper: [could not connect DATA stream:
> > can't connect stream to pb48.meadwestvaco.com port 11996: Connection
> > refused] (10:37:27)
> >
> .
> .
> .
> > amandad: time 30.019: stream_accept: timeout after 30 seconds
> > amandad: time 30.019: security_stream_seterr(0x86b67000, can't accept new stream connection: No such file or directory)
> > amandad: time 30.019: stream 0 accept failed: unknown protocol error
> > amandad: time 30.019: security_stream_close(0x86b67000)
> > amandad: time 60.027: stream_accept: timeout after 30 seconds
> > amandad: time 60.027: security_stream_seterr(0x81212000, can't accept new stream connection: No such file or directory)
> > amandad: time 60.027: stream 1 accept failed: unknown protocol error
> > amandad: time 60.027: security_stream_close(0x81212000)
> > amandad: time 90.035: stream_accept: timeout after 30 seconds
> > amandad: time 90.036: security_stream_seterr(0x84877000, can't accept new stream connection: No such file or directory)
> > amandad: time 90.036: stream 2 accept failed: unknown protocol error
> > amandad: time 90.036: security_stream_close(0x84877000)
> > amandad: time 90.036: security_close(handle=0x81bbf800, driver=0x298a9240 (BSD))
> > amandad: time 120.044: pid 17702 finish time Fri Aug 21 10:39:27 2009
>
> For some reason the socket is not getting marked ready for read.
> select(2) is timing out waiting. Firewall setup perhaps?
>
> This bit of code in 2.5.2p1's common-src/stream.c is where
> the failure is happening for you...
>
> int
> stream_accept(
>     int server_socket,
>     int timeout,
>     size_t sendsize,
>     size_t recvsize)
> {
>     SELECT_ARG_TYPE readset;
>     struct timeval tv;
>     int nfound, connected_socket;
>     int save_errno;
>     int ntries = 0;
>     in_port_t port;
>
>     assert(server_socket >= 0);
>
>     do {
>         ntries++;
>         memset(&tv, 0, SIZEOF(tv));
>         tv.tv_sec = timeout;
>         memset(&readset, 0, SIZEOF(readset));
>         FD_ZERO(&readset);
>         FD_SET(server_socket, &readset);
>         nfound = select(server_socket+1, &readset, NULL, NULL, &tv);
>         if(nfound <= 0 || !FD_ISSET(server_socket, &readset)) {
>             save_errno = errno;
>             if(nfound < 0) {
>                 dbprintf(("%s: stream_accept: select() failed: %s\n",
>                           debug_prefix_time(NULL),
>                           strerror(save_errno)));
>             } else if(nfound == 0) {
>                 dbprintf(("%s: stream_accept: timeout after %d second%s\n",
>                           debug_prefix_time(NULL),
>                           timeout,
>                           (timeout == 1) ? "" : "s"));
>                 errno = ENOENT; /* ??? */
>                 return -1;
The first thing I notice when comparing this function in 2.5.0 vs 2.5.2 is
that 2.5.0 does:
    tv.tv_usec = 0;
and 2.5.2 does not. Could this make a difference? Both do:
    tv.tv_sec = timeout;
--
One of the main causes of the fall of the roman empire was that, lacking
zero, they had no way to indicate successful termination of their C
programs.