Amanda-Users

Re: Backup issues with OpenBSD 4.5 machines

2009-09-28 14:43:27
Subject: Re: Backup issues with OpenBSD 4.5 machines
From: "Dustin J. Mitchell" <dustin AT zmanda DOT com>
To: Michael Burk <burkml AT gmail DOT com>
Date: Mon, 28 Sep 2009 13:48:37 -0400
OK, I have a more in-depth summary of exactly what's going on here,
and why the fcntl() calls fix it.  The good news: we've stumbled on a
pretty stable "fix" for this problem.

As background, the Amanda client operates something like this:

 1. amandad is invoked by (x)inetd or some other mechanism
 2. amandad uses the "Amanda protocol" to determine which service the
server is requesting (sendbackup in this case)
 3. amandad forks and executes that service, after setting up a bunch of
pipes for it.

The unusual thing is that, aside from the usual stdin/stdout/stderr
(fd's 0-2), amandad sets up six pipes at hard-wired file descriptors
50-55, and sendbackup uses those to send the data, index, and message
streams back to the server.
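
To make the descriptor setup concrete, here is a minimal sketch in C of
a parent handing a child a pipe at a fixed descriptor number.  It is
not amandad's actual code: DATA_FD, move_fd() and
run_child_with_fixed_fd() are names made up for illustration, only one
of the descriptors in the 50-55 range is shown, and the child writes
directly instead of calling execve().

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define DATA_FD 50  /* assumption: one of the hard-wired descriptors */

/* Move `fd` to the fixed number `target` and close the original. */
static int move_fd(int fd, int target)
{
    if (fd == target)
        return target;
    if (dup2(fd, target) < 0)
        return -1;
    close(fd);
    return target;
}

/*
 * Fork a child that finds its output pipe at DATA_FD, the way a service
 * would after being exec'd.  The child writes `msg`; the parent reads it
 * into `buf`.  Returns the number of bytes read, or -1 on error.
 */
static ssize_t run_child_with_fixed_fd(const char *msg, char *buf, size_t buflen)
{
    int p[2];
    if (pipe(p) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    if (pid == 0) {
        close(p[0]);
        if (move_fd(p[1], DATA_FD) < 0)
            _exit(1);
        /* A real daemon would execve() the service here; the descriptor
         * at DATA_FD survives the exec because dup2() leaves FD_CLOEXEC
         * clear on the new descriptor. */
        if (write(DATA_FD, msg, strlen(msg)) < 0)
            _exit(1);
        _exit(0);
    }

    close(p[1]);
    ssize_t n = read(p[0], buf, buflen);
    close(p[0]);

    int status;
    waitpid(pid, &status, 0);
    return n;
}
```

The point of the fixed numbers is that the service does not need to be
told where its streams are; it simply writes to the agreed descriptor.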

As background on POSIX:

Multiple processes can hold the same file open at the same time.  This
is how, for example, a backgrounded process in the shell can "share"
your terminal with the shell itself.  Each of those processes would
like to access the file either in blocking mode (waiting for data to
be available) or nonblocking mode (immediately returning when no data
is available, to allow the application to work on something else while
waiting).  Unfortunately, POSIX attaches the O_NONBLOCK flag to the open
file *itself* (the open file description), not to each process's
descriptor, so every process sharing the file sees the same setting.
In the case at hand, Amanda is accessing a particular pipe in
nonblocking mode (for reasons explained below), while gzip expects it
to be in blocking mode, and this leads to the EAGAIN that is killing
gzip.
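
The sharing is easy to see in a single process.  The sketch below (the
function name is made up for illustration) sets O_NONBLOCK through one
descriptor and then reads through a dup() of it.  A descriptor inherited
across fork() refers to the same open file in the same way, which is the
situation gzip is in.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Set O_NONBLOCK through one descriptor, then read from the empty pipe
 * through a *different* descriptor for the same open file.
 * Returns the errno from that read, 0 if the read unexpectedly
 * succeeded, or -1 if setup failed or the flag was not shared.
 */
static int nonblock_leaks_across_dup(void)
{
    int p[2];
    int result = -1;

    if (pipe(p) < 0)
        return -1;

    int other = dup(p[0]);          /* second descriptor, same open file */
    if (other >= 0) {
        int flags = fcntl(p[0], F_GETFL);
        if (flags >= 0 && fcntl(p[0], F_SETFL, flags | O_NONBLOCK) == 0) {
            /* The flag was set via p[0], yet it is visible via `other`. */
            int shared = fcntl(other, F_GETFL);
            if (shared >= 0 && (shared & O_NONBLOCK)) {
                char c;
                /* Empty pipe, writer still open: a blocking read would
                 * wait here; a nonblocking one fails immediately. */
                ssize_t n = read(other, &c, 1);
                result = (n < 0) ? errno : 0;
            }
        }
        close(other);
    }

    close(p[0]);
    close(p[1]);
    return result;
}
```

On a POSIX system this should return EAGAIN, even though nothing ever
called fcntl() on the descriptor that did the read.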

As background on the OpenBSD pthreads (or, more accurately, uthreads
-- lib/libpthread/uthread):

This library shims its way between an application and the kernel, and
implements blocking threaded operations on file descriptors using
nonblocking kernel operations and a select() loop.  In order to do so,
it must set O_NONBLOCK on every file it accesses.  This is easily
accomplished by wrapping open(), pipe(), dup(), dup2(), socket(), and
so on -- the syscalls which create new file descriptors.  However,
"inherited" file descriptors -- those opened by the parent before
calling execve() -- are a little bit harder, because the library has
no way to know about them.  The library hides the O_NONBLOCK flag it
sets from the application by masking it out of the fcntl() return value.

The solution that uthreads uses is to start tracking a file the first
time it is referenced in a syscall (in _thread_fd_lock, to be
precise).  It sets O_NONBLOCK when it starts tracking the file, and
then removes the flag at the appropriate time (execve, in particular).
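
As a rough illustration of that bookkeeping, here is a small sketch in
C.  It is not the uthreads source: shim_table, shim_track(),
shim_getfl() and shim_before_exec() are hypothetical names, the table
size is arbitrary, and the real library does considerably more (locking,
a select() loop, and so on).

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

#define SHIM_MAX_FD 256   /* arbitrary size for this sketch */

struct shim_entry {
    int tracked;        /* nonzero once the shim has seen this fd */
    int app_nonblock;   /* did the application itself want O_NONBLOCK? */
};

static struct shim_entry shim_table[SHIM_MAX_FD];

/* Called on the first reference to `fd` in any wrapped syscall. */
static int shim_track(int fd)
{
    if (fd < 0 || fd >= SHIM_MAX_FD)
        return -1;
    if (shim_table[fd].tracked)
        return 0;

    int flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return -1;

    /* Remember what the application asked for, then force nonblocking. */
    shim_table[fd].app_nonblock = (flags & O_NONBLOCK) != 0;
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;
    shim_table[fd].tracked = 1;
    return 0;
}

/* Wrapped fcntl(F_GETFL): hide the shim's own O_NONBLOCK. */
static int shim_getfl(int fd)
{
    if (shim_track(fd) < 0)
        return -1;
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0)
        return -1;
    if (!shim_table[fd].app_nonblock)
        flags &= ~O_NONBLOCK;
    return flags;
}

/* Called just before execve(): restore the flags the application expects. */
static void shim_before_exec(void)
{
    for (int fd = 0; fd < SHIM_MAX_FD; fd++) {
        if (!shim_table[fd].tracked || shim_table[fd].app_nonblock)
            continue;
        int flags = fcntl(fd, F_GETFL);
        if (flags >= 0)
            fcntl(fd, F_SETFL, flags & ~O_NONBLOCK);
    }
}
```

The important property is that a descriptor only gets cleaned up in
shim_before_exec() if shim_track() has already seen it.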

In the failure mode, uthreads finds out about the index file *after*
sendbackup has forked the gzip child, when sendbackup tries to close
its copy of the file descriptor.  Due to the design of the library, it
carefully sets O_NONBLOCK before closing the file descriptor.  Because
gzip has inherited the same open file, it sees the flag too and gets an
EAGAIN error.
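
The ordering problem can be reproduced without any threading library.
In the sketch below the parent plays the part of uthreads touching the
descriptor late, and the child stands in for gzip.  The function name is
made up, and the use of read() in the child is an assumption; this
analysis does not say which call gzip actually fails in.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if the child saw EAGAIN on a descriptor it never touched
 * with fcntl(), 0 if it did not, and -1 on setup failure.
 */
static int child_sees_eagain_after_late_nonblock(void)
{
    int data[2], sync_p[2];

    if (pipe(data) < 0)
        return -1;
    if (pipe(sync_p) < 0) {
        close(data[0]);
        close(data[1]);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(data[0]);
        close(data[1]);
        close(sync_p[0]);
        close(sync_p[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: expects ordinary blocking I/O on data[0]. */
        char c;
        close(sync_p[1]);
        if (read(sync_p[0], &c, 1) != 1)    /* wait for the parent */
            _exit(2);
        ssize_t n = read(data[0], &c, 1);   /* no data written yet */
        _exit((n < 0 && errno == EAGAIN) ? 0 : 1);
    }

    /* Parent: sets O_NONBLOCK only *after* the fork. */
    close(sync_p[0]);
    int flags = fcntl(data[0], F_GETFL);
    int ok = flags >= 0 && fcntl(data[0], F_SETFL, flags | O_NONBLOCK) == 0;
    if (write(sync_p[1], "x", 1) != 1)
        ok = 0;

    int status = 0;
    waitpid(pid, &status, 0);

    close(sync_p[1]);
    close(data[0]);
    close(data[1]);

    if (!ok)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 1 : 0;
}
```

Had the flag been set and cleared again before the fork (or before the
exec), the child's read would simply have blocked as it expects.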

The mysterious fcntl() calls, however, serve as a warning to uthreads
that the index file exists.  Uthreads sets the O_NONBLOCK flag when
performing the fcntl(), but then clears it on execve(), so everything
works as expected.
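
In code, the workaround amounts to something like the following sketch.
The helper name announce_fds() is hypothetical, and the use of F_GETFL
is an assumption: any harmless fcntl() query on the descriptor should
have the same effect of making a wrapping library notice it.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Touch every descriptor in [first, last] with a harmless fcntl() so a
 * library that wraps fcntl() learns about inherited descriptors early.
 * Returns how many of them were actually open.
 */
static int announce_fds(int first, int last)
{
    int open_count = 0;

    for (int fd = first; fd <= last; fd++) {
        /* A failure (EBADF) only means nothing is open at this number. */
        if (fcntl(fd, F_GETFL) >= 0)
            open_count++;
    }
    return open_count;
}
```

A service would call announce_fds(50, 55) early in main(), before
forking any children.  Without a wrapping library, the call only queries
the flags and changes nothing.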

So, there are really two fixes available, until OpenBSD's new
threading library is available:
 1. don't link Amanda client libraries with threading libraries
 2. "inform" uthreads of the high-numbered FD's in all of the service
binaries, using fcntl()

Option 1 would be temporary -- eventually, I would like to be able to
use threads on clients, to support compression and encryption, for
example.  Option 1 is also harder than it sounds -- futzing with the
build process is like playing whack-a-mole, where any change causes
problems on another platform.

From my analysis above, option 2 is fairly robust (as robust as
OpenBSD's pthreads, anyway), and won't cause any trouble on systems
with non-buggy threading libraries.  So I'm leaning that direction.

Dustin

-- 
Open Source Storage Engineer
http://www.zmanda.com
