> /-- watertown. /home/watertown lev 0 STRANGE
> sendbackup: start [watertown.ott.precidia.com:/home/watertown level 0]
> sendbackup: info BACKUP=/usr/bin/tar
> sendbackup: info RECOVER_CMD=/usr/bin/gzip -dc |/usr/bin/tar -f... -
> sendbackup: info COMPRESS_SUFFIX=.gz
> sendbackup: info end
> | Total bytes written: 20480 (20kB, ?B/s)
> ?
> ? gzip: stdout: Cannot send after transport endpoint shutdown
> ??error [compress returned 1]? dumper: strange [missing size line from
> sendbackup]
> ? dumper: strange [missing end line from sendbackup]
> \--------
>
> The same error message is on the client (naturally). So, it looks like gzip
> is still trying to send to the socket when sendbackup is closing the socket.
>
> My guess is that the path was closed after it had been dup'd and so an
> extra close wouldn't break things but a shutdown would. I'm going to try
> to figure out if that is the case.
So... I went in to sendbackup-gnutar.c and #if'd out some lines:
/* close the write ends of the pipes */
aclose(dumpin);
aclose(dumpout);
#if 0 // moved to sendbackup.c
aclose(dataf);
aclose(mesgf);
if (options->createindex)
aclose(indexf);
#endif
Then I went in to sendbackup.c and added the same there:
program->start_backup(g_options->hostname, disk, amdevice, level, dumpdate,
dataf, mesgpipe[1], indexf);
parse_backup_messages(mesgpipe[0]);
#if 1 // moved from sendbackup-*.c
dbprintf(("%s: closing paths
(%d,%d,%d)\n",debug_prefix(NULL),indexf,mesgf,dataf));
shutdown(dataf,2);
aclose(dataf);
shutdown(mesgf,2);
aclose(mesgf);
if (!interactive && options->createindex) {
shutdown(indexf,2);
aclose(indexf);
}
#endif
This way, there is no longer a local "aclose(dataf)" and "aclose(mesgf)"
command while there are external processes still writing to that socket.
Things have improved, but still don't look correct. Here's the message
from the server:
/-- watertown. /home/watertown lev 1 FAILED [data timeout]
sendbackup: start [watertown.ott.precidia.com:/home/watertown level 1]
sendbackup: info BACKUP=/usr/bin/tar
sendbackup: info RECOVER_CMD=/usr/bin/gzip -dc |/usr/bin/tar -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
| Total bytes written: 10240 (10kB, ?B/s)
\--------
Unfortunately, I didn't have tcpdump running to learn more about exactly
what happened. Tonight, I guess. What perplexes me, though, is that
the socket shutdown code never seemed to be run becuase there was nothing
in the sendbackup log with the "closing paths (%d,%d,%d)" message I added:
sendbackup: debug 1 pid 1332 ruid 0 euid 0: start at Wed Apr 2 20:23:45 2003
/usr/local/lib/amanda/sendbackup: version 2.4.3
parsed request as: program `GNUTAR'
disk `/home/watertown'
device `/home/watertown'
level 1
since 2003:4:2:1:20:34
options
`|;bsd-auth;compress-fast;index;exclude-file=^(#|Backup.of)|.(bak|tmp|fla|fst|hex|lib|lst|m
51|wav|mp3|mpg|mpeg|avi|mov|obj|omf|prn|swp)$|~$;'
sendbackup: try_socksize: send buffer size is 65536
sendbackup: time 0.015: stream_server: waiting for connection: 0.0.0.0.597
sendbackup: time 0.016: stream_server: waiting for connection: 0.0.0.0.598
sendbackup: time 0.017: stream_server: waiting for connection: 0.0.0.0.599
sendbackup: time 0.018: waiting for connect on 597, then 598, then 599
sendbackup: got data path #0
sendbackup: time 0.035: stream_accept: connection from 10.0.1.2.3442
sendbackup: time 0.039: stream_accept: connection from 10.0.1.2.3443
sendbackup: got mesg path #7
sendbackup: time 0.043: stream_accept: connection from 10.0.1.2.3444
sendbackup: time 0.043: got all connections
sendbackup: time 0.044: spawning /usr/bin/gzip in pipeline
sendbackup: argument list: /usr/bin/gzip --fast
sendbackup-gnutar: time 0.094: pid 3564: /usr/bin/gzip --fast
sendbackup-gnutar: time 0.100: doing level 1 dump as listed-incremental from
/var/lib/amanda/gnutar-lists/watertown.ott.
precidia.com_home_watertown_0 to
/var/lib/amanda/gnutar-lists/watertown.ott.precidia.com_home_watertown_1.new
sendbackup-gnutar: time 0.114: doing level 1 dump from date: 2003-04-01
1:25:14 GMT
sendbackup: time 0.199: spawning /usr/local/lib/amanda/runtar in pipeline
sendbackup: argument list: gtar --create --file - --directory /home/watertown
--listed-incremental /var/lib/amanda/gnuta
r-lists/watertown.ott.precidia.com_home_watertown_1.new --sparse
--ignore-failed-read --totals --exclude-from /tmp/amand
a/sendbackup._home_watertown.20030402202345.exclude .
sendbackup: time 0.290: started index creator: "/usr/bin/tar -tf - 2>/dev/null
| sed -e 's/^\.//'"
sendbackup-gnutar: time 0.296: /usr/local/lib/amanda/runtar: pid 3604
sendbackup: time 0.693: 53: size(|): Total bytes written: 10240 (10kB, ?B/s)
sendbackup: time 0.826: index created successfully
Why was that message not there? Did the program exit? The only thing between
that log and the preceeding "start_backup" call is a call to the function
"parse_backup_messages" and I can't see how that can exit without adding a
message in the log file.
Brian
( bcwhite AT precidia DOT com )
-------------------------------------------------------------------------------
No man dies except he who has not lived.
|