BackupPC-users

Re: [BackupPC-users] BackupPC fails with aborted by signal=PIPE, Can't write to socket

2017-01-19 23:31:37
From: Les Mikesell <lesmikesell AT gmail DOT com>
To: "General list for user discussion, questions and support" <backuppc-users AT lists.sourceforge DOT net>
Date: Thu, 19 Jan 2017 22:31:03 -0600
On Thu, Jan 19, 2017 at 9:57 PM, John Spitzer <johned9999 AT comcast DOT net> wrote:
>>
>> The most likely suspect is that rsync timeout shown in the log snippet
>> you posted.  But you didn't provide any details about why or how your
>> rsync had timeouts enabled.
>>
> That rsync timeout is being set 'under the hood'. I can't tell from the logs
> I've captured already if it's set by BackupPC or the NAS

How is rsync started at the other end?  Is it a standalone rsync
daemon, with BackupPC using the rsyncd xfer method, or is it the rsync
xfer method running over ssh?  If it is a standalone daemon, there
should be a startup file with the parameters on the other host.  If
it is started via ssh, the options should be visible in the BackupPC
settings (and I don't think the defaults have any timeout set).
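
A quick way to check either place for a timeout is to scan the option list and the daemon config text. This is only a sketch: `find_rsync_timeouts` is a made-up helper name, it only recognizes the `--timeout=SECONDS` command-line form and a `timeout = SECONDS` line in rsyncd.conf, and where those settings actually live on your NAS is something you would have to find out.

```python
import re

def find_rsync_timeouts(rsync_args, rsyncd_conf_text=""):
    """Return (source, seconds) for every timeout setting that is found."""
    found = []
    for arg in rsync_args:
        # Command-line form: --timeout=SECONDS
        m = re.fullmatch(r"--timeout=(\d+)", arg)
        if m:
            found.append(("command line", int(m.group(1))))
    for line in rsyncd_conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # comment line in rsyncd.conf
        # Daemon config form: timeout = SECONDS
        m = re.fullmatch(r"timeout\s*=\s*(\d+)", stripped)
        if m:
            found.append(("rsyncd.conf", int(m.group(1))))
    return found

if __name__ == "__main__":
    # Hypothetical inputs, not taken from the failing setup.
    args = ["--numeric-ids", "--recursive", "--timeout=300"]
    conf = "# sample\n[backup]\n    path = /data\n    timeout = 600\n"
    for source, seconds in find_rsync_timeouts(args, conf):
        print(source, seconds)
```

An empty result from both places would point away from an explicit rsync timeout and toward something else closing the connection.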

> or if the timeout
> is occurring for the writes to the pipe between the client/server processes
> or server and network driver to the NAS. I suspect the latter because of the
> following data extracted from strace.

There aren't any timeouts on pipes.  You get a SIGPIPE when you write
to a pipe or socket after the process on the other end has exited or
the connection has been closed.
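
To see that behaviour in isolation, here is a minimal sketch (Unix only) that writes to a pipe after its read end has been closed. Python ignores SIGPIPE by default, so the failure shows up as the same EPIPE error that appears in the strace output instead of killing the process. The function name is just for illustration.

```python
import errno
import os

def write_after_reader_closed():
    """Write to a pipe whose read end is gone; return the resulting errno."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # the "other end" goes away
    try:
        os.write(write_fd, b"data")
    except BrokenPipeError as exc:
        # No timeout is involved: the write fails right away with EPIPE.
        return exc.errno
    finally:
        os.close(write_fd)
    return None

if __name__ == "__main__":
    print(write_after_reader_closed() == errno.EPIPE)
```

The write fails immediately once the reader is gone, however long the pipe was idle before that.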

> This data was taken for a run that
> failed. strace was attached to the two BackupPC_dump processes that were
> running. You can see that the reads were occurring from the NAS share
> (/media/Backup/BackupPC/pc/...) and a write process
> [write(9<socket:[258436]>,] to a socket received the EPIPE signal. There's
> also an alarm(72000) that may be significant.

BackupPC does have its own alarm timer, but you'd get a SIGALRM from
that, not a SIGPIPE.
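
For comparison, this sketch arms a timer and records which signal arrives when it expires. It uses `setitimer` with a short delay rather than `alarm(72000)` so it finishes quickly; both deliver SIGALRM. It has to run in the main thread on a Unix system, and the names are made up for the example.

```python
import signal
import time

received = []

def on_alarm(signum, frame):
    # Record the name of whatever signal the timer delivered.
    received.append(signal.Signals(signum).name)

def run_alarm_demo(delay=0.05):
    """Arm a real-time timer and return the names of the signals it raised."""
    received.clear()
    previous = signal.signal(signal.SIGALRM, on_alarm)
    try:
        signal.setitimer(signal.ITIMER_REAL, delay)
        deadline = time.monotonic() + 2.0
        while not received and time.monotonic() < deadline:
            time.sleep(0.01)
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # disarm the timer
        signal.signal(signal.SIGALRM, previous)  # restore the old handler
    return list(received)

if __name__ == "__main__":
    print(run_alarm_demo())
```

An expired alarm timer would therefore show up in the trace as SIGALRM, which is not what the failing write shows.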

> 5677    12:37:51.243532
> read(6</media/Backup/BackupPC/pc/johned-linux-vbox/5/fDrive1/f.virtualbox/fMachines/fWin
> 10 Pro x64 VM07/fWin 10 Pro x64 VM07-disk1.vdi>,  <unfinished ...>
> 5671    12:37:51.244472    alarm(72000)      = 72000
> 5671    12:37:51.244625    select(16, [7<socket:[259570]>],
> [9<socket:[258436]>], [7<socket:[259570]> 9<socket:[258436]>], NULL) = 1
> (out [9])
> 5671    12:37:51.244804    write(9<socket:[258436]>,
> "\330\24\352\332U\367\0214?\213=tP
> \v6\271TV\31z\342X\271\220/\326\"x\355\345\37"..., 35328) = -1 EPIPE (Broken
> pipe)
> 5671    12:37:51.244912    --- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER,
> si_pid=5671, si_uid=122} ---
> 5671    12:37:51.244946    rt_sigreturn()    = -1 EPIPE (Broken pipe)
> 5
> ------
> I've attached a file with more of the trace. I have a much longer file with
> the last 10 minutes of the run before the failure. The rsync process always
> fails two or three seconds longer than 300 seconds. I was hoping that
> someone with knowledge of the internals of BackupPC could chime in and shed
> some light on what's happening and why.

I don't think it is BackupPC - other than being slow and inefficient
for what you want it to do.  That is, I can understand it not having
activity for 300+ seconds while it is working with your huge file, but
that alone should not be a problem.  I think the problem is in the
options set for rsync at the other end that make it exit.
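
If you want to confirm how long the socket sat idle before the failed write, something like the following could be run over the full trace. It is a rough sketch that assumes the `pid  HH:MM:SS.ffffff  syscall(...)` layout seen in the quoted strace lines, ignores traces that cross midnight, and uses a hypothetical function name.

```python
import re
import sys
from datetime import datetime, timedelta

# pid, timestamp, syscall name, as in: "5671    12:37:51.244804    write(9<...>"
LINE_RE = re.compile(r"^(\d+)\s+(\d{2}:\d{2}:\d{2}\.\d+)\s+(\w+)\(")

def longest_gap(trace_lines, pid, syscall="write"):
    """Return the longest interval between consecutive `syscall` calls by `pid`."""
    previous = None
    longest = timedelta(0)
    for line in trace_lines:
        m = LINE_RE.match(line)
        if not m or m.group(1) != pid or m.group(3) != syscall:
            continue  # other process, other syscall, or a continuation line
        stamp = datetime.strptime(m.group(2), "%H:%M:%S.%f")
        if previous is not None and stamp - previous > longest:
            longest = stamp - previous
        previous = stamp
    return longest

if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python gap.py TRACE_FILE PID
    with open(sys.argv[1]) as handle:
        print(longest_gap(handle, sys.argv[2]))
```

A gap of a little over 300 seconds right before the EPIPE would fit the idea that the other end gave up waiting and closed the connection.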

-- 
    Les Mikesell
     lesmikesell AT gmail DOT com
