On Thursday 03 Dec 2009 17:25:31 Kameleon wrote:
> Update:
>
> We moved the backuppc server to the same room as the client server and on
> the same switch. The backup still failed, but got a lot further. The error
> is the same as last time:
>
> Read EOF:
> Tried again: got 0 bytes
> Child is aborting
> Can't write 33792 bytes to socket
> Sending csums, cnt = 250243, phase = 1
> Done: 26416 files, 31698685174 bytes
> Got fatal error during xfer (aborted by signal=PIPE)
> Backup aborted by user signal
> Saving this as a partial backup
>
>
> So, as the backuppc user, I manually ran the exact command that the log
> file shows BackupPC using. Then I straced the PID on the client machine
> and got this:
>
> select(1, [0], [], NULL, {45, 57000}) = 0 (Timeout)
> select(1, [0], [], NULL, {60, 0}) = 0 (Timeout)
> select(1, [0], [], NULL, {60, 0}) = 0 (Timeout)
> select(1, [0], [], NULL, {60, 0}) = 0 (Timeout)
> select(1, [0], [], NULL, {60, 0}) = 0 (Timeout)
> select(1, [0], [], NULL, {60, 0}) = 0 (Timeout)
> select(1, [0], [], NULL, {60, 0}) = 0 (Timeout)
> select(1, [0], [], NULL, {60, 0}) = 0 (Timeout)
> select(1, [0], [], NULL, {60, 0}) = 0 (Timeout)
> select(1, [0], [], NULL, {60, 0}) = 1 (in [0], left {29, 170000})
> read(0, "", 4) = 0
> write(2, "rsync: connection unexpectedly c"..., 72) = -1 EPIPE (Broken pipe)
> --- SIGPIPE (Broken pipe) @ 0 (0) ---
> rt_sigaction(SIGUSR1, {0x1, [], 0}, NULL, 8) = 0
> rt_sigaction(SIGUSR2, {0x1, [], 0}, NULL, 8) = 0
> write(2, "rsync error: errors with program"..., 83) = -1 EPIPE (Broken pipe)
> --- SIGPIPE (Broken pipe) @ 0 (0) ---
> rt_sigaction(SIGUSR1, {0x1, [], 0}, NULL, 8) = 0
> rt_sigaction(SIGUSR2, {0x1, [], 0}, NULL, 8) = 0
> gettimeofday({1259860374, 861962}, NULL) = 0
> select(0, NULL, NULL, NULL, {0, 100000}) = 0 (Timeout)
> gettimeofday({1259860374, 961981}, NULL) = 0
> exit_group(13) = ?
> Process 13532 detached
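
The select() lines above show how long the client-side rsync sat blocked on stdin (fd 0) before it saw EOF. As a rough sketch in Python, the waits can be added up as follows. The function name idle_seconds and the regular expressions are my own and are written only for this strace output format; the total is a lower bound, since the excerpt starts wherever strace was attached.

```python
import re

# Only select() calls waiting on fd 0; the final select(0, NULL, ...) is a short sleep.
SELECT_RE = re.compile(
    r"select\(1, \[0\], \[\], NULL, \{(\d+), (\d+)\}\)\s*=\s*\d+\s*\((.*)\)"
)
LEFT_RE = re.compile(r"left \{(\d+), (\d+)\}")


def idle_seconds(lines):
    """Sum the time spent blocked in select() waiting for stdin."""
    total = 0.0
    for line in lines:
        match = SELECT_RE.search(line)
        if match is None:
            continue
        timeout = int(match.group(1)) + int(match.group(2)) / 1_000_000
        left = LEFT_RE.search(match.group(3))
        if left is None:
            # "(Timeout)": the whole timeout elapsed with nothing to read.
            total += timeout
        else:
            # Something became readable early; subtract the time left over.
            remaining = int(left.group(1)) + int(left.group(2)) / 1_000_000
            total += timeout - remaining
    return total


# The ten select() lines from the trace quoted above.
trace = (
    ["select(1, [0], [], NULL, {45, 57000}) = 0 (Timeout)"]
    + ["select(1, [0], [], NULL, {60, 0}) = 0 (Timeout)"] * 8
    + ["select(1, [0], [], NULL, {60, 0}) = 1 (in [0], left {29, 170000})"]
)
print(f"idle for about {idle_seconds(trace):.0f} seconds")
```

On this trace that comes to roughly 556 seconds, a little over nine minutes with nothing arriving on stdin before read() returned 0.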
>
> It still appears the problem is on the remote server since it is exiting.
> The client server is a Dell PowerEdge, so I would hope it isn't hardware
> related. Anything else I can check before I give it a swift kick in the
> pants?
>
> Donny B.
>
> On Wed, Dec 2, 2009 at 3:56 PM, Kameleon <kameleon25 AT gmail DOT com> wrote:
> > I did some more testing, watching top on both the backuppc server and
> > the client. Both had plenty of free memory. We now suspect the cable or
> > switch between the two. Tomorrow we plan to relocate the backuppc server
> > from its current location and plug it directly into the client server
> > with a crossover cable. That will eliminate all of the networking gear
> > except the network interface cards.
> >
> > Thanks for the input.
> >
> > On Wed, Dec 2, 2009 at 3:10 PM, Les Mikesell <lesmikesell AT gmail DOT com> wrote:
> >> Kameleon wrote:
> >> > I do apologize. The backuppc server is Ubuntu 9.10 and the server
> >> > being backed up is CentOS 5.4. I have changed everything back to rsync
> >> > and tried a manual full backup (since that is what it was attempting
> >> > to do when it failed). I ran strace on the PID of rsync on the remote
> >> > server being backed up. The last few lines of the output are below.
> >>
> >> [...]
> >>
> >> > read(0, "\256\374O\362\350\224\30\3101(Y\"8\3279z\300nt\10*\367\26+\355\364\245W)/\224\301"..., 8184) = 8184
> >> > select(1, [0], [], NULL, {60, 0}) = 1 (in [0], left {60, 0})
> >> > read(0, "\314\326\4\242P\345\3\332\245b\317\363\4\253'\307\3056Y\307X\313\364I\5\3746\fH\340\212w"..., 8184) = 1056
> >> > select(1, [0], [], NULL, {60, 0}) = 1 (in [0], left {58, 296000})
> >> > read(0, "", 8184) = 0
> >> > select(2, NULL, [1], [1], {60, 0}) = 1 (out [1], left {60, 0})
> >> > write(1, "O\0\0\10rsync: connection unexpected"..., 83) = -1 EPIPE (Broken pipe)
> >>
> >> Looks like something is wrong on the target side, dropping the
> >> connection. File system problems? Out of memory? Are both machines on
> >> the same LAN or could there be a problem with networking equipment
> >> between them?
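
For anyone reading the trace: read() returning 0 is EOF, meaning whatever held the other end of that descriptor closed it, and the EPIPE on the following write() is the same fact seen from the writing side. A minimal Python sketch of both behaviours, using a local pipe rather than rsync or ssh (the function names are made up for illustration):

```python
import errno
import os


def read_after_peer_close():
    """Read from a pipe whose write end is already closed."""
    read_fd, write_fd = os.pipe()
    os.close(write_fd)            # the "peer" goes away
    data = os.read(read_fd, 4)    # comparable to: read(0, "", 4) = 0
    os.close(read_fd)
    return data


def write_after_peer_close():
    """Write to a pipe whose read end is already closed."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)             # nobody is left to read
    try:
        # CPython ignores SIGPIPE by default, so the failure surfaces as
        # BrokenPipeError (errno EPIPE) instead of killing the process.
        os.write(write_fd, b"rsync: connection unexpectedly closed\n")
    except BrokenPipeError as exc:
        return exc.errno
    finally:
        os.close(write_fd)
    return None


print(read_after_peer_close() == b"")
print(write_after_peer_close() == errno.EPIPE)
```

Neither result says why the peer went away, only that it did, which is why the trace alone cannot separate a dying process from a dropped connection.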
> >>
> >> > Backuppc shows the following error when it fails:
> >> >
> >> > 2009-12-02 14:17:58 full backup started for directory /; updating partial #4
> >> > 2009-12-02 14:24:59 Aborting backup up after signal PIPE
> >> > 2009-12-02 14:25:00 Got fatal error during xfer (aborted by signal=PIPE)
> >>
> >> This doesn't tell you anything except that the other end died.
> >>
> >> > remote machine: rsync version 3.0.6 protocol version 30
> >> > backuppc: rsync version 3.0.6 protocol version 30
> >>
> >> Backuppc doesn't use the rsync binary on the server side - it has its
> >> own implementation in perl. But it looks like things started OK and
> >> then either the remote side quit or the network connection had a
> >> problem.
> >>
> >>
> >> --
> >> Les Mikesell
> >>    lesmikesell AT gmail DOT com
> >>
> >>
> >> _______________________________________________
> >> BackupPC-users mailing list
> >> BackupPC-users AT lists.sourceforge DOT net
> >> List:    https://lists.sourceforge.net/lists/listinfo/backuppc-users
> >> Wiki:    http://backuppc.wiki.sourceforge.net
> >> Project: http://backuppc.sourceforge.net/
>