I’ve been working on an issue for several weeks now,
and it has support stumped (for the time being).
Master: NBU 6.5.6 on Windows 2003 SP2
Clients: NBU 6.5.6 on Linux
Backups were working fine for months, then we started
getting the occasional error 233. Then one weekend we started getting boatloads
of them. Some backups succeed on the second attempt; others fail all
weekend long. The failures occur at random intervals. The details of the
activity monitor show connection reset by peer. The error only occurs on full
backups, not incrementals.
bpbkar log on the master/media:
17:11:16.868 [14342] <4> bpbkar PrintFile: /boot/
17:11:16.868 [14342] <2> bpbkar SelectFile: INF - cwd = /boot
17:11:16.868 [14342] <2> bpbkar SelectFile: INF - path = HP-initrd-2.6.9-78.EL.img
17:11:51.857 [14342] <16> flush_archive(): ERR - Cannot write to STDOUT. Errno = 104: Connection reset by peer
17:11:51.857 [14342] <16> bpbkar Exit: ERR - bpbkar FATAL exit status = 24: socket write failed
17:11:51.857 [14342] <4> bpbkar Exit: INF - EXIT STATUS 24: socket write failed
The bpbkar log on the client shows a similar error:
11:12:48.679 [26078] <16> bpbkar sighandler: ERR - bpbkar killed by SIGPIPE
11:12:48.679 [26078] <2> bpbkar sighandler: INF - ignoring additional SIGPIPE signals
11:12:48.679 [26078] <16> bpbkar Exit: ERR - bpbkar FATAL exit status = 40: network connection broken
11:12:48.679 [26078] <4> bpbkar Exit: INF - EXIT STATUS 40: network connection broken
11:12:48.679 [26078] <2> bpbkar Exit: INF - Close of stdout complete
11:12:48.679 [26078] <4> bpbkar Exit: INF - setenv FINISHED=0
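In case it helps anyone compare failures across clients, here is a rough Python sketch for pulling the error lines out of bpbkar logs like the ones above. It assumes the line shape seen in these excerpts (time, [pid], <severity>, message) and that <16> marks the error entries; both are inferred from the excerpts, not taken from any NetBackup documentation. The names parse_log and errors are made up for illustration. Lines that do not start with a timestamp are treated as wrapped continuations of the previous entry.

```python
import re
from typing import NamedTuple

# Assumed line shape, inferred from the excerpts in this post:
#   HH:MM:SS.mmm [pid] <severity> message
ENTRY = re.compile(r"^(\d{2}:\d{2}:\d{2}\.\d{3}) \[(\d+)\] <(\d+)> (.*)$")


class Entry(NamedTuple):
    time: str
    pid: int
    severity: int
    message: str


def parse_log(text):
    """Split log text into entries, rejoining lines wrapped by a mail client."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        m = ENTRY.match(line)
        if m:
            entries.append(
                Entry(m.group(1), int(m.group(2)), int(m.group(3)), m.group(4))
            )
        elif entries:
            # No timestamp: treat as a continuation of the previous entry.
            last = entries[-1]
            entries[-1] = last._replace(message=last.message + " " + line)
    return entries


def errors(entries, min_severity=16):
    """Keep entries at or above min_severity (assumption: <16> means error)."""
    return [e for e in entries if e.severity >= min_severity]


# Lines copied from the master/media excerpt, wrapping included.
SAMPLE = """\
17:11:16.868 [14342] <2> bpbkar SelectFile: INF - path =
HP-initrd-2.6.9-78.EL.img
17:11:51.857 [14342] <16> flush_archive(): ERR - Cannot write to
STDOUT. Errno = 104: Connection reset by peer
17:11:51.857 [14342] <4> bpbkar Exit: INF - EXIT STATUS 24: socket write
failed
"""

if __name__ == "__main__":
    for e in errors(parse_log(SAMPLE)):
        print(e.time, e.pid, e.message)
```

Run against the full log from each client, this makes it easier to line up the failure times with the sniffer capture.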
We ran a network sniffer on the traffic between the
master/media and a client, and everything runs fine for a while before the master
sends a bunch of RSTs, killing the job. Support found a Symantec article on TCP
window scaling, but we’ve verified those settings and they seem fine.
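For anyone who wants to compare values, this is a minimal sketch of how the window-scaling-related settings can be dumped on a Linux client. It assumes the standard /proc/sys/net/ipv4 entries (tcp_window_scaling, tcp_rmem, tcp_wmem); whether these are the exact settings the Symantec article refers to is an assumption on my part, and the function name read_tcp_settings is made up for illustration. It does not cover the Windows master.

```python
from pathlib import Path

# Linux kernel TCP tunables; assumed (not confirmed) to be the ones of interest.
SETTINGS = ("tcp_window_scaling", "tcp_rmem", "tcp_wmem")


def read_tcp_settings(base="/proc/sys/net/ipv4", names=SETTINGS):
    """Return {name: list of values}, or None for entries that cannot be read."""
    values = {}
    for name in names:
        path = Path(base) / name
        try:
            values[name] = path.read_text().split()
        except OSError:
            # Missing file or no permission: record it rather than fail.
            values[name] = None
    return values


if __name__ == "__main__":
    for name, value in read_tcp_settings().items():
        print(name, "=", value)
```

Running the same script on a client that fails and on one that does not is a quick way to spot any difference in these values.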
Any ideas?
TIA,
-Jonathan