Re: 2.5.1p2: planner returns error=EOF on read
2006-11-26 18:17:28
Amanda expect tar to exit with a "broken pipe".
I looks that we need to kill it.
Jean-Louis
Jean-Francois Malouin wrote:
* Jean-Louis Martineau <martineau AT zmanda DOT com> [20061123 17:20]:
Do amandad is running?
To be sure that I would hit the problem again I didn't load new tapes
so after the holddding filled up I have a bunch of gtar running and
just wasting a lot of cycles. A typical example (sorry for the long lines):
amanda 3555936 3557855 0 - - 0:00 <defunct>
amanda 3557855 1 0 05:37:39 ? 1:30 /opt/amanda/amanda1/libexec/sendbackup amandad bsdtcp
root 3568087 3557855 0 05:40:02 ? 277:35 gtar --create --file -
--directory /data/mafalda/mafalda1/susanita/sandra/DTI -
Tracing the sendbackup process yields no output while the gtar one
gives just an uninterupted flow of lines:
68mS tar(3568087): write(1, <fd 63 e3 6c ff 42 3c df 87
41 82 bf e9 dd 95 2c>..., 10240) errno = 32 (Broken pipe)
Don't know if this is going to be of any help...
jf
Could you try to find what each process (amanda,sendbackup,tar) are
doing? on which system call they are hung?
On linux, I would use strace for that.
Jean-Louis
Jean-Francois Malouin wrote:
Jean-Louis,
The amdump just finished and because amanda ran out of tapes
(runtapes=10) it completed badly leaving stuff in the holdding
disk. I'm flushing it at the moment but I noticed that when amanda
gave up it left a lot of gnutar lying around, with amandad as parent.
I had to manually kill them. Looks like signals are not doing ok.
I've attached one example of a DLE's sendbackup and runtar debug file.
It's not the first time I notice that when a DLE fails to make it
to tape successfully processes are left running...
* Jean-Louis Martineau <martineau AT zmanda DOT com> [20061122 09:32]:
Could you post amandad.<timestamp>.debug file from yorick?
Jean-Francois Malouin wrote:
On a server running irix-6.5 and amanda 2.5.1p2
planner debug shows:
security_getdriver(name=bsdtcp) returns 4075aa8
security_handleinit(handle=1001d4a8, driver=4075aa8 (BSDTCP))
security_streaminit(stream=1005e538, driver=4075aa8 (BSDTCP))
security_close(handle=1001cfe0, driver=4075aa8 (BSDTCP))
security_stream_close(10053a80)
security_stream_seterr(1005e538, SOCKET_EOF)
security_seterror(handle=1001d4a8, driver=4075aa8 (BSDTCP) error=EOF
on read from yorick)
security_close(handle=1001d4a8, driver=4075aa8 (BSDTCP))
security_stream_close(1005e538)
and the amanda report shows things like:
yorick DATA_sub101 lev 0 FAILED [hmm, disk was stranded on waitq]
planner: ERROR Request to yorick failed: EOF on read from yorick
and the backup fails. The funny thing is that I have 2 other
configurations
running 2.5.1p2 in parallel that doesn't exhibit this behaviour.
Any clues?
jf
------------------------------------------------------------------------
runtar: debug 1 pid 2826413 ruid 666 euid 0: start at Thu Nov 23 08:04:13
2006
runtar: version 2.5.1p2
/usr/freeware/bin/tar version: tar (GNU tar) 1.13.25
config: stk_80-conf1
runtar: debug 1 pid 2826413 ruid 0 euid 0: rename at Thu Nov 23 08:04:13
2006
running: /usr/freeware/bin/tar: 'gtar' '--create' '--file' '-'
'--directory'
'/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101'
'--one-file-system' '--listed-incremental'
'/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new'
'--sparse' '--ignore-failed-read' '--totals' '.' runtar: pid 2826413
finish time Thu Nov 23 08:04:13 2006
------------------------------------------------------------------------
sendbackup: debug 1 pid 2836263 ruid 666 euid 666: start at Thu Nov 23
08:03:20 2006
sendbackup: version 2.5.1p2
Could not open conf file
"/opt/amanda/amanda1/etc/amanda/amanda-client.conf": No such file or
directory
Reading conf file
"/opt/amanda/amanda1/etc/amanda/stk_80-conf1/amanda-client.conf".
sendbackup: debug 1 pid 2836263 ruid 666 euid 666: rename at Thu Nov 23
08:03:20 2006
sendbackup req: <GNUTAR sub101
/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101 1
2006:11:13:13:21:32 OPTIONS |;auth=bsdtcp;index;>
parsed request as: program `GNUTAR'
disk `sub101'
device
`/data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101'
level 1
since 2006:11:13:13:21:32
options `|;auth=bsdtcp;index;'
sendbackup: start: yorick:sub101 lev 1
sendbackup-gnutar: time 0.208: doing level 1 dump as listed-incremental
>from '/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_0' to
'/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new'
sendbackup-gnutar: time 53.057: doing level 1 dump from date: 2006-11-13
13:22:26 GMT
sendbackup: time 53.246: started index creator: "/usr/freeware/bin/tar -tf
- 2>/dev/null | sed -e 's/^\.//'"
sendbackup: time 53.254: spawning /opt/amanda/amanda1/libexec/runtar in
pipeline
sendbackup: argument list: runtar stk_80-conf1 gtar --create --file -
--directory /data/mafalda/mafalda1/susanita/jen/anxiety_version1/sub101
--one-file-system --listed-incremental
/opt/amanda/amanda1/var/amanda/gnutar-lists/yoricksub101_1.new --sparse
--ignore-failed-read --totals .
sendbackup-gnutar: time 53.292: /opt/amanda/amanda1/libexec/runtar: pid
2826413
sendbackup: time 53.292: started backup
sendbackup: time 1839.833: index tee cannot write [Broken pipe]
sendbackup: time 1839.833: pid 2836285 finish time Thu Nov 23 08:34:00 2006
sendbackup: time 1839.926: 118: strange(?): sendbackup: index tee cannot
write [Broken pipe]
sendbackup: time 9284.787: error [/usr/freeware/bin/tar got signal 15]
sendbackup: time 9284.856: pid 2836263 finish time Thu Nov 23 10:38:05 2006
|
|
|