Amanda-Users

planner timeouts

2008-11-12 14:31:27
Subject: planner timeouts
From: "Krahn, Anderson" <AKrahn AT gs1us DOT org>
To: <amanda-users AT amanda DOT org>
Date: Wed, 12 Nov 2008 14:21:34 -0500

Regarding the planner timeouts…..

 

 

Here are the amdump logs on the server regarding the clients…

 

 

planner: time 1.154: got partial result for host twofish.cgx.transora.com disk /home: 0 -> -2K, -1 -> -2K, -1 -> -2K

taper: using label `ON1006L3' date `20081112130548'

driver: result time 26.700 from taper: TAPER-OK

driver: state time 26.700 free kps: 90000000 space: 524288000 taper: idle idle-dumpers: 27 qlen tapeq: 0 runq: 0 roomq: 0 wakeup: 0 driver-idle: not-idle

driver: interface-state time 26.700 if default: free 90000000

driver: hdisk-state time 26.700 hdisk 0: free 524288000 dumpers 0

planner: time 302.185: getting estimates took 302.181 secs

FAILED QUEUE:

  0: twofish.cgx.transora.com /home

DONE QUEUE: empty

 

ANALYZING ESTIMATES...

planner: FAILED twofish.cgx.transora.com /home 20081112130548 0 "[disk /home, all estimate timed out]"

INITIAL SCHEDULE (size 64):

 

DELAYING DUMPS IF NEEDED, total_size 64, tape length 703561728 mark 0

  delay: Total size now 64.

 

PROMOTING DUMPS IF NEEDED, total_lev0 0, balanced_size 0...

planner: time 302.186: analysis took 0.000 secs

 

oldlog errors:

 

STATS driver startup time 0.182

ERROR planner Request to twofish.cgx.transora.com failed: timeout waiting for REP

WARNING planner disk twofish.cgx.transora.com:/home, estimate of level 0 timed out.

FAIL planner twofish.cgx.transora.com /home 20081112130548 0 "[disk /home, all estimate timed out]"

FINISH planner date 20081112130548 time 302.186

WARNING driver WARNING: got empty schedule from planner

FINISH driver date 20081112130548 time 303.225

 

 

Client logs:

 

Run tar log:

1226516749.495366: runtar: pid 27706 ruid 30063 euid 0: start at Wed Nov 12 13:05:49 2008

1226516749.495652: runtar: version 2.6.0p2

1226516749.505371: runtar: /usr/local/bin/tar version: tar (GNU tar) 1.19

 

1226516749.505857: runtar: config: Full

1226516749.516449: runtar: pid 27706 ruid 0 euid 0: rename at Wed Nov 12 13:05:49 2008

1226516749.516894: runtar: running: /usr/local/bin/tar --create --file /dev/null --directory /home --one-file-system --listed-incremental /usr/local/var/amanda/gnutar-lists/twofish.cgx.transora.com_home_0.new --sparse --ignore-failed-read --totals --exclude-from /tmp/amanda/sendsize._home.20081112130549.exclude .

1226516749.516936: runtar: pid 27706 finish time Wed Nov 12 13:05:49 2008

 

Sendsize log:

1226516749.413525: sendsize: pid 27703 ruid 30063 euid 30063: start at Wed Nov 12 13:05:49 2008

1226516749.413851: sendsize: version 2.6.0p2

1226516749.414107: sendsize: warning: errors processing config file "/etc/amanda/amanda-client.conf" (non-fatal)

1226516749.414762: sendsize: warning: errors processing config file "/etc/amanda/Full/amanda-client.conf" (non-fatal)

1226516749.454464: sendsize: pid 27703 ruid 30063 euid 30063: rename at Wed Nov 12 13:05:49 2008

1226516749.457190: sendsize: waiting for any estimate child: 1 running

1226516749.458632: sendsize: calculating for amname /home, dirname /home, spindle -1

1226516749.458774: sendsize: getting size via gnutar for /home level 0

1226516749.461231: sendsize: Spawning "/opt/amanda/client/libexec/amanda/runtar runtar Full /usr/local/bin/tar --create --file /dev/null --directory /home --one-file-system --listed-incremental /usr/local/var/amanda/gnutar-lists/twofish.cgx.transora.com_home_0.new --sparse --ignore-failed-read --totals --exclude-from /tmp/amanda/sendsize._home.20081112130549.exclude ." in pipeline

 

amandad.20081112130548.debug log:

 

1226516749.378493: amandad: creating new service: sendsize

OPTIONS features=ffffffff9ffeffffffff00;maxdumps=8;hostname=twofish.cgx.transora.com;config=Full;

GNUTAR /home  0 1970:1:1:0:0:0 -1  OPTIONS |;auth=BSD;index;exclude-list=/home/amanda/exclude_list;

 

1226516749.381505: amandad: sending ACK pkt:

<<<<< 

>>>>> 

1226516749.381671: amandad: dgram_send_addr(addr=26180, dgram=ff350c3c)

1226516749.381692: amandad: (sockaddr_in *)26180 = { 2, 1018, 10.1.1.104 }

1226516749.381708: amandad: dgram_send_addr: ff350c3c->socket = 0

1226516749.414819: amandad: sending PREP pkt:

<<<<< 

OPTIONS features=ffffffff9ffeffffffff00;

>>>>> 

1226516749.414857: amandad: dgram_send_addr(addr=26180, dgram=ff350c3c)

1226516749.414874: amandad: (sockaddr_in *)26180 = { 2, 1018, 10.1.1.104 }

1226516749.414912: amandad: dgram_send_addr: ff350c3c->socket = 0

1226516850.419333: amandad: dgram_recv(dgram=ff350c3c, timeout=0, fromaddr=ff360c28)

1226516850.419470: amandad: (sockaddr_in *)ff360c28 = { 2, 1018, 10.1.1.104 }

1226516850.420038: amandad: received REQ pkt:

<<<<< 

SERVICE sendsize

OPTIONS features=ffffffff9ffeffffffff00;maxdumps=8;hostname=twofish.cgx.transora.com;config=Full;

GNUTAR /home  0 1970:1:1:0:0:0 -1  OPTIONS |;auth=BSD;index;exclude-list=/home/amanda/exclude_list;

>>>>> 

1226516850.420071: amandad: received dup P_REQ packet, ACKing it

1226516850.420086: amandad: sending ACK pkt:

<<<<< 

>>>>> 

1226516850.420108: amandad: dgram_send_addr(addr=26180, dgram=ff350c3c)

1226516850.420122: amandad: (sockaddr_in *)26180 = { 2, 1018, 10.1.1.104 }

1226516850.420135: amandad: dgram_send_addr: ff350c3c->socket = 0

1226516950.427759: amandad: dgram_recv(dgram=ff350c3c, timeout=0, fromaddr=ff360c28)

1226516950.427881: amandad: (sockaddr_in *)ff360c28 = { 2, 1018, 10.1.1.104 }

1226516950.428320: amandad: received REQ pkt:

<<<<< 

SERVICE sendsize

OPTIONS features=ffffffff9ffeffffffff00;maxdumps=8;hostname=twofish.cgx.transora.com;config=Full;

GNUTAR /home  0 1970:1:1:0:0:0 -1  OPTIONS |;auth=BSD;index;exclude-list=/home/amanda/exclude_list;

>>>>> 

1226516950.428345: amandad: received dup P_REQ packet, ACKing it

1226516950.428361: amandad: sending ACK pkt:

<<<<< 

>>>>> 

1226516950.428383: amandad: dgram_send_addr(addr=26180, dgram=ff350c3c)

1226516950.428397: amandad: (sockaddr_in *)26180 = { 2, 1018, 10.1.1.104 }

1226516950.428410: amandad: dgram_send_addr: ff350c3c->socket = 0

 

This is only occurring on 2 of my clients I have removed all other clients and currently just testing one client with the behavior.

Increased the etimeout value but that didn’t seem to help.

 

Anderson

 

 

 

 

<Prev in Thread] Current Thread [Next in Thread>
  • planner timeouts, Krahn, Anderson <=