Amanda-Users

More ACK Woes

2006-09-12 13:09:45
Subject: More ACK Woes
From: Steven Backus <backus AT whimsy.med.utah DOT edu>
To: amanda-users AT amanda DOT org
Date: Tue, 12 Sep 2006 09:47:39 -0600 (MDT)
Last night my Amanda 2.5.1 backups failed spectacularly:

  ambiance.med.utah.edu  sdc1                lev 0  FAILED [cannot read header: got 0 instead of 32768]
  episun7.med.utah.edu   c0t11d0s0           lev 1  FAILED [cannot read header: got 0 instead of 32768]
  ambiance.med.utah.edu  sdc1                lev 0  FAILED [too many dumper retry: "[request failed: timeout waiting for ACK]"]
  ambiance.med.utah.edu  sdc1                lev 0  FAILED [cannot read header: got 0 instead of 32768]
  episun7.med.utah.edu   c0t11d0s0           lev 1  FAILED [too many dumper retry: "[request failed: timeout waiting for ACK]"]
  episun7.med.utah.edu   c0t11d0s0           lev 1  FAILED [cannot read header: got 0 instead of 32768]
  ambiance.med.utah.edu  sdb5                lev 0  FAILED [cannot read header: got 0 instead of 32768]
  episun7.med.utah.edu   c0t1d0s0            lev 1  FAILED [cannot read header: got 0 instead of 32768]
  ambiance.med.utah.edu  /                   lev 1  FAILED [cannot read header: got 0 instead of 32768]
  episun7.med.utah.edu   c0t1d0s0            lev 1  FAILED [too many dumper retry: "[request failed: timeout waiting for ACK]"]
  episun7.med.utah.edu   c0t1d0s0            lev 1  FAILED [cannot read header: got 0 instead of 32768]
  ambiance.med.utah.edu  sdb5                lev 0  FAILED [too many dumper retry: "[request failed: timeout waiting for ACK]"]
  ambiance.med.utah.edu  sdb5                lev 0  FAILED [cannot read header: got 0 instead of 32768]
  episun7.med.utah.edu   c0t2d0s0            lev 1  FAILED [cannot read header: got 0 instead of 32768]
  ambiance.med.utah.edu  /                   lev 1  FAILED [too many dumper retry: "[request failed: timeout waiting for ACK]"]
  ambiance.med.utah.edu  /                   lev 1  FAILED [cannot read header: got 0 instead of 32768]
  episun7.med.utah.edu   c0t2d0s0            lev 1  FAILED [too many dumper retry: "[request failed: timeout waiting for ACK]"]
  episun7.med.utah.edu   c0t2d0s0            lev 1  FAILED [cannot read header: got 0 instead of 32768]
  episun7.med.utah.edu   c0t10d0s0           lev 1  FAILED [cannot read header: got 0 instead of 32768]
  episun7.med.utah.edu   c0t10d0s0           lev 1  FAILED [too many dumper retry: "[request failed: timeout waiting for ACK]"]
  episun7.med.utah.edu   c0t10d0s0           lev 1  FAILED [cannot read header: got 0 instead of 32768]
  episun7.med.utah.edu   c0t12d0s0           lev 2  FAILED [cannot read header: got 0 instead of 32768]
  episun7.med.utah.edu   c0t12d0s0           lev 2  FAILED [too many dumper retry: "[request failed: timeout waiting for ACK]"]
  episun7.med.utah.edu   c0t12d0s0           lev 2  FAILED [cannot read header: got 0 instead of 32768]
  episun7.med.utah.edu   c0t11d0s1           lev 2  FAILED [cannot read header: got 0 instead of 32768]
  episun7.med.utah.edu   c0t11d0s1           lev 2  FAILED [too many dumper retry: "[request failed: timeout waiting for ACK]"]
  episun7.med.utah.edu   c0t11d0s1           lev 2  FAILED [cannot read header: got 0 instead of 32768]
  episun7.med.utah.edu   c0t10d0s1           lev 1  FAILED [cannot read header: got 0 instead of 32768]
  episun7.med.utah.edu   c0t0d0s0            lev 1  FAILED [cannot read header: got 0 instead of 32768]
  episun7.med.utah.edu   c0t0d0s0            lev 1  FAILED [too many dumper retry: "[request failed: timeout waiting for ACK]"]
  episun7.med.utah.edu   c0t0d0s0            lev 1  FAILED [cannot read header: got 0 instead of 32768]
  eclectic.med.utah.edu  /opt/sybase/backup  lev 1  FAILED [cannot read header: got 0 instead of 32768]
  episun7.med.utah.edu   c0t10d0s1           lev 1  FAILED [too many dumper retry: "[request failed: timeout waiting for ACK]"]
  episun7.med.utah.edu   c0t10d0s1           lev 1  FAILED [cannot read header: got 0 instead of 32768]
  eclectic.med.utah.edu  /opt/sybase/backup  lev 1  FAILED [too many dumper retry: "[request failed: timeout waiting for ACK]"]
  eclectic.med.utah.edu  /opt/sybase/backup  lev 1  FAILED [cannot read header: got 0 instead of 32768]

I've changed common-src/protocol.c to:

#define ACK_WAIT 100 /* time (secs) to wait for ACK - keep short */
#define ACK_TRIES 10 /* num retries after ACK_WAIT timeout */

and still no joy.  I logged in last night and found 12 gtar
processes running on ambiance, with the load average around 15!
Such strange behavior.  I have inparallel set to 6 in my
amanda.conf; could that have something to do with it?

Steve
-- 
Steven J. Backus                        Computer Specialist
University of Utah                      E-Mail:  steven.backus AT utah DOT edu
Biomedical Informatics                  Alternate:  backus AT math.utah DOT edu
391 Chipeta Way -- Suite D150           Office:  801.587.9308
Salt Lake City, UT 84108-1266           http://www.math.utah.edu/~backus
