Amanda-Users

Re: Index Tees - Data Timeouts

2002-08-14 11:58:55
Subject: Re: Index Tees - Data Timeouts
From: Jim Summers <jsummers AT bachman.cs.ou DOT edu>
To: amanda-users <amanda-users AT amanda DOT org>
Date: 14 Aug 2002 10:41:10 -0500
On Wed, 2002-08-14 at 09:44, Joshua Baker-LePain wrote:
> On 14 Aug 2002 at 9:26am, Jim Summers wrote
> 
> > On Wed, 2002-08-14 at 08:23, Joshua Baker-LePain wrote:
> > > On 14 Aug 2002 at 8:09am, Jim Summers wrote
> > > 
> > > > I am running Amanda 2.4.2p2 on a Red Hat Linux 7.3 machine as my Amanda
> > > > server.  The clients are mostly Solaris.  I have been backing up the
> > > > server and adding clients one at a time.  Everything was working well
> > > > with one server and two clients, then I added a third client.  Now I am
> > > > getting data timeouts and index tee broken messages in my Amanda reports
> > > > and in the system log files.
> > > 
> > > From which systems?  The actual error messages would be most helpful.
> > From one of the working systems, a Sun E250 running Solaris 8, and from
> > the newly added system, a Sun Ultra10 running Solaris 8.  I will send the
> > Amanda report when I get the next one.
> 
> You said you had messages in the system log files -- what are those?  You 
Here are the messages in my system log file:

Aug 14 01:14:14 turing sendbackup[17657]: [ID 702911 auth.notice] index tee cannot write [Broken pipe]
Aug 14 01:14:14 turing sendbackup[17655]: [ID 702911 auth.notice] error [/usr/local/bin/tar got signal 13, compress returned 1]
Aug 14 02:06:34 turing sendbackup[17740]: [ID 702911 auth.notice] index tee cannot write [Broken pipe]




> could also try increasing dtimeout...
I have twiddled with that one: I went from 1800 to 3600, then back to
1800, and I am currently at 2400.
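
For reference, this is roughly the line in question in amanda.conf on the
server, as a minimal sketch: the value is my current setting, and the
comment is only my understanding of the parameter, not text from the
shipped config.

    dtimeout 2400    # seconds a dumper waits on an idle data stream
                     # before failing the disk with [data timeout]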

> 
> How do your dumprates look?
Here is the last Amanda report I received.  Note that I used the wrong
dump type on the /usr/oracle fs.

> 
These dumps were to tapes daily09, daily10.
The next 2 tapes Amanda expects to used are: daily11, daily12.


FAILURE AND STRANGE DUMP SUMMARY:
  tarjan     /opt lev 0 FAILED [data timeout]
  turing     /cs/turing/home2 lev 1 FAILED [data timeout]
  turing     /opt lev 0 FAILED [data timeout]
  tarjan     /usr/oracle lev 0 FAILED [data timeout]
  turing     /usr lev 0 STRANGE


STATISTICS:
                          Total       Full      Daily
                        --------   --------   --------
Estimate Time (hrs:min)    0:50
Run Time (hrs:min)        10:21
Dump Time (hrs:min)        4:26       4:08       0:17
Output Size (meg)       13542.1    11766.4     1775.7
Original Size (meg)     26361.5    24555.9     1805.6
Avg Compressed Size (%)    47.9       47.9        7.3   (level:#disks ...)
Filesystems Dumped           13          4          9   (1:9)
Avg Dump Rate (k/s)       870.0      808.9     1742.9

Tape Time (hrs:min)        3:23       2:56       0:27
Tape Size (meg)         13542.5    11766.5     1776.0
Tape Used (%)             116.7      101.4       15.3   (level:#disks ...)
Filesystems Taped            13          4          9   (1:9)
Avg Tp Write Rate (k/s)  1137.5     1141.5     1111.7


FAILED AND STRANGE DUMP DETAILS:

/-- tarjan     /opt lev 0 FAILED [data timeout]
sendbackup: start [tarjan:/opt level 0]
sendbackup: info BACKUP=/usr/local/bin/tar
sendbackup: info RECOVER_CMD=/usr/local/bin/gzip -dc |/usr/local/bin/tar -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
? 
\--------

/-- turing     /cs/turing/home2 lev 1 FAILED [data timeout]
sendbackup: start [turing:/cs/turing/home2 level 1]
sendbackup: info BACKUP=/usr/local/bin/tar
sendbackup: info RECOVER_CMD=/usr/local/bin/gzip -dc |/usr/local/bin/tar -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
? 
\--------

/-- turing     /opt lev 0 FAILED [data timeout]
sendbackup: start [turing:/opt level 0]
sendbackup: info BACKUP=/usr/sbin/ufsdump
sendbackup: info RECOVER_CMD=/usr/local/bin/gzip -dc |/usr/sbin/ufsrestore -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
|   DUMP: Writing 32 Kilobyte records
|   DUMP: Date of this level 0 dump: Wed Aug 14 01:54:45 2002
|   DUMP: Date of last level 0 dump: the epoch
|   DUMP: Dumping /dev/rdsk/c0t0d0s5 (turing:/opt) to standard output.
|   DUMP: Mapping (Pass I) [regular files]
|   DUMP: Mapping (Pass II) [directories]
|   DUMP: Estimated 6949216 blocks (3393.17MB) on 0.05 tapes.
|   DUMP: Dumping (Pass III) [directories]
|   DUMP: Dumping (Pass IV) [regular files]
| 
? gzip: stdout: Broken pipe
? sendbackup: index tee cannot write [Broken pipe]
|   DUMP: Broken pipe
|   DUMP: The ENTIRE dump is aborted.
? index returned 1
sendbackup: error [/usr/sbin/ufsdump returned 3, compress returned 1]
\--------

/-- tarjan     /usr/oracle lev 0 FAILED [data timeout]
sendbackup: start [tarjan:/usr/oracle level 0]
sendbackup: info BACKUP=/usr/sbin/ufsdump
sendbackup: info RECOVER_CMD=/usr/local/bin/gzip -dc |/usr/sbin/ufsrestore -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
|   DUMP: Writing 32 Kilobyte records
|   DUMP: Date of this level 0 dump: Wed Aug 14 01:55:12 2002
|   DUMP: Date of last level 0 dump: the epoch
|   DUMP: Dumping /dev/rdsk/c0t0d0s3 (tarjan:/usr) to standard output.
|   DUMP: Mapping (Pass I) [regular files]
|   DUMP: Mapping (Pass II) [directories]
|   DUMP: Estimated 6191034 blocks (3022.97MB) on 0.04 tapes.
|   DUMP: Dumping (Pass III) [directories]
|   DUMP: Dumping (Pass IV) [regular files]
| 
? gzip: stdout: Broken pipe
? sendbackup: index tee cannot write [Broken pipe]
|   DUMP: Broken pipe
|   DUMP: The ENTIRE dump is aborted.
? index returned 1
sendbackup: error [/usr/sbin/ufsdump returned 3, compress returned 1]
\--------

/-- turing     /usr lev 0 STRANGE
sendbackup: start [turing:/usr level 0]
sendbackup: info BACKUP=/usr/local/bin/tar
sendbackup: info RECOVER_CMD=/usr/local/bin/gzip -dc |/usr/local/bin/tar -f... -
sendbackup: info COMPRESS_SUFFIX=.gz
sendbackup: info end
? gtar: ./local/var/amanda/gnutar-lists/tarjan_dbfiles_1.new: Warning: Cannot stat: No such file or directory
| Total bytes written: 7120732160 (6.6GB, 1.7MB/s)
sendbackup: size 6953840
sendbackup: end
\--------


NOTES:
  planner: Adding new disk tarjan:/usr/oracle.
  planner: Incremental of turing:/cs/turing/facstaff1 bumped to level 3.
  taper: tape daily09 kb 11904800 fm 13 writing file: No space left on device
  taper: retrying turing:/cs/turing/facstaff1.0 on new tape: [writing file: No space left on device]
  taper: tape daily10 kb 8512896 fm 1 [OK]


DUMP SUMMARY:
                                    DUMPER STATS               TAPER STATS
HOSTNAME  DISK        L  ORIG-KB  OUT-KB COMP% MMM:SS   KB/s  MMM:SS   KB/s
----------------------- ------------------------------------  -------------
bachman   /etc        1       90      32  35.6   0:01   62.0    0:08    8.2
bachman   /var/mail   0   491100  188864  38.5   9:41  325.0    2:47 1131.8
suman     /usr        0  2135250  731424  34.3   9:01 1352.0   10:54 1119.3
tarjan    /dbfiles    1  1657248 1657248    --   8:10 3380.0   24:14 1140.2
tarjan    /dblogs     1   158624  158624    --   0:42 3746.0    2:25 1095.1
tarjan    /etc        1       90      32  35.6   0:02   21.3    0:02   27.3
tarjan    /opt        0 FAILED ---------------------------------------
tarjan    /usr/oracle 0 FAILED ---------------------------------------
turing    -ng/dbfiles 1       32      32    --   0:00   87.3    0:02   33.0
turing    -ing/dblogs 1       32      32    --   0:00   65.3    0:02   32.9
turing    -/facstaff1 0 15565100 8512864  54.7 163:19  868.7  124:04 1143.6
turing    -ring/home1 1    15720    1312   8.3   7:57    2.8    0:08  163.3
turing    -ring/home2 1 FAILED ---------------------------------------
turing    /etc        1      130      32  24.6   0:01   44.0    0:13    5.0
turing    /opt        0 FAILED ---------------------------------------
turing    /usr        0  6953840 2615648  37.6  66:14  658.2   38:11 1141.7
turing    /var        1    16970     992   5.8   0:30   33.1    0:02  444.0

(brought to you by Amanda version 2.4.2p2)


Thanks again!
Jim




> -- 
> Joshua Baker-LePain
> Department of Biomedical Engineering
> Duke University
> 


