Amanda-Users

amrecover remotely: TCP/IP connection stalls

2003-08-12 16:09:54
Subject: amrecover remotely: TCP/IP connection stalls
From: Andreas Ntaflos <ant AT overclockers DOT at>
To: amanda-users AT amanda DOT org, amanda-hackers AT amanda DOT org
Date: Wed, 23 Jul 2003 14:07:45 +0200
Hello list(s),

  First of all, sorry for crossposting, but I think this is a question
  for both -users and -hackers. I hope I get some response, since I've
  been through the archives so thoroughly that I get a headache just
  thinking about them...

  This is the setup of Amanda in our case:

  The amanda backup server is called `backup' and is connected to the
  backup client called `mobilkom' via a 100Mbit/s crossover cable.
  
  Both hosts run the services needed for backup operations (amindex,
  amidxtape, amandad, etc, etc). We have set the server up for tapeless 
  operation, using a `small' 600GB IDE disk array instead of tapes. Dumping 
  works really well, both with dump and with tar. The `tapes' get cycled
  through the chg-multi script and all is well.

  Both hosts run Amanda 2.4.4p1 and both on RedHat Linux 8.0 with kernel
  version 2.4.18. 

  But a backup is worth nothing without the ability to restore it, hm?

  Here begins the problem:

  Using amrecover from `mobilkom' to access the index/catalogue on the
  backup server (`backup') works fine: we can select and add files for
  extraction and recovery. But as soon as we actually try to extract the
  files from the backup to mobilkom, the connection stalls and
  eventually dies after several hours.

  Here's what we do:
--------------------
  mobilkom% amrecover -C normal -s backup -t backup

  AMRECOVER Version 2.4.4p1. Contacting server on backup ...
  220 backup AMANDA index server (2.4.4p1) ready.
  200 Access OK
  Setting restore date to today (2003-07-23)
  200 Working date set to 2003-07-23.
  200 Config set to normal.
  200 Dump host set to mobilkom.
  Trying disk /d ...
  Trying disk md0 ...
  $CWD '/d' is on disk 'md0' mounted at '/d'.
  200 Disk set to md0.
  /d
  amrecover> ls
  2003-07-3 testdir1/
  amrecover> add testdir1
  Added dir /testdir1 at date 2003-07-2
  amrecover> extract
  
  Extracting files using tape drive file:/dump/normal03 on host backup.
  The following tapes are needed: normal03

  Restoring files into directory /d
  Continue [?/Y/n]? y

  Extracting files using tape drive file:/dump/normal03 on host backup.
  Load tape normal03 now
  Continue [?/Y/n/s/t]? y

  [here we wait for hours]
-------------------  
  
  And here it sits until the connection dies. tcpdump shows that backup
  tries to reach mobilkom and waits for an ACK, but gets nothing and
  only retransmits its packets. mobilkom eventually ACKs something,
  but definitely nothing belonging to the ongoing connection. I will
  attach the tcpdump output, starting at the moment amrecover was
  started on mobilkom.
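
  A raw TCP transfer between the two hosts, independent of Amanda, might
  help tell a network-level stall from an Amanda-level one. What follows
  is only a minimal sketch, assuming Python is available on both
  machines; the function names are hypothetical and not part of Amanda,
  and the port is whatever free port one chooses. loopback_check() only
  exercises the code on a single host.

```python
import socket
import threading


def serve_once(listener, nbytes, chunk=64 * 1024):
    """Accept one connection on `listener` and push `nbytes` zero bytes to it."""
    conn, _ = listener.accept()
    with conn:
        block = b"\0" * chunk
        remaining = nbytes
        while remaining > 0:
            n = min(remaining, chunk)
            conn.sendall(block[:n])
            remaining -= n


def receive_all(host, port, timeout=10.0):
    """Connect to host:port and count bytes until EOF.

    If no data arrives within `timeout` seconds, recv() raises
    socket.timeout, which is how a stalled transfer would show up.
    """
    total = 0
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(64 * 1024)
            if not data:
                return total
            total += len(data)


def loopback_check(nbytes=1_000_000):
    """Run sender and receiver on 127.0.0.1 as a self-test of the sketch."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    sender = threading.Thread(
        target=serve_once, args=(listener, nbytes), daemon=True
    )
    sender.start()
    try:
        return receive_all("127.0.0.1", port)
    finally:
        sender.join(timeout=5)
        listener.close()
```

  To test the real link, one would bind the listener to an address on
  `backup', call serve_once() there, and call receive_all() on
  `mobilkom' with a transfer size large enough to fill full-sized
  packets. If that transfer also stalls, the problem is unlikely to be
  in Amanda itself.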

  The point is that we need amrecover working, or else the system would
  never be able to restore 300GB worth of user data in the event of any
  kind of emergency, since extracting the files on backup itself and
  then transferring them to mobilkom is not an option.

  Can somebody give us any input? This does not seem to be a frequently
  encountered problem, unless my archive-searching and Google skills
  have failed me, which is (hopefully) not the case.

  Any help is greatly appreciated. Really.

Thanks in advance!  

PS: I think this problem is related to the one discussed in this message
(from the NetBSD-current mailing list though):
http://mail-index.netbsd.org/current-users/2002/07/08/0002.html

The poster there has the problem that the TCP connection times out via
the loopback interface, while it works when running amrecover remotely.
Our problem is the reverse.

PPS: If any other logfiles are needed just tell me and I'll attach them,
we have lots of them here (amidxtaped.timestamp.debug, etc).
-- 
        Andreas "ant" Ntaflos | "A cynic is a man who knows the price of
        ant AT overclockers DOT at   | everything, and the value of nothing."
        Vienna, AUSTRIA       |                              Oscar Wilde

Attachment: tcpcump.030723_124716.out
Description: Text document
