Amanda-Users

Re: RHEL5 client problems? (2.5.0p2)

2007-09-18 13:21:46
Subject: Re: RHEL5 client problems? (2.5.0p2)
From: JB Segal <jb AT smarterliving DOT com>
To: amanda-users AT amanda DOT org
Date: Tue, 18 Sep 2007 13:18:34 -0400
No one? Please?

Quoth JB Segal (jb AT smarterliving DOT com):
> (I'm betting this never got through due to the corporate domain change
> and that I'm not on the list at the address in my .sig. <sigh>
> Ah well. Let's try this. There are 2 messages here, top-posted, with my
> original problems below my current problems. JB)
> 
> Sadly, I changed my RHEL5 clients over to 2.5.2 from zmanda.org and that
> didn't fix most of my problems.
> 
> I've got only one host reporting 'MISSING' now (and the one that is no longer
> MISSING _is_ working), but still 2 'FAILED'.
> 
> The most frustrating thing is that I have all but one (the remaining
> 'MISSING' one) passing amcheck.
> 
> One of the failed ones has a fine looking debug file, up until it fails:
> 
> # cat sendbackup.20070917010550.debug
> sendbackup: debug 1 pid 16368 ruid 628 euid 628: start at Mon Sep 17 01:05:50 
> 2007
> sendbackup: version 2.5.2p1
> Could not open conf file "/etc/amanda/amanda-client.conf": No such file or 
> directory
> Could not open conf file "/etc/amanda/smarterliving/amanda-client.conf": No 
> such file or directory
> sendbackup: debug 1 pid 16368 ruid 628 euid 628: rename at Mon Sep 17 
> 01:05:50 2007
>   sendbackup req: <DUMP /usr  0 1970:1:1:0:0:0 OPTIONS |;auth=BSD;index;>
>   parsed request as: program `DUMP'
>                      disk `/usr'
>                      device `/usr'
>                      level 0
>                      since 1970:1:1:0:0:0
>                      options `|;auth=BSD;index;'
> sendbackup: start: FAILEDHost:/usr lev 0
> sendbackup: time 0.001: dumping device '/dev/mapper/VolGroup00-LogVol03' with 
> 'ext3'
> sendbackup: time 0.002: started index creator: "/sbin/restore -tvf -
> 2>&1 | sed -e '
> s/^leaf[        ]*[0-9]*[       ]*\.//
> t
> /^dir[  ]/ {
> s/^dir[         ]*[0-9]*[       ]*\.//
> s%$%/%
> t
> }
> d
> '"
> sendbackup: time 0.002: spawning /sbin/dump in pipeline
> sendbackup: time 0.002: argument list: dump 0usf 1048576 - 
> /dev/mapper/VolGroup00-LogVol03
> sendbackup: time 0.003: started backup
> sendbackup: time 0.011:  91:  normal(|):   DUMP: Date of this level 0 dump: 
> Mon Sep 17 01:05:50 2007
> sendbackup: time 0.012:  91:  normal(|):   DUMP: Dumping 
> /dev/mapper/VolGroup00-LogVol03 (/usr) to standard output
> sendbackup: time 6.096:  91:  normal(|):   DUMP: Label: none
> sendbackup: time 6.097:  91:  normal(|):   DUMP: Writing 10 Kilobyte records
> sendbackup: time 6.097:  91:  normal(|):   DUMP: mapping (Pass I) [regular 
> files]
> sendbackup: time 6.655:  91:  normal(|):   DUMP: mapping (Pass II) 
> [directories]
> sendbackup: time 6.659:  91:  normal(|):   DUMP: estimated 3419329 blocks.
> sendbackup: time 6.659:  91:  normal(|):   DUMP: Volume 1 started with block 
> 1 at: Mon Sep 17 01:05:57 2007
> sendbackup: time 90.017: index tee cannot write [Broken pipe]
> sendbackup: time 90.017: pid 16370 finish time Mon Sep 17 01:07:20 2007
> 
> while the remaining 'MISSING' host looks pretty similar as it fails
> amcheck:
> 
> # cat amandad.20070917105118.debug
> amandad: debug 1 pid 31792 ruid 628 euid 628: start at Mon Sep 17 10:51:18 
> 2007
> Could not open conf file "/etc/amanda/amanda-client.conf": No such file or 
> directory
> amandad: time 0.000: security_getdriver(name=BSD) returns 0x45abe0
> amandad: version 2.5.2p1
> amandad: time 0.000: build: VERSION="Amanda-2.5.2p1"
> amandad: time 0.000:        BUILT_DATE="Wed Jun 6 21:18:37 PDT 2007"
> amandad: time 0.000:        BUILT_MACH="Linux rhel5rc-build 2.6.18-8.el5 #1 
> SMP Fri Jan 26 14:15:21 EST 2007 i686 athlon i386 GNU/Linux"
> amandad: time 0.000:        CC="gcc"
> amandad: time 0.000:        CONFIGURE_COMMAND="'./configure'
> '--build=i386-redhat-linux' '--prefix=/usr' '--bindir=/usr/bin'
> '--sbindir=/usr/sbin' '--libexecdir=/usr/lib/amanda'
> '--datadir=/usr/share' '--sysconfdir=/etc'
> '--sharedstatedir=/var/lib/amanda' '--localstatedir=/var/lib/amanda'
> '--libdir=/usr/lib' '--includedir=/usr/include' '--infodir=/usr/info'
> '--mandir=/usr/share/man' '--with-gnutar=/bin/tar'
> '--with-gnutar-listdir=/var/lib/amanda/gnutar-lists'
> '--with-dumperdir=/usr/lib/amanda' '--with-index-server=localhost'
> '--with-tape-server=localhost' '--with-user=amandabackup'
> '--with-group=disk' '--with-owner=paddy' '--with-fqdn'
> '--with-bsd-security' '--with-bsdtcp-security' '--with-bsdudp-security'
> '--with-ssh-security' '--with-assertions'"
> amandad: time 0.000: paths: bindir="/usr/bin" sbindir="/usr/sbin"
> amandad: time 0.000:        libexecdir="/usr/lib/amanda" 
> mandir="/usr/share/man"
> amandad: time 0.000:        AMANDA_TMPDIR="/tmp/amanda" 
> AMANDA_DBGDIR="/tmp/amanda"
> amandad: time 0.000:        CONFIG_DIR="/etc/amanda" DEV_PREFIX="/dev/"
> amandad: time 0.000:        RDEV_PREFIX="/dev/r" DUMP="/sbin/dump"
> amandad: time 0.000:        RESTORE="/sbin/restore" VDUMP=UNDEF VRESTORE=UNDEF
> amandad: time 0.000:        XFSDUMP=UNDEF XFSRESTORE=UNDEF VXDUMP=UNDEF 
> VXRESTORE=UNDEF
> amandad: time 0.000:        SAMBA_CLIENT="/usr/bin/smbclient" 
> GNUTAR="/bin/tar"
> amandad: time 0.000:        COMPRESS_PATH="/bin/gzip" 
> UNCOMPRESS_PATH="/bin/gzip"
> amandad: time 0.000:        LPRCMD="/usr/bin/lpr" MAILER="/usr/bin/Mail"
> amandad: time 0.000:  listed_incr_dir="/var/lib/amanda/gnutar-lists"
> amandad: time 0.000: defs:  DEFAULT_SERVER="localhost" 
> DEFAULT_CONFIG="DailySet1"
> amandad: time 0.000:        DEFAULT_TAPE_SERVER="localhost" HAVE_MMAP 
> NEED_STRSTR
> amandad: time 0.000:        HAVE_SYSVSHM LOCKING=POSIX_FCNTL SETPGRP_VOID 
> ASSERTIONS
> amandad: time 0.000:        DEBUG_CODE AMANDA_DEBUG_DAYS=4 BSD_SECURITY 
> RSH_SECURITY
> amandad: time 0.000:        USE_AMANDAHOSTS CLIENT_LOGIN="amandabackup" 
> FORCE_USERID
> amandad: time 0.000:        HAVE_GZIP COMPRESS_SUFFIX=".gz" 
> COMPRESS_FAST_OPT="--fast"
> amandad: time 0.000:        COMPRESS_BEST_OPT="--best" UNCOMPRESS_OPT="-dc"
> amandad: time 0.000: dgram_recv(dgram=0x45cb84, timeout=0, fromaddr=0x46cb70)
> amandad: time 0.000: (sockaddr_in *)0x46cb70 = { 2, 847, 69.25.205.8 }
> amandad: time 0.000: security_handleinit(handle=0x8053230, driver=0x45abe0 
> (BSD))
> amandad: time 0.000: accept recv REQ pkt:
> <<<<<
> SERVICE noop
> OPTIONS features=fffffeff9ffeffffff7f;
> >>>>>
> amandad: time 0.000: creating new service: noop
OPTIONS features=fffffeff9ffeffffff7f;
> 
> amandad: time 0.001: sending ACK pkt:
> <<<<<
> >>>>>
> amandad: time 0.001: dgram_send_addr(addr=0x8053250, dgram=0x45cb84)
> amandad: time 0.001: (sockaddr_in *)0x8053250 = { 2, 847, 69.25.205.8 }
> amandad: time 0.001: dgram_send_addr: 0x45cb84->socket = 0
> amandad: time 0.002: sending REP pkt:
> <<<<<
> OPTIONS features=ffffffff9ffeffffffff00;
> >>>>>
> 
>  . . . which continues until
> 
> amandad: time 49.243: timeout
> amandad: time 49.243: timeout waiting for ACK for our REP
> amandad: time 49.243: security_close(handle=0x8053230, driver=0x45abe0 (BSD))
> amandad: time 59.244: pid 31792 finish time Mon Sep 17 10:52:18 2007
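A quick way to see where a debug file like these stalls is to compare the elapsed-time stamps on consecutive lines. The sketch below is only an illustration: it assumes every interesting line carries a "time N.NNN:" stamp, as the excerpts here do, and `largest_gap` is a hypothetical helper name, not part of Amanda. The sample lines are copied from the sendbackup excerpt earlier in this message, with the mail wrapping undone.

```python
import re

# Matches the elapsed-seconds stamp written as "time 6.659:" in the debug files.
STAMP = re.compile(r"\btime (\d+\.\d+):")


def largest_gap(lines):
    """Return (gap_seconds, line_before, line_after) for the longest silence."""
    stamped = []
    for line in lines:
        m = STAMP.search(line)
        if m:
            stamped.append((float(m.group(1)), line.strip()))
    best = (0.0, None, None)
    # Walk consecutive pairs of stamped lines and keep the widest interval.
    for (t0, before), (t1, after) in zip(stamped, stamped[1:]):
        if t1 - t0 > best[0]:
            best = (t1 - t0, before, after)
    return best


sample = [
    "sendbackup: time 0.012:  91:  normal(|):   DUMP: Dumping /dev/mapper/VolGroup00-LogVol03 (/usr) to standard output",
    "sendbackup: time 6.096:  91:  normal(|):   DUMP: Label: none",
    "sendbackup: time 6.659:  91:  normal(|):   DUMP: estimated 3419329 blocks.",
    "sendbackup: time 6.659:  91:  normal(|):   DUMP: Volume 1 started with block 1 at: Mon Sep 17 01:05:57 2007",
    "sendbackup: time 90.017: index tee cannot write [Broken pipe]",
    "sendbackup: time 90.017: pid 16370 finish time Mon Sep 17 01:07:20 2007",
]

gap, before, after = largest_gap(sample)
print(f"{gap:.3f}s of silence between:\n  {before}\n  {after}")
```

On the sendbackup excerpt this points at the interval between "Volume 1 started" and the broken pipe, i.e. the dump starts locally and then nothing is logged until the pipe is gone.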
> 
> Anything here look familiar to anyone?
> 
> Thanks!
> JB
> 
> Quoth JB Segal (jb AT smartertravelmedia DOT com):
> > Has anyone else had problems getting the rhel5-distributed version of
> > amanda to work?
> > 
> > My server is a rhel4 box, using the amanda-packaged
> > amanda-backup_server-2.5.1p1-1.rhel4.
> > 
> > I have many clients using RH packaged RHEL4 rpms (which exist as
> > amanda-client-2.4.4p3-1 and amanda-2.4.4p3-1) and some older (RH9,
> > 2.4.3 boxes) and they all work just fine.
> > 
> > I have _1_ RHEL5 box that works fine, too, and honestly, I can't tell
> > what's different about it compared with the 4-5 that don't.
> > 
> > All clients have the same .amandahosts entries.
> > 
> > All the rhel5 clients have iptables off at this time.
> > 
> > They're all behind the same firewall, and none of them have any special
> > case entries in said FW. The server's back there, too.
> > 
> > I managed to get 2 of them amchecking cleanly, eventually, but that
> > seems to only be working occasionally now. This morning's automated
> > amcheck failed on all 4 of the problematic rhel5 clients. The one I
> > manually ran while writing this succeeded for 2, failed for 2.
> > 
> > On said clients, running amcheck correctly talks to xinetd and xinetd
> > correctly launches amandad.
> > 
> > The debug files for amandad look like:
> > amandad: debug 1 pid 22449 ruid 33 euid 33: start at Thu Sep 13 12:11:40 
> > 2007
> > amandad: version 2.5.0p2
> > amandad: build: VERSION="Amanda-2.5.0p2"
> >  (plus 4 lines of build info - date/mach/CC/Configure_Command)
> > amandad: paths: bindir="/usr/bin" sbindir="/usr/sbin"
> >  (plus many more lines of paths)
> > amandad: defs:  DEFAULT_SERVER="amandahost" DEFAULT_CONFIG="DailySet1"
> >  (plus many more)
> > amandad: time 30.057: pid 22449 finish time Thu Sep 13 12:12:10 2007
> > 
> > But the biggest problem, of course, is that nothing's getting backed up
> > on any of the 4. The run output says:
> > 
> > FAILURE AND STRANGE DUMP SUMMARY:
> >   problemhost1  /var   lev 0  FAILED [cannot read header: got 0 instead of 32768]
> >   problemhost2  /      lev 0  FAILED [cannot read header: got 0 instead of 32768]
> >   problemhost1  /usr   lev 0  FAILED [cannot read header: got 0 instead of 32768]
> >   problemhost2  /boot  lev 0  FAILED [cannot read header: got 0 instead of 32768]
> >   problemhost1  /usr   lev 0  FAILED [too many dumper retry: "[could not connect DATA stream: can't connect stream to problemhost1.domain.com port -13818: Connection timed out]"]
> >   problemhost1  /usr   lev 0  FAILED [cannot read header: got 0 instead of 32768]
> >   problemhost2  /boot  lev 0  FAILED [too many dumper retry: "[could not connect DATA stream: can't connect stream to problemhost2.domain.com port -30149: Connection timed out]"]
> > 
> > . . . and so on for every DLE, and also
> > 
> >   problemhost3  /boot  RESULTS MISSING
> >   problemhost3  /      RESULTS MISSING
> >   problemhost3  /var   RESULTS MISSING
> >   problemhost3  /usr   RESULTS MISSING
> >   problemhost3  /home  RESULTS MISSING
> >   problemhost4  /boot  RESULTS MISSING
> >   problemhost4  /      RESULTS MISSING
> >   problemhost4  /usr   RESULTS MISSING
> >   problemhost4  /var   RESULTS MISSING
> >   problemhost4  /home  RESULTS MISSING
> >   planner: ERROR Request to problemhost4 failed: timeout waiting for ACK
> >   planner: ERROR Request to problemhost3 failed: timeout waiting for ACK
> > 
> > The summary gives 'FAILED' for 1 and 2, and 'MISSING' for 3 and 4.
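For anyone wanting to tally a report like this per host, here is a rough sketch. The regular expressions are guesses fitted to the lines quoted here, not an official Amanda report format, and `as_unsigned16` rests on the assumption that the negative port numbers are 16-bit ports printed as signed values; that would explain the minus signs, but I have not confirmed it against the Amanda source. The hostnames are the placeholders used in this message.

```python
import re
from collections import defaultdict

# A few report lines from the summary, with mail wrapping undone.
REPORT = """\
  problemhost1      /var   lev 0  FAILED [cannot read header: got 0 instead of 32768]
  problemhost1      /usr   lev 0  FAILED [too many dumper retry: "[could not connect DATA stream: can't connect stream to problemhost1.domain.com port -13818: Connection timed out]"]
  problemhost2    /boot  lev 0  FAILED [too many dumper retry: "[could not connect DATA stream: can't connect stream to problemhost2.domain.com port -30149: Connection timed out]"]
  problemhost3      /boot  RESULTS MISSING
  problemhost4  /home  RESULTS MISSING
"""

# host, disk, optional "lev N", then the status keyword (pattern is an assumption).
LINE = re.compile(r"^\s*(\S+)\s+(/\S*)\s+(?:lev \d+\s+)?(FAILED|RESULTS MISSING)")
PORT = re.compile(r"port (-?\d+)")


def as_unsigned16(port):
    """Reinterpret a port printed as a signed 16-bit value (assumption)."""
    return port & 0xFFFF


statuses = defaultdict(set)  # host -> set of statuses seen
ports = {}                   # (host, disk) -> decoded data-stream port

for line in REPORT.splitlines():
    m = LINE.match(line)
    if not m:
        continue
    host, disk, status = m.groups()
    statuses[host].add(status)
    p = PORT.search(line)
    if p:
        ports[(host, disk)] = as_unsigned16(int(p.group(1)))

for host in sorted(statuses):
    print(host, sorted(statuses[host]))
for (host, disk), port in ports.items():
    print(f"{host} {disk}: data stream port {port}")
```

If the signed-value assumption holds, -13818 and -30149 correspond to ports 51718 and 35387, which are the numbers to look for in any firewall or packet capture.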
> > 
> > I swear, all 4 hosts are configured the same, and all are configured
> > the same as the working 2.5.0 host (which is REALLY weird) and as the
> > working 2.4.x hosts.
> > 
> > What am I missing? I utterly expect PEBCAK, but I can't see what it is.
> > 
> > Help? Please??
> > Thanks!
> > JB
> --
> JB Segal                 617-886-5575            www.smartertravel.com
> Systems/Network Admin.   465 Medford St. Ste 400 www.bookingbuddy.com
> Smarter Travel Media LLC Boston, MA 02129        www.tripmania.com
--
JB Segal                 617-886-5575            www.smartertravel.com
Systems/Network Admin.   465 Medford St. Ste 400 www.bookingbuddy.com
Smarter Travel Media LLC Boston, MA 02129        www.tripmania.com
