
[Veritas-bu] bpbkar processes hung on CLOSE_WAIT on Linux

2005-07-08 03:22:07
Subject: [Veritas-bu] bpbkar processes hung on CLOSE_WAIT on Linux
From: Smita.Agarwal AT ge DOT com (Agarwal, Smita S (GE Consumer Finance))
Date: Fri, 8 Jul 2005 12:52:07 +0530
Hi,
 
I am receiving these mails due to some error. Please remove me from this mailing 
list.
 
Regards,
Smita

        -----Original Message----- 
        From: veritas-bu-admin AT mailman.eng.auburn DOT edu on behalf of Jeff 
Lightner 
        Sent: Fri 7/8/2005 1:57 AM 
        To: veritas-bu AT mailman.eng.auburn DOT edu 
        Cc: 
        Subject: [Veritas-bu] bpbkar processes hung on CLOSE_WAIT on Linux
        
        

        All,

         

        Can anyone tell me how I can get rid of Netbackup processes hung with 
CLOSE_WAIT status (other than reboot)?

         

        Alternatively can anyone provide definitive information that would 
indicate this is a known issue in the Linux / Netbackup combo we're running 
that would require a reboot?  If so is it fixed in NB 5.1?

         

        DETAILS:

        After looking at Linux forums, Veritas' web site, this forum and 
Google I'm not finding a clear answer.   

         

        We have a Dell PowerEdge 2850 running Netbackup 4.5 FP6 Client software 
under Redhat Linux EL AS 3 (2.4 kernel).  

        The master server is HP-UX 11.11 running same version of Netbackup.

         

        The backups for this have been failing recently (they worked 
previously) giving a 41 Network Connection timed out error.   

         

        On researching I found multiple bpbkar processes hung.  They can not be 
killed with ANY signal (-9, -1, -15 etc... and yes I know the names SIGHUP, 
SIGTERM etc...).
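
A process that ignores even signal 9 is often in uninterruptible sleep (state D) or is a zombie (state Z); the post does not confirm either, so this is only a possibility worth checking. The sketch below reads the state field from a /proc/<pid>/stat line. The function names are made up for illustration.

```python
def parse_proc_stat_state(stat_line):
    """Return (pid, comm, state) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the FIRST '(' and the LAST ')'.
    open_idx = stat_line.index("(")
    close_idx = stat_line.rindex(")")
    pid = int(stat_line[:open_idx].strip())
    comm = stat_line[open_idx + 1:close_idx]
    state = stat_line[close_idx + 1:].split()[0]
    return pid, comm, state


def is_uninterruptible(stat_line):
    """True when the process state letter is 'D' (uninterruptible sleep)."""
    return parse_proc_stat_state(stat_line)[2] == "D"


if __name__ == "__main__":
    import sys

    # Usage: python check_state.py <pid> [<pid> ...]
    for pid in sys.argv[1:]:
        with open(f"/proc/{pid}/stat") as fh:
            print(parse_proc_stat_state(fh.readline()))
```

If the state comes back as D, the process is blocked inside the kernel and signals are only acted on once that wait ends, which would match the behaviour described.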

         

        lsof reveals all the sockets are in CLOSE_WAIT.  They all show the 
master server as the other side but on looking at the master the socket does 
not exist any longer.

         

        The CLOSE_WAIT means the other side has closed.   One would expect 
these to go away eventually but I have some that are more than a day old.
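
For reference, CLOSE_WAIT has no kernel timeout of its own: the socket stays in that state until the local application calls close() or the process exits. The loopback sketch below illustrates this under simple assumptions (one listener and one client on 127.0.0.1; the function name is made up for illustration). The peer closes first, the local side sees end of stream, and the descriptor remains open until it is closed explicitly.

```python
import socket


def peer_close_demo():
    """Show that a peer close leaves the local socket open until close()."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()
    server_side.settimeout(5)

    client.close()                 # peer sends FIN
    data = server_side.recv(1024)  # b"" signals end of stream from the peer

    # The local socket now sits in CLOSE_WAIT; nothing closes it for us.
    still_open = server_side.fileno() != -1

    server_side.close()            # only this moves the socket out of CLOSE_WAIT
    listener.close()
    return data, still_open


if __name__ == "__main__":
    print(peer_close_demo())
```

Sockets that linger in CLOSE_WAIT for a day therefore point at the owning process never reaching its close() call, rather than at a timer that has yet to expire.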

         

        There was some discussion of a CLOSE_WAIT bug in a version of xinetd 
older than the one we're running.  Since it is older and I don't see any 
sign this is occurring in other applications it doesn't seem likely this is 
the issue.

         

        Also I found discussion of sysctl from 2002 that talks about netfilter 
and having a parameter for tcp_ct_close_wait_timeout but nothing newer than 
that so I'm not sure it is still relevant.   There is no such parameter on my 
system and I'm not keen on trying netfilter just to get this unless someone 
has done it more recently.    (I do have iptables installed.)
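
To check whether a given kernel exposes any CLOSE_WAIT related tunable, the /proc/sys tree can be searched by name. This is a minimal sketch; the helper name is made up for illustration, and the exact key name varies between kernels, so none is assumed here. Note that, as far as I know, netfilter connection-tracking timeouts govern entries in the firewall's tracking table and not the state of a socket held by a local process.

```python
import os


def find_sysctl_keys(substring, root="/proc/sys"):
    """Return dotted sysctl names under root whose file name contains substring."""
    matches = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            if substring in name:
                rel = os.path.relpath(os.path.join(dirpath, name), root)
                matches.append(rel.replace(os.sep, "."))
    return sorted(matches)


if __name__ == "__main__":
    for key in find_sysctl_keys("close_wait"):
        print(key)
```

An empty result means the running kernel offers no such parameter, which matches what the post reports.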

         

         

        Jeffrey C. Lightner

        Unix Systems Administrator

        DS Waters of North America

        678-486-3516

         


