[Veritas-bu] Problem with Netbackup 3.2

2003-06-02 16:34:46
Subject: [Veritas-bu] Problem with Netbackup 3.2
From: david AT datastaff DOT com (David A. Chapa)
Date: Mon, 2 Jun 2003 15:34:46 -0500
Also check the bp.conf file for the master and media servers if they are
separate (sorry I didn't read the entire thread...so if this has been
covered, my apologies) and make sure you do not have
DISALLOW_SERVER_FILE_WRITES in any of those files.

This will disguise itself as a socket read error.
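
A quick way to run that check is to scan the bp.conf text for the entry. This is only a rough sketch: `find_directive` is a made-up helper name, skipping lines that start with "#" is an assumption about how comments might appear, and the sample text simply reuses the two entries Wes posted plus the offending line.

```python
def find_directive(conf_text, directive="DISALLOW_SERVER_FILE_WRITES"):
    """Return (line_number, line) pairs where the directive appears as an entry.

    The key is whatever precedes an optional "=", so both a bare keyword
    and a "KEY = value" style line are matched.
    """
    hits = []
    for number, raw in enumerate(conf_text.splitlines(), start=1):
        line = raw.strip()
        # Assumption: blank lines and "#" lines carry no setting.
        if not line or line.startswith("#"):
            continue
        key = line.split("=", 1)[0].strip()
        if key.upper() == directive:
            hits.append((number, line))
    return hits


# Hypothetical bp.conf contents for illustration only.
sample = "SERVER = backup\nCLIENT_NAME = backup\nDISALLOW_SERVER_FILE_WRITES\n"

for number, line in find_directive(sample):
    print(f"line {number}: {line}")
```

To check a real file, read it first (for example the bp.conf under /usr/openv/netbackup, if that is where it lives on your server) and pass the text in; repeat for each master and media server.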

>-----Original Message-----
>From: veritas-bu-admin AT mailman.eng.auburn DOT edu 
>[mailto:veritas-bu-admin AT mailman.eng.auburn DOT edu] On Behalf Of 
>Brian Blake
>Sent: Monday, June 02, 2003 2:59 PM
>To: Wes Neal; 'Markham, Richard'; veritas-bu AT mailman.eng.auburn DOT edu
>Subject: Re: [Veritas-bu] Problem with Netbackup 3.2
>
>
>Wes-
>
>Given the large number of attempts to connect to a port on the 
>server (bpsched opens a socket connection back to itself to 
>pass admin data), and the fact that it seems to be dipping 
>into the 500-600 range, you may have started to run out of 
>reserved ports. A number of the NetBackup processes use 
>reserved ports for their admin data (bpsched, bpbrm, bptm...), 
>and when you start seeing connections to the lower port 
>numbers (500-600), it could be that you're running out of 
>ports. NetBackup will typically start assigning ports up 
>around 1023 and count down towards 512... The problem is that 
>you could start running into well known services down in the 
>512 range, which can cause NetBackup to puke.
>
>Check out technote 230050... You may need to decrease your 
>tcp_close_wait_interval or tcp_time_wait_interval (depending 
>on the OS on the master server). That might do the trick for you...
>
>B-
>
>On 6/2/03 1:47 PM, "Wes Neal" <wes.neal AT mci DOT com> wrote:
>
>> No change in IP for quite some time.
>> There is only one server and it has been rebooted multiple times since
>> this problem occurred.
>> # more bp.conf
>> SERVER = backup
>> CLIENT_NAME = backup
>> 
>> Which is my server name.  Any idea what else could be causing the 
>> error 16?
>> 
>> This is a single server with the L1000 hooked directly to it, so I 
>> would think the network would not be involved.
>> 
>> 
>> 
>> ----
>> Wes Neal
>> MCI GNMSS - OSS
>> (813)829-6915
>> VNET: 838-6915
>> Sametime: wes.neal
>> 
>> 
>> -----Original Message-----
>> From: veritas-bu-admin AT mailman.eng.auburn DOT edu
>> [mailto:veritas-bu-admin AT mailman.eng.auburn DOT edu] On Behalf Of
>> Markham, Richard
>> Sent: Monday, June 02, 2003 11:33 AM
>> To: 'veritas-bu AT mailman.eng.auburn DOT edu'
>> Subject: RE: [Veritas-bu] Problem with Netbackup 3.2
>> 
>> 
>> 
>> I'd say it's more of a connection problem than a drive problem.
>> 
>> (1) Has there been any change in IP?
>> (2) Have you bounced services on both master and media?
>> (3) Are the appropriate server= entries in bp.conf on both master and
>> media?
>> 
>> -----Original Message-----
>> From: Wes Neal [mailto:wes.neal AT mci DOT com]
>> Sent: Monday, June 02, 2003 11:27 AM
>> To: veritas-bu AT mailman.eng.auburn DOT edu
>> Subject: RE: [Veritas-bu] Problem with Netbackup 3.2
>> 
>> 
>> Says it is up:
>> 
>> Device Robot Drive       Robot                    Drive   Device          Second
>> Type     Num Index  Type DrNum Status  Comment    Name    Path            Device Path
>> robot      0    -    TLD    -       -  -          -       /dev/sg/c2t0l0
>>   drive    -    0    dlt    1      UP  -          Drive0  /dev/rmt/0cbn
>> 
>> 
>> ----
>> Wes Neal 
>> MCI GNMSS - OSS 
>> (813)829-6915 
>> VNET: 838-6915 
>> Sametime: wes.neal
>> 
>> 
>> -----Original Message-----
>> From: Teklu, Daniel [mailto:daniel.teklu AT tfn DOT com]
>> Sent: Monday, June 02, 2003 11:21 AM
>> To: 'Wes Neal'; veritas-bu AT mailman.eng.auburn DOT edu
>> Subject: RE: [Veritas-bu] Problem with Netbackup 3.2
>> 
>> 
>> Check if the drive is down? I get that when my drive is down. Check 
>> the status by
>> 
>> # ./tpconfig -l
>> 
>> if it is down, try to bring it up by
>> 
>> # ./vmoprcmd -up Drive_Index
>> 
>> Good Luck
>> Daniel 
>> 
>> -----Original Message-----
>> From: Wes Neal [mailto:wes.neal AT mci DOT com]
>> Sent: Monday, June 02, 2003 11:10 AM
>> To: veritas-bu AT mailman.eng.auburn DOT edu
>> Subject: [Veritas-bu] Problem with Netbackup 3.2
>> 
>> 
>> Howdy people, hopefully someone can give me some insight on this.  I 
>> have had this implementation of Netbackup 3.2 running for about 3 
>> hours without a hitch and suddenly about 4 weeks ago it quit working.
>> I am running Solaris 7.  Basically whatever Netbackup tries to do it 
>> gets an error 219 or 213.  Here is a snippet from the bpsched log:
>> 
>> 10:38:24 [1787] <4> bpsched: INITIATING...
>> 10:38:24 [1787] <2> logparams: /usr/openv/netbackup/bin/bpsched
>> 10:38:24 [1787] <4> bpsched_main: wait_on_que=0, timeout_in_que=36000, reread_interval=300,queue_on_error=0, bptm_query_timeout=480
>> 10:38:24 [1787] <4> db_ATTDEFS: Class_att_defs protocol 3.2
>> 10:38:24 [1787] <4> bpsched_main: VSMInit () failed: 2d
>> 10:38:24 [1787] <4> ?:   post process images:   yes
>> 10:38:24 [1787] <4> ?:   keep tir:   yes
>> 10:38:25 [1787] <2> getsockconnected: Connect to backup on port 689 
>> 10:38:26 [1787] <2> getsockconnected: Connect to backup on port 529 
>> 10:38:27 [1787] <2> getsockconnected: Connect to backup on port 947 
>> 10:38:28 [1787] <2> getsockconnected: Connect to backup on port 783 
>> 10:38:29 [1787] <2> getsockconnected: Connect to backup on port 593 
>> 10:38:30 [1787] <2> getsockconnected: Connect to backup on port 1017 
>> 10:38:31 [1787] <2> getsockconnected: Connect to backup on port 787 
>> 10:38:32 [1787] <2> getsockconnected: Connect to backup on port 695 
>> 10:38:33 [1787] <2> getsockconnected: Connect to backup on port 561 
>> 10:38:34 [1787] <2> getsockconnected: Connect to backup on port 641 
>> 10:38:35 [1787] <2> getsockconnected: Connect to backup on port 947 
>> 10:38:36 [1787] <2> getsockconnected: Connect to backup on port 799 
>> 10:38:37 [1787] <2> getsockconnected: Connect to backup on port 721 
>> 10:38:38 [1787] <2> getsockconnected: Connect to backup on port 873 
>> 10:38:39 [1787] <2> getsockconnected: Connect to backup on port 787 
>> 10:38:40 [1787] <2> getsockconnected: Connect to backup on port 647 
>> 10:38:41 [1787] <2> getsockconnected: Connect to backup on port 945 
>> 10:38:42 [1787] <2> getsockconnected: Connect to backup on port 817 
>> 10:38:43 [1787] <2> getsockconnected: Connect to backup on port 947 
>> 10:38:44 [1787] <2> getsockconnected: Connect to backup on port 815 
>> 10:38:45 [1787] <2> getsockconnected: Connect to backup on port 849 
>> 10:38:46 [1787] <2> getsockconnected: Connect to backup on port 729 
>> 10:38:47 [1787] <2> getsockconnected: Connect to backup on port 787 
>> 10:38:48 [1787] <2> getsockconnected: Connect to backup on port 663 
>> 10:38:49 [1787] <2> getsockconnected: Connect to backup on port 817 
>> 10:38:50 [1787] <2> getsockconnected: Connect to backup on port 929 
>> 10:38:51 [1787] <2> getsockconnected: Connect to backup on port 947 
>> 10:38:52 [1787] <2> getsockconnected: Connect to backup on port 831 
>> 10:38:53 [1787] <2> getsockconnected: Connect to backup on port 977 
>> 10:38:54 [1787] <2> getsockconnected: Connect to backup on port 585 
>> 10:38:55 [1787] <2> getsockconnected: Connect to backup on port 787 
>> 10:38:56 [1787] <2> getsockconnected: Connect to backup on port 871 
>> 10:38:56 [1787] <16> getsockconnected: exceeded timeout of 30 seconds
>> 10:38:56 [1787] <2> getsockconnected: sockfd:-1 timo:32
>> 10:38:56 [1787] <16> bpcr_connect: Can't connect to client backup
>> 10:38:56 [1787] <16> start_bptm: connection refused by host backup
>> 10:38:56 [1787] <16> get_stunits: get_num_avail_drives failed with stat 204
>> 10:38:56 [1787] <4> get_db_info: no available storage units
>> 10:38:56 [1787] <8> bpsched_main: failed getting database information
>> 10:38:56 [1787] <16> log_in_errorDB: scheduler exiting - no storage units available for use (213)
>> 10:38:56 [1787] <16> bpsched: scheduler exiting - no storage units available for use (213)
>> 
>> I have not seen these before.  Any help would be much appreciated.
>> 
>> Thanks
>> Wes 
>> 
>> 
>> ----
>> Wes Neal 
>> 
>> _______________________________________________
>> Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu 
>> http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
>> 
>> 
>
>-- 
>Brian Blake
>Professional Services Organization
>VERITAS Software
>brian.blake AT veritas DOT com
>
>_______________________________________________
>Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu 
>http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
>
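
For anyone who wants to put numbers on the reserved-port theory discussed in this thread, here is a small sketch that pulls the port numbers out of the getsockconnected lines in a bpsched log and summarizes them. The function name `summarize_ports` and the `low_water` threshold of 600 are illustrative choices, not anything NetBackup defines; the sample lines are copied from the log excerpt in the original post.

```python
import re
from collections import Counter

# Matches lines such as:
#   10:38:25 [1787] <2> getsockconnected: Connect to backup on port 689
PORT_RE = re.compile(r"getsockconnected: Connect to (\S+) on port (\d+)")


def summarize_ports(log_text, low_water=600):
    """Summarize the ports seen in getsockconnected connect attempts."""
    ports = [int(m.group(2)) for m in PORT_RE.finditer(log_text)]
    counts = Counter(ports)
    return {
        "attempts": len(ports),
        "lowest": min(ports) if ports else None,
        "highest": max(ports) if ports else None,
        # Distinct ports at or below the (assumed) threshold of concern.
        "at_or_below_low_water": sorted(p for p in counts if p <= low_water),
        # Ports that were tried more than once.
        "repeated": {p: n for p, n in counts.items() if n > 1},
    }


sample = """\
10:38:25 [1787] <2> getsockconnected: Connect to backup on port 689
10:38:26 [1787] <2> getsockconnected: Connect to backup on port 529
10:38:27 [1787] <2> getsockconnected: Connect to backup on port 947
10:38:28 [1787] <2> getsockconnected: Connect to backup on port 783
10:38:29 [1787] <2> getsockconnected: Connect to backup on port 593
10:38:30 [1787] <2> getsockconnected: Connect to backup on port 1017
10:38:31 [1787] <2> getsockconnected: Connect to backup on port 787
10:38:35 [1787] <2> getsockconnected: Connect to backup on port 947
"""

for key, value in summarize_ports(sample).items():
    print(f"{key}: {value}")
```

A summary like this only describes what the log shows; whether low or repeated port numbers actually mean the reserved range is exhausted still has to be confirmed on the server itself.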