[Veritas-bu] Problem with Netbackup 3.2

2003-06-02 15:58:58
Subject: [Veritas-bu] Problem with Netbackup 3.2
From: brian.blake AT veritas DOT com (Brian Blake)
Date: Mon, 02 Jun 2003 15:58:58 -0400
Wes-

Given the large number of attempts to connect to a port on the server
(bpsched opens a socket connection back to itself to pass admin data), and
the fact that the port numbers are dipping into the 500-600 range, you may
be running out of reserved ports. A number of the NetBackup processes use
reserved ports for their admin data (bpsched, bpbrm, bptm...). NetBackup
will typically start assigning ports up around 1023 and count down towards
512, so connections on the lower port numbers suggest the higher ones are
already taken. The problem is that you could start running into well-known
services down in the 512 range, which can cause NetBackup to puke.
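
If you want to eyeball how far down the ports are going, something like the
following rough Python sketch can tally the getsockconnected lines from a
bpsched log. The function name, the 600 cutoff, and the SAMPLE text (a few
lines copied from the log quoted below) are just illustrative choices on my
part, not anything NetBackup provides.

```python
import re
from collections import Counter

# Matches lines such as:
# 10:38:25 [1787] <2> getsockconnected: Connect to backup on port 689
CONNECT_RE = re.compile(r"getsockconnected: Connect to (\S+) on port (\d+)")


def summarize_ports(log_text, low_cutoff=600):
    """Summarize the port numbers found in getsockconnected log lines.

    low_cutoff is an arbitrary threshold (hypothetical, pick your own)
    used to flag ports near the bottom of the reserved range.
    """
    ports = [int(m.group(2)) for m in CONNECT_RE.finditer(log_text)]
    return {
        "attempts": len(ports),
        "lowest": min(ports) if ports else None,
        "highest": max(ports) if ports else None,
        "below_cutoff": sorted(p for p in ports if p < low_cutoff),
        "repeated": {p: n for p, n in Counter(ports).items() if n > 1},
    }


# A few lines taken from the bpsched log in this thread.
SAMPLE = """\
10:38:25 [1787] <2> getsockconnected: Connect to backup on port 689
10:38:26 [1787] <2> getsockconnected: Connect to backup on port 529
10:38:27 [1787] <2> getsockconnected: Connect to backup on port 947
10:38:30 [1787] <2> getsockconnected: Connect to backup on port 1017
10:38:35 [1787] <2> getsockconnected: Connect to backup on port 947
"""

if __name__ == "__main__":
    print(summarize_ports(SAMPLE))
```

A lot of attempts clustered well below 1023, or the same ports coming up
again and again, would at least be consistent with the reserved range
getting crowded.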

Check out technote 230050... You may need to decrease your
tcp_close_wait_interval or tcp_time_wait_interval (depending on the OS on
the master server). That might do the trick for you...
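
To see why shortening that interval can help, here is a back-of-the-envelope
Python sketch. It assumes (as a simplification) that every closed connection
ties up its reserved port for the full wait interval, and that the whole
512-1023 range is usable. The two interval values are examples only, not
values read from your server, and the function name is made up for
illustration.

```python
RESERVED_LOW, RESERVED_HIGH = 512, 1023  # reserved range discussed above


def max_sustained_rate(time_wait_ms, low=RESERVED_LOW, high=RESERVED_HIGH):
    """Rough ceiling on new connections per second.

    Assumes each connection holds one port in [low, high] for
    time_wait_ms milliseconds after it closes (a simplification).
    """
    ports = high - low + 1
    return ports / (time_wait_ms / 1000.0)


if __name__ == "__main__":
    # Example interval values in milliseconds (assumptions for illustration).
    for ms in (240_000, 60_000):
        rate = max_sustained_rate(ms)
        print(f"{ms} ms -> about {rate:.1f} connections/sec")
```

The point is only that the ceiling scales inversely with the interval: cut
the interval to a quarter and the same pool of ports supports roughly four
times the connection rate.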

B-

On 6/2/03 1:47 PM, "Wes Neal" <wes.neal AT mci DOT com> wrote:

> No change in IP for quite some time
> There is only one server and it has been rebooted multiple times since
> this problem occurred.
> # more bp.conf
> SERVER = backup
> CLIENT_NAME = backup
> 
> Which is my server name.  Any idea what else could be causing the error
> 16?
> 
> This is a single server with the L1000 hooked directly to it, so I would
> think the network would not be involved.
> 
> 
> 
> ----
> Wes Neal
> MCI GNMSS - OSS
> (813)829-6915
> VNET: 838-6915
> Sametime: wes.neal
> 
> 
> -----Original Message-----
> From: veritas-bu-admin AT mailman.eng.auburn DOT edu
> [mailto:veritas-bu-admin AT mailman.eng.auburn DOT edu] On Behalf Of Markham,
> Richard
> Sent: Monday, June 02, 2003 11:33 AM
> To: 'veritas-bu AT mailman.eng.auburn DOT edu'
> Subject: RE: [Veritas-bu] Problem with Netbackup 3.2
> 
> 
> 
> I'd say it's more of a connection problem than a drive problem.
> 
> (1) Has there been any change in IP?
> (2) Have you bounced services on both master and media?
> (3) Are the appropriate SERVER = entries in bp.conf on both master and media?
> 
> -----Original Message-----
> From: Wes Neal [mailto:wes.neal AT mci DOT com]
> Sent: Monday, June 02, 2003 11:27 AM
> To: veritas-bu AT mailman.eng.auburn DOT edu
> Subject: RE: [Veritas-bu] Problem with Netbackup 3.2
> 
> 
> Says it is up: 
> 
> Device Robot Drive       Robot                    Drive   Device          Second
> Type     Num Index  Type DrNum Status  Comment    Name    Path            Device Path
> robot      0    -    TLD    -       -  -          -       /dev/sg/c2t0l0
> drive      -    0    dlt    1      UP  -          Drive0  /dev/rmt/0cbn
> 
> 
> ---- 
> Wes Neal 
> MCI GNMSS - OSS 
> (813)829-6915 
> VNET: 838-6915 
> Sametime: wes.neal
> 
> 
> -----Original Message-----
> From: Teklu, Daniel [mailto:daniel.teklu AT tfn DOT com]
> Sent: Monday, June 02, 2003 11:21 AM
> To: 'Wes Neal'; veritas-bu AT mailman.eng.auburn DOT edu
> Subject: RE: [Veritas-bu] Problem with Netbackup 3.2
> 
> 
> Check whether the drive is down. I get that when my drive is down. Check the
> status with
> 
> # ./tpconfig -l 
> 
> if it is down, try to bring it up by
> 
> # ./vmoprcmd -up Drive_Index
> 
> Good Luck 
> Daniel 
> 
> -----Original Message-----
> From: Wes Neal [mailto:wes.neal AT mci DOT com]
> Sent: Monday, June 02, 2003 11:10 AM
> To: veritas-bu AT mailman.eng.auburn DOT edu
> Subject: [Veritas-bu] Problem with Netbackup 3.2
> 
> 
> Howdy people, hopefully someone can give me some insight on this.  I
> have had this implementation of Netbackup 3.2 running for about 3 hours
> without a hitch and suddenly about 4 weeks ago it quit working.  I am
> running Solaris 7.  Basically whatever Netbackup tries to do it gets an
> error 219 or 213.  Here is a snippet from the bpsched log:
> 
> 10:38:24 [1787] <4> bpsched: INITIATING...
> 10:38:24 [1787] <2> logparams: /usr/openv/netbackup/bin/bpsched
> 10:38:24 [1787] <4> bpsched_main: wait_on_que=0, timeout_in_que=36000, reread_interval=300,queue_on_error=0, bptm_query_timeout=480
> 10:38:24 [1787] <4> db_ATTDEFS: Class_att_defs protocol 3.2
> 10:38:24 [1787] <4> bpsched_main: VSMInit () failed: 2d
> 10:38:24 [1787] <4> ?:   post process images:   yes
> 10:38:24 [1787] <4> ?:   keep tir:   yes
> 10:38:25 [1787] <2> getsockconnected: Connect to backup on port 689
> 10:38:26 [1787] <2> getsockconnected: Connect to backup on port 529
> 10:38:27 [1787] <2> getsockconnected: Connect to backup on port 947
> 10:38:28 [1787] <2> getsockconnected: Connect to backup on port 783
> 10:38:29 [1787] <2> getsockconnected: Connect to backup on port 593
> 10:38:30 [1787] <2> getsockconnected: Connect to backup on port 1017
> 10:38:31 [1787] <2> getsockconnected: Connect to backup on port 787
> 10:38:32 [1787] <2> getsockconnected: Connect to backup on port 695
> 10:38:33 [1787] <2> getsockconnected: Connect to backup on port 561
> 10:38:34 [1787] <2> getsockconnected: Connect to backup on port 641
> 10:38:35 [1787] <2> getsockconnected: Connect to backup on port 947
> 10:38:36 [1787] <2> getsockconnected: Connect to backup on port 799
> 10:38:37 [1787] <2> getsockconnected: Connect to backup on port 721
> 10:38:38 [1787] <2> getsockconnected: Connect to backup on port 873
> 10:38:39 [1787] <2> getsockconnected: Connect to backup on port 787
> 10:38:40 [1787] <2> getsockconnected: Connect to backup on port 647
> 10:38:41 [1787] <2> getsockconnected: Connect to backup on port 945
> 10:38:42 [1787] <2> getsockconnected: Connect to backup on port 817
> 10:38:43 [1787] <2> getsockconnected: Connect to backup on port 947
> 10:38:44 [1787] <2> getsockconnected: Connect to backup on port 815
> 10:38:45 [1787] <2> getsockconnected: Connect to backup on port 849
> 10:38:46 [1787] <2> getsockconnected: Connect to backup on port 729
> 10:38:47 [1787] <2> getsockconnected: Connect to backup on port 787
> 10:38:48 [1787] <2> getsockconnected: Connect to backup on port 663
> 10:38:49 [1787] <2> getsockconnected: Connect to backup on port 817
> 10:38:50 [1787] <2> getsockconnected: Connect to backup on port 929
> 10:38:51 [1787] <2> getsockconnected: Connect to backup on port 947
> 10:38:52 [1787] <2> getsockconnected: Connect to backup on port 831
> 10:38:53 [1787] <2> getsockconnected: Connect to backup on port 977
> 10:38:54 [1787] <2> getsockconnected: Connect to backup on port 585
> 10:38:55 [1787] <2> getsockconnected: Connect to backup on port 787
> 10:38:56 [1787] <2> getsockconnected: Connect to backup on port 871
> 10:38:56 [1787] <16> getsockconnected: exceeded timeout of 30 seconds
> 10:38:56 [1787] <2> getsockconnected: sockfd:-1 timo:32
> 10:38:56 [1787] <16> bpcr_connect: Can't connect to client backup
> 10:38:56 [1787] <16> start_bptm: connection refused by host backup
> 10:38:56 [1787] <16> get_stunits: get_num_avail_drives failed with stat 204
> 10:38:56 [1787] <4> get_db_info: no available storage units
> 10:38:56 [1787] <8> bpsched_main: failed getting database information
> 10:38:56 [1787] <16> log_in_errorDB: scheduler exiting - no storage units available for use (213)
> 10:38:56 [1787] <16> bpsched: scheduler exiting - no storage units available for use (213)
> 
> 
> I have not seen these before.  Any help would be much appreciated.
> 
> Thanks 
> Wes 
> 
> 
> ---- 
> Wes Neal 
> 
> _______________________________________________
> Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
> http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
> 
> 

-- 
Brian Blake
Professional Services Organization
VERITAS Software
brian.blake AT veritas DOT com