Veritas-bu

Re: [Veritas-bu] Master server can't see media servers

2007-07-06 13:28:29
Subject: Re: [Veritas-bu] Master server can't see media servers
From: "Martin Ruslan" <mit.martin AT gmail DOT com>
To: "Sponsler, Michael" <Michael.Sponsler AT ngc DOT com>, VERITAS-BU AT mailman.eng.auburn DOT edu
Date: Sat, 7 Jul 2007 00:06:47 +0700
Oh my..
my address was listed..
I forgot to remove it.
Btw, if you're still not sure where the problem is: you said that /usr/openv is mounted from another disk via an FC connection
and your OS is Solaris 10. I had almost the same environment..

It could be the Solaris 10 TCP fusion issue. You should turn it off
until Sun releases patches for this issue.
In my case, /usr/openv/netbackup was mounted over FC from another disk, and every time we ran the catalog backup, it failed.

Try adding this to your /etc/system file: "set ip:do_tcp_fusion = 0" (without quotes),
then do a reboot -- -r
After that, my catalog backup ran properly.. :D
Hopefully it works for you, Michael..
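
A minimal sketch for checking whether the setting is already in place before editing the file. The helper name tcp_fusion_disabled is made up for illustration, and it assumes a grep that supports -E (on Solaris 10 that may mean /usr/xpg4/bin/grep rather than /usr/bin/grep). It only inspects the file; it does not tell you what the running kernel is using.

```sh
# Hypothetical helper (not part of Solaris or NetBackup):
# succeeds if the given file contains an active line that sets
# ip:do_tcp_fusion to 0 (or 0x0).
tcp_fusion_disabled() {
    # Comment lines in /etc/system start with '*'; the pattern is anchored
    # to "set" at the start of the line, so commented-out copies do not match.
    grep -E '^[[:space:]]*set[[:space:]]+ip:do_tcp_fusion[[:space:]]*=[[:space:]]*(0|0x0)[[:space:]]*$' "$1" >/dev/null
}
```

Something like "tcp_fusion_disabled /etc/system || echo 'not set yet'" would then tell you whether the line still needs to be added. The change only takes effect after the reboot.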

@ Ed:
Gee.. I didn't know about those new features in 6.0 :)
Do you know the maximum number of media servers we can have on NBU 6, Ed?
Thanks for the info.. :D

Regards,
mTz




On 7/6/07, Ed Wilts <ewilts AT ewilts DOT org> wrote:

50+ media servers should not be that risky with 6.0 – one of the driving reasons behind the 6.0 release was for scalability, especially with a large number of media servers.

 

Are the pbx processes all running properly?  If they weren't, then I could picture the symptoms you're seeing with ssh/ping working, but NetBackup not working.
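
A minimal sketch of that check, assuming the PBX daemon shows up in the process list as pbx_exchange (that name is an assumption here, so confirm it on your own hosts). The helper name pbx_running is made up for illustration; it reads a process listing on stdin so it can be fed from ps on either the master or a media server.

```sh
# Hypothetical helper: succeeds if the process listing on stdin
# contains a pbx_exchange process (process name is an assumption).
pbx_running() {
    # The [p] bracket keeps the pattern from matching the grep command
    # itself when the listing comes from "ps -ef".
    grep '[p]bx_exchange' >/dev/null
}
```

For example: ps -ef | pbx_running || echo "pbx is not running on this host"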

 

               …/Ed

--

Ed Wilts, Mounds View, MN, USA

mailto:ewilts AT ewilts DOT org

I GoodSearch for Bundles Of Love:  http://www.goodsearch.com/?charityid=821118

 

From: veritas-bu-bounces AT mailman.eng.auburn DOT edu [mailto:veritas-bu-bounces AT mailman.eng.auburn DOT edu] On Behalf Of Sponsler, Michael
Sent: Friday, July 06, 2007 1:46 AM
To: Martin Ruslan; veritas-bu AT mailman.eng.auburn DOT edu


Subject: Re: [Veritas-bu] Master server can't see media servers

 

Yeah, I know having 50+ media servers is risky....but there really is no other way to properly do it.

 

I'm beginning to think it isn't netbackup.  My /usr/openv directory is mounted via a direct connect fibre on a Sun 9990 raid.  I'm running veritas file system (vxfs) 5.0.  I'm also doing Veritas Volume Replicator (VVR) to another Sun 9990 raid at a DRS site.  Both sites are connected via a large, private pipe...so bandwidth between the two sites isn't an issue.  But I'm seeing this in my /var/adm/messages file:

 

vxio V-5-0-0 disconnecting rlink rlk_<hostname>_bkprvg due to exessive retries.
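
To see how often this is happening, a small sketch like the following counts the matching lines in a messages file. The helper name count_rlink_disconnects is made up, and the search string is simply taken from the message above.

```sh
# Hypothetical helper: print how many lines in the given log file
# mention an rlink disconnect (0 if there are none).
count_rlink_disconnects() {
    # grep -c prints the number of matching lines instead of the lines
    grep -c 'disconnecting rlink' "$1"
}
```

For example: count_rlink_disconnects /var/adm/messages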

 

Also...when some stuff hangs, and I go into /usr/openv (or any directory under that) my terminal hangs.  I can ssh back into the box, and I'm okay until I navigate into /usr/openv.  I've done a full fsck of the vxfs file system, it wasn't clean the first time...but has come back clean since.

 

So I may be having vxfs issues.  :-/  Yippie....

 

--

Mike Sponsler

Michael.Sponsler AT ngc DOT com

Northrop Grumman Information Technology

 

 


From: Martin Ruslan [mailto: mit.martin AT gmail DOT com]
Sent: Friday, July 06, 2007 1:49 AM
To: veritas-bu AT mailman.eng.auburn DOT edu
Cc: Sponsler, Michael
Subject: Re: [Veritas-bu] Master server can't see media servers

Yeap..
it's odd.. :)
well.. as far as I know, it's risky to have that many media servers,
because they always communicate with each other, and if even one media server can't talk, the process will hang.

Have you already checked this:

- in /usr/openv/netbackup/bp.conf, are all the media servers registered there, with "SERVER = media_server_name" (without quotes)?

yes
for all of the media servers too?


Check the output of /usr/openv/netbackup/bin/bpps -a for hung processes.
Try to kill the hung process with kill -9 <PID of the hung process>
Then you'll know which media server had the problem.

Regards,
mTz

On 7/6/07, Sponsler, Michael <Michael.Sponsler AT ngc DOT com> wrote:

- Are all of your media servers registered in the /etc/hosts file on the master server?
yes.  The environment had been working for several months.  No recent patch updates or changes (that I'm aware of).  The master server lost "netbackup communication" with the media servers upon restarting the netbackup daemons.

 

- Are the master server name and IP address listed in /etc/hosts on all the media servers?

 yes.  I can ping and ssh to all media servers and vice versa

 

- in /usr/openv/netbackup/bp.conf, are all the media servers registered there, with "SERVER = media_server_name" (without quotes)?

yes

 

- check on the media server:  "/usr/openv/volmgr/bin/vmglob  get_gdbhost"

It gives me the master server's hostname

 

 

It's odd, huh?

 

--

Mike Sponsler

Michael.Sponsler AT ngc DOT com

Northrop Grumman Information Technology

 

 


From: Martin Ruslan [mailto: mit.martin AT gmail DOT com ]
Sent: Friday, July 06, 2007 1:36 AM
To: veritas-bu AT mailman.eng.auburn DOT edu
Cc: Sponsler, Michael
Subject: Re: [Veritas-bu] Master server can't see media servers

- Are all of your media servers registered in the /etc/hosts file on the master server?
- Are the master server name and IP address listed in /etc/hosts on all the media servers?
- In /usr/openv/netbackup/bp.conf, are all the media servers registered there, with "SERVER = media_server_name" (without quotes)?
- Check on the media server:  "/usr/openv/volmgr/bin/vmglob  get_gdbhost"
  Is the result the master server? If not, run: "/usr/openv/volmgr/bin/vmglob  set_gdbhost master_server_name"
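
With 50+ media servers, the bp.conf check is easier to script than to eyeball. A minimal sketch, assuming you keep a plain text file with one media server name per line (that list file and the helper name missing_server_entries are both made up for illustration) and a POSIX awk (nawk on Solaris). It does an exact, case-sensitive name comparison and expects a non-empty bp.conf.

```sh
# Hypothetical helper: print every name from the list file ($2) that has
# no "SERVER = name" line in the bp.conf-style file ($1).
missing_server_entries() {
    awk -F= '
        NR == FNR {                          # first file: bp.conf
            if ($1 ~ /^[ \t]*SERVER[ \t]*$/) {
                v = $2
                gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim surrounding blanks
                seen[v] = 1
            }
            next
        }
        {                                    # second file: names to check
            name = $0
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            if (name != "" && !(name in seen)) print name
        }
    ' "$1" "$2"
}
```

For example: missing_server_entries /usr/openv/netbackup/bp.conf media_servers.txt
No output means every name in the list has a SERVER line. The same idea can be pointed at /etc/hosts with a different match rule.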

Check it, and give us the result.. :)

Regards,
mTz

On 7/6/07, Sponsler, Michael <Michael.Sponsler AT ngc DOT com > wrote:

Netbackup 6.0 MP4, solaris 10 master server; Netbackup 6.0 MP4, solaris 8 san media servers

Roughly 55 media servers.

 

After I rebooted the netbackup daemons, I had the following issue:

Master server can ping and ssh to all media servers, but Master server cannot communicate to media servers via netbackup.  vmoprcmd command hangs.  Any jobs for media servers that start up come back with "Media server is not active".  Netbackup is running on master server and all media servers.  Tried rebooting the master server with same outcome.  There is no firewall between the master server and any media servers.

 

Something obviously changed before I restarted the Netbackup daemons....anyone have any ideas?

 

--

Mike Sponsler

Northrop Grumman Information Technology


_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu


_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu