Veritas-bu

[Veritas-bu] status code 50

2004-01-21 10:20:44
Subject: [Veritas-bu] status code 50
From: Paul.Griese AT TeleCheck DOT com (Griese, Paul)
Date: Wed, 21 Jan 2004 09:20:44 -0600
 
NBU 4.5 MP3 on Solaris Master, Solaris 8, Ultra-4, running about 370 active
policies; catalog DB is about 64 GB and hasn't been compressed lately. We
have many Solaris, NT and VMS clients. A bunch of the clients are on a SAN.
We save about 4.5 to 5.2 TB a day and we use 4 L700 robots.
 
Everything was running great. At the end of the year we went 34 days with
uninterrupted NetBackup service, a record for us, without having to bounce
NetBackup once. Now everything has gone to heck. We can't go two days
without many jobs dying with status code 50, always in the early morning
hours, and it happens nearly every morning. After a few days of this, things
deteriorate to the point where jobs just hang and cannot be killed, and we
wind up having to bounce NetBackup. We have tried rescheduling jobs so that
fewer run between midnight and 6 AM, but it doesn't seem to help. We are
actually busier after 6 AM, but we don't get this rash of code 50s then. We
have added a few more active policies in the past month, but we have also
purged data from some other clients, which has shortened their backup jobs.
 
Veritas has been little help. They have told us three different things:
1) try running two Masters; 2) move your Master to a more powerful Sun box;
3) install MP5. They seem to imply that we have overloaded our Sun Master,
but the uptime and top commands don't show excessive load on CPU or memory.
 
We are going to try installing MP5. The release notes mention status code
50, but only in relation to "queued vault job receives a status 50", which
is not exactly our problem. We run Vault in the afternoon and those jobs do
not have the status 50 problem. The problem lies with our normal backups,
not Vault.
 
So, has anybody had an experience like this? Did MP5 help? Is an Ultra-4 not
powerful enough for our environment?
 
 
Paul Griese
System Management
713-331-6454
 
