Griese, Paul wrote:
>
> NBU 4.5 MP3 on Solaris Master, Solaris 8, Ultra-4, running about 370
> active policies; catalog DB is about 64 GB and hasn't been compressed
> lately. We have many Solaris, NT and VMS clients. A bunch of the
> clients are on a SAN. We save about 4.5 to 5.2 TB a day and we use 4
> L700 robots.
>
> Everything was running great. We went 34 days with uninterrupted
> NetBackup service at the end of the year - a record for us - without
> having to bounce NetBackup once. Now everything has gone to heck.
> We can't go two days without many jobs dying with status code 50,
> always in the early morning hours, and it happens almost every
> morning. After a few days of this, things deteriorate to the point
> where jobs just hang and cannot be killed, and we wind up having to
> bounce NetBackup. We have tried rescheduling jobs so that fewer of
> them run between midnight and 6AM, but it doesn't seem to help. We
> are actually busier after 6AM, but we don't get this rash of status
> 50s after 6AM. We have added a few more active policies in the past
> month, but we have also purged data from some other clients, which
> has shortened their backup jobs.
>
> Veritas has been little help. They have told us three different
> things: 1) try running two Masters; 2) move your Master to a more
> powerful Sun box; 3) install MP5. They seem to imply that we have
> overloaded our Sun Master, but the uptime and top commands don't
> show excessive load on CPU or memory.
>
> We are going to try installing MP5. The release notes mention status
> code 50, but only for "queued vault job receives a status 50",
> which is not exactly our problem. We run our Vault jobs in the
> afternoon and they do not have the status 50 problem. The problem is
> with our normal backups, not Vault.
>
> So, has anybody had an experience like this? Did MP5 help? Is an
> Ultra-4 not powerful enough for our environment?
>
>
> Paul Griese
> System Management
> 713-331-6454
>
Are you using SSO with more than one media server having access
to the tape drives? If so, you might have to do a Synchronize
Global Device Database on the media servers. I don't know why
it happens, but that seems to fix our status 50 problem.
Gregg
--
=================================================================
Gregg MacKinnon Ford Motor Co.
gmackinn AT ford DOT com 2101 Village Rd.
Technical Computing Section Rm 1116, MD 1076
(313) 594-3716 pager 7958343 Dearborn Michigan, 48124
==================================================================