I was reminded about the following email and want to thank people for their
help and explain the resolution. A reminder of the issue comes first; the
fun stuff is after that.
> > Date: Sat, 29 Jan 2005 18:20:09 -0600
> >
> > TechNote 274544 provides ideas to reduce the burden on the NBU 5.1 software
> > in large environments. Since our upgrade 9 weeks ago, we have been
> > bouncing NBU almost daily due to hung backups and we are a large 24x7
> > environment and have seen limited improvements after 9 weeks of an open
> > case with Veritas - and then seeing this TechNote. I find it ironic that
> > the re-branding of 5.1 to Enterprise Server and a technote that says not to
> > stress BPSCHED in Enterprise Server environments can occur, so I'd like to
> > see if I'm alone here.
> >
> > We have had several issues upgrading to NBU 5.1 MP1 and now MP2 where we
> > are unable to submit many backups (queued or active) at once. The recent
> > TechNote 274544 fits our account perfectly and I am wondering if any other
> > large NBU shops are experiencing similar issues. I have a hard time
> > believing this TechNote was generated just because of us. Veritas shows no
> > desire to address this other than to wait until release 6.x and I could use
> > some friends that will either state that they have an issue or help me push
> > a fix through.
OK - New Date: Today
We have a site-specific BPSCHED binary that will be made available in 5.1 MP3,
coming to a theatre near you soon. This should resolve many problems in the
accounts that I have been in contact with. Although BPSCHED hasn't changed
much from 4.5 to 5.1, it changed enough. Here is my understanding of what
changed that was related to our issues.
NBU 4.5 did Version Checking for the purposes of In-Line Tape Copy (ITC).
Because 4.5 was backwards compatible, they checked to see if there was a 3.4
client which didn't support ITC. It also appears that 5.1 handled directives
as to what was to be backed up differently. So, when you use a directive of
"All Local Hard Drives", NBU did the analysis of what this meant and
interrogated each server sequentially to resolve this before the job would even
appear as a queued job in the Activity Monitor. In other words, there are a
lot of background security checks utilizing resources that you can't see. If
you have any clients with network connectivity issues, BPSCHED will wait for
the timeout value before querying the next server. Meanwhile, you may hit the
next backup window and continue to backlog BPSCHED with resolution issues that
are invisible to you.
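
To show why a handful of unreachable clients can hurt so much when servers are queried one after another, here is a rough Python sketch. It is not NetBackup code: the function name, the 300-second timeout, and the 1-second reply time are all made-up values for illustration only.

```python
def sequential_resolution_delay(reachable, timeout_s=300.0, reply_s=1.0):
    """Total seconds spent querying clients one at a time.

    reachable: list of booleans, True if the client answers.
    timeout_s and reply_s are hypothetical values, not NBU defaults.
    """
    total = 0.0
    for ok in reachable:
        # A healthy client answers quickly; a dead one costs a full timeout
        # before the next client is even asked.
        total += reply_s if ok else timeout_s
    return total


# 100 clients, 5 of them with network connectivity problems (assumed numbers).
clients = [True] * 95 + [False] * 5
delay = sequential_resolution_delay(clients)
print(f"Delay before jobs can queue: {delay / 60:.1f} minutes")
```

With these assumed numbers, the five bad clients account for 1,500 of the 1,595 seconds of waiting, and none of that time shows up as a queued job.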
The fix was that NBU 5.1 doesn't need to do this version checking from 5.1 back
to 4.5 for ITC. Since they took this out of BPSCHED, we have been able to push
the system with over 2,000 concurrent backup jobs and nothing is getting
delayed. Windows servers seem to be especially susceptible to this problem.
Because of this, we are going to go back to "All Local Drives" this weekend
along with ITC. I have run just over a week without issues with ITC turned
back on. I am anxious to see how the new directive of "All Local Drives"
works. Per my reference to the technote, we have also tried to "not stress"
NBU by breaking our policies into multiple policies, each with a convenient
schedule. If this new directive works, my next step will be to get back to
life as normal: submit everything at once, let NBU determine when resources
are available, and run my backups. It will be really good to wake up and see
that my backups are done.
So far, this has been a wonderful fix that 2 Veritas back-end Engineers have
been involved in on daily calls for 15 weeks. I must say that Veritas really
gave us the resources to resolve this, but we will have a post-mortem as to
what took so long to get their attention. Coming together on this site and
finding friends helped a lot ... this is a useful tool for communication and
resolution. So, I send my thanks to many people.
We do have a special binary for BPSCHED, and they want us to upgrade to MP3
soon, but MP3 includes other improvements as well. If it ain't broke, don't
fix it. I am very happy where I am. I am gun-shy about even applying MP3;
15 weeks was a long burden for many of us support people.
I wrote this email in hopes that it will help some people who are now 7 weeks
without a life, my friends, and people considering upgrading. My heart feels
that 5.1 MP3 will be solid.
I hope that everyone finds the release that keeps them solid and working
and with family.
Brian