Just a quick update on our status. Symantec tech
support eventually told me this was an OS issue, so I engaged Microsoft
support. While I was transferring knowledge (and log files) to
Microsoft, the Microsoft technician noticed that VNETd connections
were failing on both successful and failed database backups.
This information led the Symantec engineer to suggest we move the database
servers off the reserved port range and into the non-reserved port
range. I applied the following to seven database servers that were failing
nightly. None of them failed last night, after the fix.
Add each client to the Master Server's Client Attributes
with the following settings:
BPCD connect back: Random Port
Ports: Non-reserved port
Daemon connection port: Daemon port only
This seems to have done the trick, but I don't quite
understand why. The best I can figure is that after the database client
requests a user backup from the master, the master attempts to "talk back" to
the client via the reserved port range, and we're running out of reserved
ports. This answer seems a little ridiculous to me, so I'm probing for more
information from Symantec before I close this ticket.
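To make the "running out of reserved ports" idea concrete, here is a toy model in Python. It is only a sketch of the hypothesis, not of how NetBackup actually allocates ports. Everything in it is an assumption for illustration: the PortPool class, the simulate function, a reserved window of 512-1023, a 240-second wait before a closed port can be reused, and a rate of five short connections per second.

```python
from collections import deque


class PortPool:
    """Toy pool of local ports. A released port stays unusable for `linger` ticks."""

    def __init__(self, low, high, linger):
        self.free = deque(range(low, high + 1))
        self.cooling = deque()  # (tick_when_reusable, port), oldest first
        self.linger = linger

    def tick(self, now):
        # Return ports whose waiting period has expired to the free list.
        while self.cooling and self.cooling[0][0] <= now:
            self.free.append(self.cooling.popleft()[1])

    def acquire(self):
        # Hand out a free port, or None if the pool is exhausted.
        return self.free.popleft() if self.free else None

    def release(self, port, now):
        self.cooling.append((now + self.linger, port))


def simulate(low, high, linger, opens_per_tick, ticks):
    """Return how many connection attempts found no free port."""
    pool = PortPool(low, high, linger)
    failures = 0
    for now in range(ticks):
        pool.tick(now)
        for _ in range(opens_per_tick):
            port = pool.acquire()
            if port is None:
                failures += 1
            else:
                # Model a short-lived connection that closes right away.
                pool.release(port, now)
    return failures


if __name__ == "__main__":
    # Assumed numbers, chosen only to illustrate the effect.
    print("reserved 512-1023:", simulate(512, 1023, 240, 5, 600))
    print("non-reserved 1024-5000:", simulate(1024, 5000, 240, 5, 600))
```

With these made-up numbers the small window needs roughly 5 x 240 = 1200 ports in flight but only has 512, so some attempts fail, while the larger window never runs dry. That would also fit the pattern of re-runs succeeding once ports free up, though it says nothing about why only database clients were hit.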
Does anyone have a better explanation of how NetBackup's
ports / access work? Some critical questions in my mind still need to be
answered, such as: why did this only affect database servers? Can I increase the
reserved port range and get around having to configure / use the non-reserved
port range? Additionally, I got an error 21 on a DSSU duplication job this
morning; I wonder if that could be related.
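One way to gather evidence on the reserved port question would be to capture `netstat -an` output on the master during the backup window and count how many TCP connections use a local port below 1024, grouped by state. The following Python is a rough sketch of that; the function name reserved_port_usage is made up, and it assumes the usual four-column netstat layout (Proto, Local Address, Foreign Address, State).

```python
from collections import Counter

RESERVED_MAX = 1023  # ports below 1024 are the conventional "reserved" range


def reserved_port_usage(netstat_text):
    """Count TCP connections whose local port is reserved, grouped by state."""
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Expect: Proto, Local Address, Foreign Address, State
        if len(fields) < 4 or fields[0].upper() != "TCP":
            continue
        local, state = fields[1], fields[3]
        try:
            # The port is whatever follows the last colon of the local address.
            port = int(local.rsplit(":", 1)[1])
        except (IndexError, ValueError):
            continue
        if port <= RESERVED_MAX:
            counts[state] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Example: netstat -an | python this_script.py
    for state, n in sorted(reserved_port_usage(sys.stdin.read()).items()):
        print(state, n)
```

A large and growing TIME_WAIT count on low local ports during the failures would support the exhaustion theory; a small one would point elsewhere.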
At least the fix above gets us running backups again and
hopefully this helps someone else in the future. I'll be taking the
Symantec Port Guide home tonight for some casual reading.
-Jonathan
Environment:
Master is Windows 2003 SP2 x86 running NBU 6.5.3, and clients are a mix of
Windows 2000 and 2003, x86 and x64, running NBU 5.1 MP4 & 5.
I've got the oddest issue, which I've been working on for two weeks. Two
weekends ago my RMAN and MS-SQL backups began failing with error 21. The jobs
don't fail right away, and not every job fails. An RMAN job might write none,
or all but one, of its datasets and then fail with Status 21 - Socket Open Error.
Re-runs are generally successful. I'd say about 30% of all RMAN and MS-SQL
backups are failing with this nightly, and it's not always the same database or
dataset. I moved the jobs from a media server to the master, and the issue
still occurs, leading me to believe this is a master server issue. I've had
a case open with Symantec all week, and they are definitely seeing errors when
the Master goes to open communications with the client to start the backup, but
cannot determine why (so far).
Has anyone else ever run into this, or have any ideas? I'm neck deep in the
Windows registry at this point trying to figure something out, but I've got no
joy so far. I've Googled and checked Symantec's site, but this doesn't seem to
be a common issue.
-Jonathan
_______________________________________________
Veritas-bu maillist - Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu