[Veritas-bu] MSSQL buffers and stripes - and no failure message on failure.

2009-02-05 15:25:55
Subject: [Veritas-bu] MSSQL buffers and stripes - and no failure message on failure.
From: "David McMullin" <David.McMullin AT CBC-Companies DOT com>
To: "veritas-bu AT mailman.eng.auburn DOT edu" <veritas-bu AT mailman.eng.auburn DOT edu>
Date: Thu, 5 Feb 2009 15:14:34 -0500
We recently moved some MSSQL databases from server A (16 GB RAM) to server B 
(4 GB RAM). The backup scripts call for $ALL.

The backup tries to start all 15 databases. It starts 8 (which back up 
successfully), fails on 9 - 15, and the parent job just hangs; no error code is 
returned to NetBackup. If we lower the buffers from 3 to 2 and the stripes from 
2 to 1, it works fine. 

The issues we have are 1) how to calculate buffers and stripes, and 2) why this 
is allowed to lock up and fail with no exit error code.
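
For question 1, a rough way to compare settings is to multiply everything out. 
This is only a sketch under stated assumptions: it assumes each database backup 
holds stripes x buffers-per-stripe transfer buffers at once, and the 4 MB 
transfer size is a placeholder (the post does not give the real value). The 
function name is made up for illustration.

```python
MB = 1024 * 1024


def backup_buffer_bytes(databases, stripes, buffers_per_stripe, transfer_size_bytes):
    """Estimate buffer memory if every database backup runs concurrently.

    Assumption: each stripe holds `buffers_per_stripe` buffers of
    `transfer_size_bytes` each. This is a planning estimate, not a
    figure taken from NetBackup or SQL Server.
    """
    return databases * stripes * buffers_per_stripe * transfer_size_bytes


# Settings from the post; the 4 MB transfer size is a hypothetical value.
failing = backup_buffer_bytes(15, stripes=2, buffers_per_stripe=3, transfer_size_bytes=4 * MB)
working = backup_buffer_bytes(15, stripes=1, buffers_per_stripe=2, transfer_size_bytes=4 * MB)

print(failing // MB, "MB with 2 stripes x 3 buffers")
print(working // MB, "MB with 1 stripe x 2 buffers")
```

Whatever the real transfer size is, under this assumption the failing settings 
ask for three times the buffer memory of the working ones.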



Here is detail from the log and Symantec support comments:

I think I found the root cause of the backup hanging. I looked through the 
dbclient log and see the following:
---
15:14:18.320 [7976.4920] <16> writeToServer: ERR - send() to server on socket failed:
15:14:18.320 [7976.4920] <16> dbc_put: ERR - failed sending data to server
15:14:18.445 [7976.4920] <16> VxBSASendData: ERR - Could not do a bsa_put().
15:14:18.445 [7976.4920] <16> DBthreads::dbclient: ERR - Error in VxBSASendData: 1.
---
Above we have a socket failure. This results in a failure to update the thread, 
which sets up the failure below:
---
15:14:18.445 [7976.4920] <16> CDBbackrec::ProcessVxBSAerror: ERR - Error in DBthreads::dbclient: 6.
15:14:18.445 [7976.4920] <1> CDBbackrec::ProcessVxBSAerror:     CONTINUATION: - The system cannot find the file specified.
15:14:18.445 [7976.4920] <16> DBthreads::dbclient: ERR - Error in VxBSAEndData: 6.
15:14:18.445 [7976.4920] <1> DBthreads::dbclient:     CONTINUATION: - The handle used to associate this call with a previous VxBSAInit() call is invalid.
---
At this point the application panics. See the entries below:
---
15:14:18.461 [7976.7632] <16> DBthreads::dbclient: ERR - Error in CompleteCommand: 0x80770004.
15:14:18.461 [7976.7632] <16> DBthreads::dbclient: ERR - A panic close was issued to dbclient #2.
15:14:18.461 [7976.6932] <16> DBthreads::dbclient: ERR - Error in CompleteCommand: 0x80770004.
15:14:18.523 [7976.6932] <16> DBthreads::dbclient: ERR - A panic close was issued to dbclient #1.
---
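
For anyone triaging a similar dbclient log, the first error on each thread is 
usually the one to look at. The sketch below pulls that out of a few lines 
copied from the excerpts above. The line layout (time, [pid.thread], <level>, 
function, message) is inferred from those excerpts only, and treating level 16 
and above as an error is an assumption; the function name is made up.

```python
import re

# Layout inferred from the excerpt: time [pid.thread] <level> function: message
LINE_RE = re.compile(
    r"^(?P<time>\d\d:\d\d:\d\d\.\d{3}) "
    r"\[(?P<pid>\d+)\.(?P<thread>\d+)\] "
    r"<(?P<level>\d+)> "
    r"(?P<function>\S+?): (?P<message>.*)$"
)

SAMPLE = """\
15:14:18.320 [7976.4920] <16> writeToServer: ERR - send() to server on socket failed:
15:14:18.320 [7976.4920] <16> dbc_put: ERR - failed sending data to server
15:14:18.461 [7976.7632] <16> DBthreads::dbclient: ERR - A panic close was issued to dbclient #2.
15:14:18.523 [7976.6932] <16> DBthreads::dbclient: ERR - A panic close was issued to dbclient #1.
"""


def first_error_per_thread(lines, min_level=16):
    """Map thread id -> (time, function, message) of its first error line."""
    first = {}
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # wrapped or unrelated line, skip it
        if int(match["level"]) < min_level:
            continue  # assumption: levels below 16 are not errors
        first.setdefault(
            match["thread"], (match["time"], match["function"], match["message"])
        )
    return first


errors = first_error_per_thread(SAMPLE.splitlines())
for thread, (time, function, message) in errors.items():
    print(thread, time, function, message)
```

On the sample, thread 4920 reports the socket send failure first, and the panic 
closes on threads 7632 and 6932 come afterwards.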
I'm not sure you can call this a bug. I suppose the code could be a little more 
robust and have a timeout set for the bsa_put() and/or the VxBSAInit() function 
call.
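
To illustrate the timeout idea, here is a minimal sketch using plain Python 
sockets. It is not NetBackup or XBSA code, and send_or_fail is a made-up name; 
it only shows a send that gives up after a deadline and reports a non-zero 
result instead of hanging.

```python
import socket


def send_or_fail(sock, payload, timeout_s):
    """Return 0 if the whole payload was sent in time, 1 otherwise."""
    sock.settimeout(timeout_s)  # applies to the sendall() call below
    try:
        sock.sendall(payload)
    except OSError:
        # Covers a timeout as well as a broken or closed connection.
        return 1
    return 0


if __name__ == "__main__":
    # Local socket pair; the receiving side never reads, so a large
    # send stalls once the socket buffer is full.
    sender, receiver = socket.socketpair()
    print(send_or_fail(sender, b"small payload", 0.5))
    print(send_or_fail(sender, b"x" * (16 * 1024 * 1024), 0.2))
    sender.close()
    receiver.close()
```

A caller could pass the result to sys.exit() so the parent job sees a failure 
code rather than waiting forever.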




David McMullin


