Hi!
I don't know if my problem is the same as yours, but earlier we also had a problem with vmd.
It hung for no apparent reason. The trick at the time was to kill vmd and start it again, but the problem kept coming back.
I'm still not exactly sure what fixed the problem. It seemed to occur when someone used the NetBackup Remote Administration Console to manage media.
I asked all admins at that site to uninstall their admin console and reinstall the correct console version and patch level. I think a mismatch between some admin console and the master server caused this problem.
But at the same time I also upgraded HP-UX 11.00 to 11.11.
HP-UX 11.00 is REALLY bad at handling memory and network traffic, so many of our problems went away after the upgrade to HP-UX 11.11.
Good Luck!
Best regards / Hampus Lind
Rikspolisstyrelsen
Phone (work): +46 (0)8 - 401 99 43
Phone (mobile): +46 (0)70 - 217 92 66
E-mail: hampus.lind AT rps.police DOT se
----- Original Message -----
From: Arif Budiman
To: veritas-bu AT mailman.eng.auburn DOT edu
Sent: Friday, September 17, 2004 3:54 AM
Subject: [Veritas-bu] Netbackup hung
Does anyone know what the following log message means (/usr/openv/volmgr/debug/daemon)? Is it critical? Our NetBackup processes seem to be hung; many of them are just stuck in the mounting state.
vmd: could not set TCP_NODELAY
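For context, TCP_NODELAY is a standard socket option that disables Nagle's algorithm, so the message suggests that a setsockopt call on one of vmd's connections failed. What vmd does internally is not shown in this thread. The following is only a minimal Python sketch of setting the option on an ordinary socket, and the helper name `enable_nodelay` is made up for illustration.

```python
import socket


def enable_nodelay(sock: socket.socket) -> bool:
    """Try to disable Nagle's algorithm on a TCP socket.

    Returns True on success and False if the OS rejects the option,
    which is the kind of failure the vmd message seems to report
    (an assumption; vmd's own code is not shown in this thread).
    """
    try:
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    except OSError:
        return False
    # Read the option back; a non-zero value means it is enabled.
    return sock.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0


if __name__ == "__main__":
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        print("TCP_NODELAY enabled:", enable_nodelay(s))
```

The call fails on a socket that has already been closed, for example, so the message may simply be a side effect of connections that are being torn down. That is a guess, not something the log confirms.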
From the Device Monitor I can't see any devices, and I get the error message: network protocol error (MM status 39).
Veritas support suggested that we increase the client_connect_timeout and client_read_timeout variables, but I don't believe that solves the problem. The problem still happens.
If I cancel active jobs, they don't respond. From the bpsched log I get:
08:42:46.180 [18814] <2> correct_drive_statuses: datamover1-hcart-robot-tld-0_MPX-incomplete mounts=2, available drives=12 aj=3 aj_cm=1
08:42:46.180 [18814] <2> correct_drive_statuses: datamover2_MPX-incomplete mounts=2, available drives=12 aj=2 aj_cm=0
08:42:46.180 [18814] <2> correct_drive_statuses: jktgrhxmedia_MPX-incomplete mounts=5, available drives=12 aj=6 aj_cm=1
08:42:46.180 [18814] <2> correct_drive_statuses: test_DM1-incomplete mounts=0, available drives=0 aj=0 aj_cm=0
08:42:46.181 [18814] <2> correct_drive_statuses: xl-file02-hcart-robot-tld-0-MPX-incomplete mounts=0, available drives=16 aj=0 aj_cm=0
08:42:46.181 [18814] <2> correct_drive_statuses: xl-library-hcart-robot-tld-0-MPX-incomplete mounts=0, available drives=15 aj=0 aj_cm=0
08:42:46.181 [18814] <2> invalidate_a_m_c_entry: cached threshold = 50, invalidate skip count = 5
08:42:46.181 [18814] <2> invalidate_a_m_c_entry: cached threshold = 50, invalidate skip count = 5
08:42:46.181 [18814] <2> invalidate_a_m_c_entry: cached threshold = 50, invalidate skip count = 5
08:42:46.181 [18814] <2> invalidate_a_m_c_entry: cached threshold = 50, invalidate skip count = 5
08:42:46.181 [18814] <2> invalidate_a_m_c_entry: cached threshold = 50, invalidate skip count = 5
08:42:46.181 [18814] <2> invalidate_a_m_c_entry: cached threshold = 50, invalidate skip count = 5
08:42:47.040 [19990] <2> set_job_details: Sending jobData jobid (90418)
08:42:47.040 [19990] <2> send_structure_data: Index 34 Field m_nKilobytes Value <200735040>
08:42:47.040 [19990] <2> send_structure_data: Index 37 Field m_nKbPerSec Value <7542>
08:42:47.041 [19990] <2> set_job_details: Sending jobRunData jobid (90418)
08:42:47.041 [19990] <2> send_structure_data: Index 47 Field m_nCompletion Value <12>
08:42:47.041 [19990] <8> read_bpbrm_stderr: WROTE xl-file01_1095393040 50048 0 7542.663 0
08:42:49.040 [19990] <8> read_bpbrm_stderr: CURRENT POSITION STK724 1135 0
08:42:50.680 [8405] <2> salarm: got signal 14
08:42:51.490 [8352] <2> salarm: got signal 14
08:42:53.500 [5148] <2> salarm: got signal 14
08:43:02.040 [19990] <2> set_job_details: Sending jobData jobid (90418)
08:43:02.040 [19990] <2> send_structure_data: Index 35 Field m_nFiles Value <134000>
08:43:02.041 [19990] <2> set_job_details: Sending jobRunData jobid (90418)
08:43:02.041 [19990] <2> send_structure_data: Index 46 Field m_szPathname Value </M/Directorat/NetworkOperation/Network Assurance/5-Monitoring Financial/CER Files/BUDGET DIST 08.01.04.xls>
08:43:02.041 [19990] <8> read_bpbrm_stderr: ADDED FILES TO DB FOR xl-file01_1095393040 500 /M/Directorat/NetworkOperation/Network Assurance/5-Monitoring Financial/CER Files/BUDGET DIST 08.01.04.xls
08:43:06.040 [19990] <8> read_bpbrm_stderr: WROTE xl-file01_1095393040 50048 0 7539.643 0
08:43:09.040 [19990] <8> read_bpbrm_stderr: CURRENT POSITION STK724 1136 0
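To see at a glance which entries still have mounts outstanding, the correct_drive_statuses lines in the log above can be summarized with a short script. This is only a sketch: the regular expression is inferred from the six lines in this excerpt, not from any documented bpsched log format, and the names `DriveStatus` and `parse_drive_statuses` are made up for illustration.

```python
import re
from typing import List, NamedTuple


class DriveStatus(NamedTuple):
    label: str              # text before "-incomplete mounts"; assumed to name a storage unit
    incomplete_mounts: int
    available_drives: int


# Pattern inferred from the log excerpt in this thread (an assumption).
LINE_RE = re.compile(
    r"correct_drive_statuses:\s+(?P<label>\S+)-incomplete mounts=(?P<mounts>\d+),"
    r"\s+available drives=(?P<drives>\d+)"
)


def parse_drive_statuses(log_text: str) -> List[DriveStatus]:
    """Extract one DriveStatus per matching line of the log text."""
    found = []
    for line in log_text.splitlines():
        m = LINE_RE.search(line)
        if m:
            found.append(
                DriveStatus(m["label"], int(m["mounts"]), int(m["drives"]))
            )
    return found


if __name__ == "__main__":
    sample = (
        "08:42:46.180 [18814] <2> correct_drive_statuses: "
        "datamover2_MPX-incomplete mounts=2, available drives=12 aj=2 aj_cm=0\n"
        "08:42:46.180 [18814] <2> correct_drive_statuses: "
        "test_DM1-incomplete mounts=0, available drives=0 aj=0 aj_cm=0\n"
        "08:42:50.680 [8405] <2> salarm: got signal 14\n"
    )
    # Report only the entries that still have mounts pending.
    for status in parse_drive_statuses(sample):
        if status.incomplete_mounts > 0:
            print(f"{status.label}: {status.incomplete_mounts} incomplete mount(s), "
                  f"{status.available_drives} drive(s) available")
```

Run against the full excerpt, a filter like this would single out the three entries with non-zero incomplete mounts, which matches the description of jobs stuck in the mounting state.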
Does anyone have a suggestion for how to solve such an annoying problem?
Regards,
Arif Budiman
PT Excelcomindo Pratama