Veritas-bu

[Veritas-bu] 41's on Solaris while connecting to localhost

2003-02-05 12:33:41
Subject: [Veritas-bu] 41's on Solaris while connecting to localhost
From: blacksmith AT rogers DOT com (Double Black)
Date: Wed, 5 Feb 2003 12:33:41 -0500
Hi Rafal,

Did you ever find a fix for the status code 41? Could you shed some light on
it if you did?

I get exactly the same error you posted. I am on NBU 4.5, not patched yet. I
had the same issue when I was on 341_3a, so we decided to upgrade to 4.5, but
got the same result. The client is a V880, a file server with about 200 GB of
data. I even divided the backup into 3 classes, with the same result.

Thanks
db

----- Original Message -----
From: "rafal wiosna" <rafamiga AT uucp.polbox DOT pl>
To: <veritas-bu AT mailman.eng.auburn DOT edu>
Sent: Saturday, December 28, 2002 6:53 AM
Subject: [Veritas-bu] 41's on Solaris while connecting to localhost


>
> I've been getting lots of error 41s on the jobs screen lately. Funny, I
> swear I didn't change a thing in the last few days. I get this _ONLY_ when
> accessing the backup server, which is in fact also a client with a 260+ GB
> slice of our disk array -- we store rsynced copies of some hard-to-reach
> remote servers there [or Linux servers with libc/glibc2.0 on which the NB
> client does not run]. The server is a Solaris machine, and I tried to truss
> the bpbkar and bptm processes -- on bpbkar I'm getting reads returning 0 on
> file descriptor 1, and the bptm process seems to be hung [no truss output].
> Note that this only happens on this backup-server-client, not on remote
> client jobs.
>
> I had this situation recently and I believe turning off move
> detection helped, but that's not a remedy: move detection [the
> "bug fix" feature] is rather important to me.
>
> Also, I noticed that for some reason one of the server directories I
> back up has missing TIR info, while the others that I back up in the same
> job have all of it in place. As a result, the server backs up all the data
> on inc-cumulative backups for this directory only. I'm not 100% sure
> whether the TIR info for this directory disappeared or was never written in
> the first place. It could be the result of killing bptm and bpbkar together
> to get rid of an unstoppable job that had been hanging for 4-5 hours with
> log entries like this:
>
> 11:43:21.950 [26234] <2> bpbrm sighandler: signal 14 caught by bpbrm
> 11:43:21.950 [26234] <2> bpbrm sighandler: bpbrm timeout after 300 seconds
> 11:43:21.950 [26234] <2> bpbrm kill_child_process: start
> 11:43:21.951 [26234] <2> bpbrm wait_for_child: start
> 11:44:52.348 [26234] <2> bpbrm wait_for_child: child exit_status = 82 signal_status = 0
> 11:44:52.348 [26234] <2> inform_client_of_status: INF - Server status = 41
> 11:46:21.354 [26234] <2> OpenMailPipe: /usr/ucb/mail ................................
> 11:46:21.364 [26234] <2> OpenMailPipe: Before subject string write
> 11:46:21.365 [26234] <2> OpenMailPipe: After subject string write
> 11:46:21.371 [26234] <2> bpbrm Exit: client backup EXIT STATUS 41: network connection timed out
>
> I'm curious why bpbrm gets sig 14/timeout -- from watching another
> process or by itself? Which child is this log talking about [what processes
> does it fork?].
>
> I _DO_ have NFS volumes on this Solaris 8+8_recommended machine, but
> the 41s also happen on VxFS volumes mounted from the disk array [checked
> the Faq-O-Matic first].
>
> It all started 2 days ago; previously I didn't get any 41s while
> doing inc-diff and inc-cumm type backups.
>
> Does anyone have experience with error 41 and bpbkar/bpbrm/bptm behaving
> strangely? Is there anything I could check?
>
> NB 4.5GA with NB_45_2 applied.
>
> --
> __________________________________________________________________________
> rafal wiosna * TDC Internet Polska S.A. * Polbox * In ARP we trust * AR164
> RAFD-RIPE * PGP key is available at www.se.pgp.net (ID: 3CDCB7A9)
> _______________________________________________
> Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
> http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu
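
When comparing status 41 failures across jobs, it can help to boil the bpbrm debug log down to a few fields. The following is a minimal sketch in Python, assuming the log lines have the same layout as the excerpt quoted above (time, [pid], <level>, message). The names parse_line and summarize are hypothetical helpers for illustration and are not part of NetBackup.

```python
import re
from typing import Dict, Iterable, NamedTuple, Optional

# Layout assumed from the quoted excerpt: HH:MM:SS.mmm [pid] <level> message
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}\.\d{3}) \[(?P<pid>\d+)\] <(?P<level>\d+)> (?P<msg>.*)$"
)


class Entry(NamedTuple):
    time: str
    pid: int
    level: int
    msg: str


def parse_line(line: str) -> Optional[Entry]:
    """Parse one log line; tolerate a leading '> ' from email quoting."""
    m = LINE_RE.match(line.lstrip("> ").strip())
    if m is None:
        return None
    return Entry(m["time"], int(m["pid"]), int(m["level"]), m["msg"])


def summarize(lines: Iterable[str]) -> Dict[int, dict]:
    """Collect signal, timeout, child exit status and final status per pid."""
    summary: Dict[int, dict] = {}
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # skip anything that does not match the assumed layout
        info = summary.setdefault(entry.pid, {})
        m = re.search(r"signal (\d+) caught", entry.msg)
        if m:
            info["signal"] = int(m.group(1))
        m = re.search(r"timeout after (\d+) seconds", entry.msg)
        if m:
            info["timeout_seconds"] = int(m.group(1))
        m = re.search(r"child exit_status = (\d+)", entry.msg)
        if m:
            info["child_exit_status"] = int(m.group(1))
        m = re.search(r"EXIT STATUS (\d+): (.*)", entry.msg)
        if m:
            info["exit_status"] = int(m.group(1))
            info["exit_text"] = m.group(2)
    return summary


# Sample lines taken from the log excerpt quoted above.
SAMPLE = [
    "11:43:21.950 [26234] <2> bpbrm sighandler: signal 14 caught by bpbrm",
    "11:43:21.950 [26234] <2> bpbrm sighandler: bpbrm timeout after 300 seconds",
    "11:44:52.348 [26234] <2> bpbrm wait_for_child: child exit_status = 82 signal_status = 0",
    "11:46:21.371 [26234] <2> bpbrm Exit: client backup EXIT STATUS 41: network connection timed out",
]

if __name__ == "__main__":
    for pid, info in summarize(SAMPLE).items():
        print(pid, info)
```

Running it over a whole log file (for example summarize(open(path)) with your own path) gives one record per bpbrm pid, which makes it easier to see whether every status 41 is preceded by the same 300 second timeout and the same child exit status.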

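On the "sig 14" question: on Solaris and Linux, signal 14 is SIGALRM, the signal a process receives when a timer it armed itself expires. The sketch below shows that general self-timeout pattern in Python. Whether bpbrm arms its 300 second timer this way is an assumption based only on the wording of the log ("bpbrm timeout after 300 seconds"); the names Timeout and run_with_timeout are hypothetical and not NetBackup code.

```python
import signal
import time


class Timeout(Exception):
    """Raised by the SIGALRM handler when the timer expires."""


def _on_alarm(signum, frame):
    # Runs in the main thread when the process's own timer fires.
    raise Timeout(f"signal {signum} caught")


def run_with_timeout(func, seconds):
    """Run func(); raise Timeout if it takes longer than `seconds`.

    Unix only, and must be called from the main thread.
    """
    old_handler = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)  # arm the timer
    try:
        return func()
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # disarm the timer
        signal.signal(signal.SIGALRM, old_handler)  # restore old handler


if __name__ == "__main__":
    try:
        # Stand-in for a read that never completes.
        run_with_timeout(lambda: time.sleep(5), 0.2)
    except Timeout as exc:
        print("timed out:", exc)
```

In this pattern nobody else sends the signal: the process asks the kernel for an alarm before a blocking operation and gets SIGALRM itself if the operation does not finish in time.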
