Re: [Veritas-bu] same job keeps hanging.

2007-07-09 17:24:46
Subject: Re: [Veritas-bu] same job keeps hanging.
From: David Rock <dave-bu AT graniteweb DOT com>
To: veritas-bu AT mailman.eng.auburn DOT edu
Date: Mon, 9 Jul 2007 16:12:09 -0500
* Aaron Mills <aaron.mills AT returnpath DOT net> [2007-07-09 16:39]:
> Hi all,
> 
> I'm hoping someone's seen this before. I'm running 5.1MP6 w/ AIT3 - I've
> got a ~126GB backup that kicks off weekly, but hangs within a few hours
> every time - the error I get is always "media manager terminated by
> parent process" but the logs don't seem to show anything odd. No other
> backups hang like this. This is also the only job that runs on the
> server itself.

When you say "runs on the server itself", what do you actually mean?  We
saw an odd timeout that always happened at the same point in the
backup, but the specific circumstances were:

1. a bpbackup command running on a client system
2. client on the other side of a firewall

What was happening in our case was that the backup would start and, one
hour in, the firewall would decide that since it hadn't seen any traffic
from the client to the master server, it would drop the entry from
the state table.  Then, one hour later, the client would try to send a
keepalive packet through the now-defunct connection, fail, retry several
times, and then finally give up and die, taking the backup with it.
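
To make the timing above concrete, here is a rough Python sketch of the
general idea: a connection only survives a stateful firewall if something
crosses it more often than the firewall's idle timeout.  This is not how
NetBackup itself is configured; the function names and the 900-second
value are made up for illustration, and the one-hour and two-hour figures
are simply the ones from our case (many TCP stacks default to a two-hour
keepalive idle time, which would fit what we saw).

```python
import socket


def enable_keepalive(sock, idle_s=900, interval_s=60, probes=5):
    """Turn on TCP keepalive so an otherwise idle connection still sends traffic.

    idle_s, interval_s and probes are illustrative values, not recommendations.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The per-socket tuning options are platform-specific (Linux has all
    # three), so only set the ones this platform exposes.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle_s)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval_s)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, probes)
    return sock


def outlives_firewall(keepalive_idle_s, firewall_idle_timeout_s):
    """True if the first keepalive goes out before the firewall forgets the flow."""
    return keepalive_idle_s < firewall_idle_timeout_s


if __name__ == "__main__":
    # Our case: firewall dropped idle state after 1 hour, first keepalive at 2 hours.
    print(outlives_firewall(7200, 3600))  # False: the state entry is already gone
    print(outlives_firewall(900, 3600))   # True: traffic arrives well inside the hour
```

In practice the equivalent change is usually made in the operating
system's keepalive settings or in the firewall's idle timeout rather
than in application code.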

This may not be anything like what you are dealing with, but it is a
pretty good example of how things other than NBU can cause weird things
to happen and make it look like NBU is the cause.  Does your job always
die at the same time, or does it vary from attempt to attempt?

-- 
David Rock
david AT graniteweb DOT com
_______________________________________________
Veritas-bu maillist  -  Veritas-bu AT mailman.eng.auburn DOT edu
http://mailman.eng.auburn.edu/mailman/listinfo/veritas-bu