[Veritas-bu] Serious master issue...

Subject: [Veritas-bu] Serious master issue...
From: ssesar at mitre.org (Steven L. Sesar)
Date: Thu, 15 Feb 2007 08:53:42 -0500
Yes it does, via bpimmedia.
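
For anyone gathering the three outputs that NBCC works from (vmquery -a, bpmedialist -ls, and bpimmedia, as listed further down in this thread), here is a minimal Python sketch that saves each one to its own file. It assumes the three commands are on PATH on the master; the function name `capture`, the dictionary name, and the output directory are made up for illustration and are not part of NetBackup.

```python
import subprocess
from pathlib import Path

# The three commands named in this thread. Assumption: they are on PATH.
NBCC_INPUTS = {
    "vmquery": ["vmquery", "-a"],
    "bpmedialist": ["bpmedialist", "-ls"],
    "bpimmedia": ["bpimmedia"],
}


def capture(commands, outdir):
    """Run each command and write its stdout to <outdir>/<name>.txt.

    Returns a dict mapping name -> (output path, exit code).
    """
    outdir = Path(outdir)
    outdir.mkdir(parents=True, exist_ok=True)
    results = {}
    for name, argv in commands.items():
        path = outdir / f"{name}.txt"
        # Stream stdout straight to disk; bpimmedia output can be large.
        with open(path, "w") as fh:
            proc = subprocess.run(argv, stdout=fh, stderr=subprocess.PIPE, text=True)
        results[name] = (path, proc.returncode)
    return results
```

On a master this might be called as `capture(NBCC_INPUTS, "nbcc_inputs")`. As noted below, the captured output goes stale as soon as new backups run, so it only makes sense while backup activity is stopped.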



Hampus Lind wrote:
>
> NBCC doesn't look at the image db, and they keep saying we have a 
> problem there. But I don't know how we can fix it, or even collect the 
> info from the db, when bpdbm -consistency 2 won't run.
>
> Hampus Lind
> Rikspolisstyrelsen
> National Police Board
> Tel dir: +46 (0)8 - 401 99 43
> Tel mob: +46 (0)70 - 217 92 66
> E-mail: hampus.lind at rps.police.se <mailto:hampus.lind at rps.police.se>
>
> -----Original Message-----
> *From:* Steven L. Sesar [mailto:ssesar at mitre.org]
> *Sent:* 14 February 2007 20:53
> *To:* Hampus Lind
> *Cc:* 'Justin Piszcz'; 'Bahnmiller, Bryan'; 
> Veritas-bu at mailman.eng.auburn.edu
> *Subject:* Re: [Veritas-bu] Serious master issue...
>
> bpdbm -consistency 2 is useless to you, based on the amount of data 
> that you back up nightly and my own presumption of how long backups 
> run in your environment. It will take longer to run than your backup 
> domain will remain idle. If I recall, they have a process which does a 
> better job at finding catalog/db corruption/inconsistency. I think 
> that it's called NBCC.
>
> The problem with NBCC is similar, though. You send them the output of 
> three commands:
>
> vmquery -a, bpmedialist -ls, and bpimmedia
>
> Then, they munge the output of the above commands through a reporting 
> tool that Symantec will NOT share with end users. At some point later 
> in the day (hopefully, sooner rather than later), they will send you a 
> report. You must then take certain actions to correct any 
> discrepancies found. The backup system must be completely idle during 
> this time. Restores are ok, but no backup activity can be taking place.
>
> Afterwards, you'll run those commands again, they'll generate the 
> report again, and you'll see how you're doing. It may take you several 
> passes to get things squared away.
>
> The problem is that most of us don't have a completely idle backup 
> infrastructure - at least for long enough for this process to 
> complete. I didn't when I was an NBU customer. Once you take backups, the 
> reports become obsolete, as do the results of bpdbm -consistency 2.
>
> It would not surprise me if bpdbm was leaking memory on your platform.
>
> --Steve
>
>
> Hampus Lind wrote:
>
> Hi,
>  
> I can't do anything....
>  
> bpdbm -consistency 2 has been running for over 12 hours and hasn't checked
> more than 4-5 clients.
>  
> It was the first thing support told me: your db is corrupted... So I tried
> to run the bpdbm -consistency 2 check. The check found some issues, like expired
> images which were not removed, etc. But when I was about to remove them
> manually, the NetBackup db clean process had already taken care of them.
>  
> So as I understand it, you can have some level of corruption in your db, which
> NBU cleans out when the clean job runs.
>  
> I am not compressing my catalogs.
>  
> Thanks,
>  
> Hampus Lind
>  
>  
> -----Original Message-----
> From: Justin Piszcz [mailto:jpiszcz at lucidpixels.com] 
> Sent: 14 February 2007 20:31
> To: Hampus Lind
> Cc: 'Bahnmiller, Bryan'; Veritas-bu at mailman.eng.auburn.edu 
> <mailto:Veritas-bu at mailman.eng.auburn.edu>
> Subject: Re: [Veritas-bu] Serious master issue...
>  
> Have you run the check_db_consistency? There is a command that checks to 
> make sure your images are not corrupted!
>  
> I would recommend checking that.
>  
> Also, are you running compression on your catalogs?
>  
>  
> On Wed, 14 Feb 2007, Hampus Lind wrote:
>  
>   
>> Thanks Bryan,
>>  
>>  
>>  
>> It happens directly after reboot..
>>  
>>  
>>  
>> The thing is:
>>  
>> -          I have deactivated all policies
>>  
>> -          Stopped our media server
>>  
>> -          And then restarted NetBackup on the master.
>>  
>>  
>>  
>> So there is absolutely no activity going on (no backup, no user backup, no
>> restore, no staging), only internal NetBackup work...
>>  
>> As soon as NetBackup on the master becomes active, it starts bpdbm process
>> after bpdbm process. They consume 100% of both my CPUs and read/write
>> heavily to the /usr/openv/netbackup/db filesystem.
>>  
>> With no activity at all after a clean start, we have about 42 bpdbm
>> processes and nearly as many bprd processes...
>>  
>>  
>>  
>> I can't figure this one out, and support points to disk config or something
>> else that sounds good in their ears...
>>  
>>  
>>  
>> Thanks for all help,
>>  
>>  
>>  
>> Hampus Lind
>>  
>> -----Original Message-----
>> From: Bahnmiller, Bryan [mailto:BBahnmiller at pier1.com]
>> Sent: 14 February 2007 20:04
>> To: Hampus Lind
>> Subject: RE: [Veritas-bu] Serious master issue...
>>  
>>  
>>  
>> Hampus,
>>  
>>  
>>  
>>  How quickly does this behaviour start happening after a recycle/reboot? I
>> worked with an N4000 master running 11i. We did have 8 cpus and 8 GB RAM. We
>> were running over 15,000 backup jobs daily though. Our catalog was over
>> 400GB. (Catalog was on EMC DMX disk.) Running good old 3.4 we would have to
>> reboot the system almost every week. If you can cleanly re-cycle NetBackup -
>> shut it down, kill all NBU processes, and then restart it, that should be
>> almost as good.
>>  
>>  Here we are running NBU 5.1mp4 on a Win2K3 master - 2 cpus, 4 GB RAM. (I
>> inherited the system - not my choice.) We run about 5000 jobs per day, we
>> have a 280 GB catalog on EMC Clariion. The system will stay stable for 2
>> weeks pretty easily. 4 weeks starts pushing things. So we usually reboot our
>> Windows master and media servers every 2 weeks.
>>  
>>  It seems like you will have cumulative problems with NetBackup that can
>> build up over time. It is way more pronounced on busy systems. We have
>> another NetBackup system that has 1 Master and 1 Media server. It runs about
>> 40 jobs per day max. I hardly ever have to reboot those servers.
>>  
>>  
>>  
>>       Bryan
>>  
>>  
>>  
>> Bryan Bahnmiller
>>  
>> ISD Business Continuity
>>  
>> Pier 1 Imports, Inc
>>  
>> 817-252-8570
>>  
>>  
>>  
>>  
>>  
>>  
>>  _____
>>  
>>  
>> From: veritas-bu-bounces at mailman.eng.auburn.edu
>> [mailto:veritas-bu-bounces at mailman.eng.auburn.edu] On Behalf Of Hampus Lind
>> Sent: Wednesday, February 14, 2007 12:17 PM
>> To: Veritas-bu at mailman.eng.auburn.edu
>> Subject: Re: [Veritas-bu] Serious master issue...
>> Importance: High
>>  
>> All,
>>  
>>  
>>  
>> Now I have been transferred to USA support... God bless America!
>>  
>> They have told me that they haven't seen such a big installation in over a
>> year... Strange, I have about 200 clients and back up a couple of TB per day.
>> I was under the impression that this was a kinda small installation..??
>>  
>> However, they have told me that this is perfectly normal behaviour with
>> NetBackup: that it produces heavy disk IO and eats all CPU power. And I was
>> really stupid and told them that I also had a case with HP earlier on this
>> disk IO problem, so now Symantec support is pointing all their fingers at
>> HP and our disk setup.
>>  
>> Our DB is about 60-65 GB and resides on a StorageTek Flexline 380 disk array
>> (SAN). We run a RAID 5 on 146GB FC drives. I don't really see the
>> bottleneck there, but I will create a RAID 5 on 73GB 15K FC drives just to
>> shut NetBackup support up...
>>  
>> We run a two-CPU HP rp2470 with HP-UX 11.11 as a master server. Shouldn't
>> this be enough for this installation?
>>  
>> Oh well...
>>  
>> If support can't help me, what should I do?? I am desperate!!!
>>  
>>  
>>  
>>  
>>  
>> Hampus Lind
>>  
>> -----Original Message-----
>> From: veritas-bu-bounces at mailman.eng.auburn.edu 
>> <mailto:veritas-bu-bounces at mailman.eng.auburn.edu>
>> [mailto:veritas-bu-bounces at mailman.eng.auburn.edu] On Behalf Of Hampus Lind
>> Sent: 14 February 2007 12:48
>> To: Veritas-bu at mailman.eng.auburn.edu <mailto:Veritas-bu at 
>> mailman.eng.auburn.edu>
>> Subject: [Veritas-bu] Serious master issue...
>> Priority: High
>>  
>>  
>>  
>> Hi,
>>  
>>  
>>  
>> We have a serious issue here with our master server. The problem occurred a
>> couple of weeks ago, or at least I found out about it then.
>>  
>> I was looking at I/Os and SCSI queue depth on my master (HP-UX 11.11) when I
>> saw that we had 4000-6000 SCSI commands in the queue, and a disk utilisation of
>> 100% for the /usr/openv/netbackup/db disk.
>>  
>> I have patched HP-UX to the latest patch bundle and we run NBU 5.1 MP4.
>>  
>> HP support said that bpdbm was leaking memory.
>>  
>> Veritas support is still investigating. But we have about 30 bpdbm and bprd
>> processes active on our master, which eat both my CPUs and produce tons of
>> I/O against our db disk.
>>  
>> I activated verbose = 5 on the master, and after 15 minutes the bpdbm log had
>> reached the file size limit on our filesystem, 2 GB...
>>  
>> Has anyone had similar problems?
>>  
>>  
>>  
>>  
>>  
>> Thanks and regards,
>>  
>>  
>>  
>> Hampus Lind
>>  
>>  
>>  
>>  
>>     
>  
>
>
>
>
> -- 
> ===================================
>  
>    Steven L. Sesar
>    Lead Operating Systems Programmer/Analyst
>    UNIX Application Services R101
>    The MITRE Corporation
>    202 Burlington Road - MS K101
>    Bedford, MA 01730
>    tel: (781) 271-7702
>    fax: (781) 271-2600
>    mobile: (617) 519-8933
>    email: ssesar at mitre.org <mailto:ssesar at mitre.org>
>  
> =================================== 
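
The process pile-up reported at the start of the thread (roughly 30 to 42 bpdbm processes and nearly as many bprd) is easier to raise with support if it is counted the same way each time. A minimal Python sketch, assuming the input is the text of `ps -e` with a header row and the command name in the last column; the function name and the sample listing are hypothetical:

```python
def count_processes(ps_output, names=("bpdbm", "bprd")):
    """Count processes whose command name (last column) is in `names`."""
    counts = {name: 0 for name in names}
    for line in ps_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if fields and fields[-1] in counts:
            counts[fields[-1]] += 1
    return counts


# Made-up sample in the shape of `ps -e` output, for illustration only.
SAMPLE = """\
  PID TTY       TIME COMMAND
 1201 ?         0:12 bpdbm
 1202 ?         0:03 bprd
 1310 ?         1:40 bpdbm
 1422 pts/0     0:00 sh
"""

print(count_processes(SAMPLE))
```

On a live master the input could come from `subprocess.run(["ps", "-e"], capture_output=True, text=True).stdout`, sampled at intervals after a clean restart.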


-- 
===================================

   Steven L. Sesar
   Lead Operating Systems Programmer/Analyst
   UNIX Application Services R101
   The MITRE Corporation
   202 Burlington Road - MS K101
   Bedford, MA 01730
   tel: (781) 271-7702
   fax: (781) 271-2600
   mobile: (617) 519-8933
   email: ssesar at mitre.org

=================================== 

