Re: [Bacula-users] SD Losing Track of Pool

From: Peter Zenge <pzenge AT ilinc DOT com>
To: "bacula-users AT lists.sourceforge DOT net" <bacula-users AT lists.sourceforge DOT net>
Date: Thu, 20 Jan 2011 08:18:35 -0700
> From: Martin Simmons [mailto:martin AT lispworks DOT com]
> Sent: Thursday, January 20, 2011 4:28 AM
> To: bacula-users AT lists.sourceforge DOT net
> Subject: Re: [Bacula-users] SD Losing Track of Pool
> 
> >>>>> On Tue, 18 Jan 2011 08:48:56 -0700, Peter Zenge said:
> >
> > A couple days ago somebody made a comment that using pool overrides in a
> > schedule was deprecated.  I've been using them for years, but I've been
> > seeing a strange problem recently that I think might be related.
> >
> > I'm running 5.0.2 on Debian, with separate Dir/MySQL and SD systems, using
> > file volumes on an array.  I'm backing up several TB a week, but over a
> > slow 25 Mbps link, so some of my full jobs run for a very long time.
> > Concurrency is key.  I normally run 4 jobs at a time on my SD, and I
> > spool (yes, probably unnecessarily, but because the data is coming in so
> > slowly I feel better about writing it to volumes in big chunks).
> >
> > Right now I have one job actively running, with 4 more waiting on the SD.
> > As I mentioned before, usually 4 are running concurrently, but I
> > frequently see fewer than 4 and have never really dug into it.  In the
> > output below, note that the SD is running 4 (actually 5!) jobs, but only
> > one is actually writing to the spool.  Two things jump out at me here.
> > First, of the 5 running jobs, two are correctly noted as being for
> > LF-Full and 3 for LF-Inc (the pools for Full and Incremental backups
> > respectively); however, all 5 show the same volume (LF-F-0239, which is
> > only in the LF-Full pool and is currently being written to by the
> > correctly-running job).  Second, in the Device Status section at the
> > bottom, the pool of LF-F-0239 is listed as "*unknown*"; similarly, under
> > "Jobs waiting to reserve a drive", each job wants the correct pool, but
> > the current pool is listed as "".
> 
> The reporting of pools in the SD might be a little wrong, because it
> doesn't really have that information, but I think the fundamental problem
> is that you only have one SD device.  That is limiting concurrency,
> because an SD device can only mount one volume at a time (even for file
> devices).
> 
> __Martin
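
For reference, the pool-override pattern under discussion looks roughly like
this on the Director side.  This is only a sketch: the pool names come from
the thread, but every other resource name is hypothetical, and the Job shown
assumes a "DefaultJob" JobDefs exists:

  # bacula-dir.conf (sketch): per-run Pool overrides in a Schedule,
  # plus data spooling enabled on a Job.
  Schedule {
    Name = "LF-Cycle"
    Run = Level=Full Pool=LF-Full 1st sun at 23:05        # Fulls go to LF-Full
    Run = Level=Incremental Pool=LF-Inc mon-sat at 23:05  # Incrementals go to LF-Inc
  }

  Job {
    Name = "lf-client1-backup"
    JobDefs = "DefaultJob"
    Client = "lf-client1-fd"
    Schedule = "LF-Cycle"
    Spool Data = yes   # spool on the SD, then despool to volumes in large chunks
  }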

Admittedly I confused the issue by posting an example with two Pools involved.  
Even in that example, though, there were jobs using the same pool as the mounted 
volume, and they wouldn't run until the 2 current jobs were done (which 
presumably allowed the SD to re-mount the same volume, set the current mounted 
pool correctly, and then let 4 jobs write to that volume concurrently, as 
designed).

I saw this issue two other times that day; each time the SD changed the mounted 
pool from "LF-Inc" to "*unknown*" and that brought concurrency to a screeching 
halt.
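
In case it helps anyone reproduce this: the pool the SD believes is mounted
shows up in the Device Status section of a storage status query from bconsole
(the storage resource name below is made up for illustration):

  *status storage=LF-File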

Certainly I could bypass this issue by having a dedicated volume and device for 
each backup client, but I have over 50 clients right now and it seems like that 
should be unnecessary.  Is that what other people who write to disk volumes do?
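
For what it's worth, one less drastic setup I've seen suggested is to define a
handful of file Device resources over the same directory and group them as a
virtual autochanger, so the SD can keep one volume (and one pool) mounted per
device.  A sketch only: all names and paths here are hypothetical, and I
haven't verified that this avoids the "*unknown*" pool state:

  # bacula-sd.conf (sketch)
  Device {
    Name = FileDev1
    Media Type = File
    Archive Device = /srv/bacula/volumes
    LabelMedia = yes
    Random Access = yes
    AutomaticMount = yes
    RemovableMedia = no
    AlwaysOpen = no
  }

  Device {
    Name = FileDev2
    Media Type = File
    Archive Device = /srv/bacula/volumes   # same directory as FileDev1
    LabelMedia = yes
    Random Access = yes
    AutomaticMount = yes
    RemovableMedia = no
    AlwaysOpen = no
  }

  # Group the file devices so the Director can treat them as one storage
  # resource and pick a free device for each job.
  Autochanger {
    Name = FileChgr
    Device = FileDev1, FileDev2
    Changer Command = ""
    Changer Device = /dev/null
  }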
