Re: [Bacula-users] Jobs waiting on Storage / Volume pool not assigned

2010-11-18 13:02:16
Subject: Re: [Bacula-users] Jobs waiting on Storage / Volume pool not assigned
From: Christopher Strider Cook <ccook AT pandora DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Thu, 18 Nov 2010 09:44:30 -0800
If there's any more information I can provide to allow someone to assist 
in resolving this, please let me know.

It really seems that, on some long-running multi-volume jobs, additional
volumes aren't getting their pool assigned until after the job finishes,
and that this is preventing other jobs from running concurrently.
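
For anyone who wants to check the same thing on their own system, here is a
minimal sketch (not a Bacula tool) that scans pasted storage status text for
the "3608" messages quoted below and lists the jobs whose wanted pool differs
from the pool the device reports. The function name and the dictionary keys
are made up for illustration, and the message layout is assumed to be exactly
what appears in this thread.

```python
import re

# Layout of the "3608" reservation message as it appears in this thread.
# This is an assumption based on the quoted output, not on Bacula's source.
MSG_RE = re.compile(
    r'3608 JobId=(?P<jobid>\d+) wants Pool="(?P<want>[^"]*)" '
    r'but have Pool="(?P<have>[^"]*)" nreserve=(?P<nreserve>\d+) '
    r'on drive "(?P<drive>[^"]*)"'
)


def find_pool_mismatches(status_text):
    """Return one dict per 3608 message whose wanted and current pools differ.

    Hypothetical helper: it strips mail quote markers ("> ") and rejoins
    wrapped lines before matching, so quoted email text can be pasted as-is.
    """
    lines = [re.sub(r'^\s*(>\s?)+', '', line).strip()
             for line in status_text.splitlines()]
    flat = " ".join(line for line in lines if line)
    mismatches = []
    for m in MSG_RE.finditer(flat):
        if m.group("want") != m.group("have"):
            mismatches.append({
                "jobid": int(m.group("jobid")),
                "want": m.group("want"),
                "have": m.group("have"),
                "drive": m.group("drive"),
            })
    return mismatches


# Two of the messages from this thread, still wrapped and quoted.
SAMPLE_STATUS = '''
> 3608 JobId=28022 wants Pool="Paul02" but have Pool="" nreserve=0 on
> drive "Paul02" (/archive/PAUL-nufa/Paul02).
>
> 3608 JobId=28024 wants Pool="Paul02" but have Pool="" nreserve=0 on
> drive "Paul02" (/archive/PAUL-nufa/Paul02).
'''

if __name__ == "__main__":
    for hit in find_pool_mismatches(SAMPLE_STATUS):
        print(hit)
```

An empty "have" value for every waiting job, while a job is actively writing
to the device, would be consistent with the behaviour described above.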

Thanks, any help would be great.

Chris


On 11/16/10 4:11 PM, Christopher Strider Cook wrote:
> I have what is otherwise a very large and successful Bacula installation
> that has been up and running for years.
>
> It is currently running version 5.0.2-1~bpo50+ from Debian lenny-backports.
>
> The issue I'm currently running into is that jobs are stuck:
>
> --stat dir
>    28010 Full    ray.2010-11-13_20.25.01_31 is running
>    28022 Differe  waters.2010-11-13_20.25.01_43 is waiting on Storage Paul02
>    28024 Differe  zebda.2010-11-13_20.25.01_45 is waiting on Storage Paul02
>    28026 Differe  latifah.2010-11-13_20.25.01_47 is waiting on Storage Paul02
>    28028 Differe  betadb0.2010-11-13_20.25.02_49 is waiting on Storage Paul02
>    28057 Full    maybelle-P4.2010-11-14_03.15.00_22 is waiting on Storage Paul02
>    28202 Increme  blondie.2010-11-15_20.25.00_04 is waiting on max Storage jobs
>    28208 Increme  costello.2010-11-15_20.25.01_10 is waiting on max Storage jobs
>    28212 Increme  django.2010-11-15_20.25.01_14 is waiting on max Storage jobs
>    28214 Increme  electro.2010-11-15_20.25.02_16 is waiting on max Storage jobs
>
> --Storage Status
>
> 3608 JobId=28022 wants Pool="Paul02" but have Pool="" nreserve=0 on drive "Paul02" (/archive/PAUL-nufa/Paul02).
>
> 3608 JobId=28024 wants Pool="Paul02" but have Pool="" nreserve=0 on drive "Paul02" (/archive/PAUL-nufa/Paul02).
>
> 3608 JobId=28026 wants Pool="Paul02" but have Pool="" nreserve=0 on drive "Paul02" (/archive/PAUL-nufa/Paul02).
>
> 3608 JobId=28028 wants Pool="Paul02" but have Pool="" nreserve=0 on drive "Paul02" (/archive/PAUL-nufa/Paul02).
>
> 3608 JobId=28057 wants Pool="Paul02" but have Pool="" nreserve=0 on drive "Paul02" (/archive/PAUL-nufa/Paul02).
> --
> Device "Paul02" (/archive/PAUL-nufa/Paul02) is mounted with:
>       Volume:      Paul02-9267
>       Pool:        *unknown*
>       Media type:  File50
>       Total Bytes=19,733,575,217 Blocks=305,890 Bytes/block=64,511
>       Positioned at File=4 Block=2,553,706,032
> --
>
> They should be running concurrently. All these jobs have the same
> storage and pool specified, and Maximum Concurrent Jobs is set properly
> for storage and pools. I know this because it normally all works fine.
> The holdups tend to happen on long-running jobs with large datasets,
> where multiple volumes (file-based storage) are created.
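
To see at a glance which jobs are blocked and why, the director status
listing above can be grouped by state. This is only a rough sketch: the
function name is hypothetical, and the line layout ("JobId Level Name is
<state>", with mail-wrapped continuation lines) is assumed from the output
pasted in this message.

```python
import re

# One job line of the director status as pasted in this thread:
#   28022 Differe  waters.2010-11-13_20.25.01_43 is waiting on Storage Paul02
JOB_RE = re.compile(
    r'^(?P<jobid>\d+)\s+(?P<level>\S+)\s+(?P<name>\S+)\s+is\s+(?P<state>.+)$'
)


def group_jobs_by_state(status_text):
    """Map each state string (e.g. "running") to the list of JobIds in it.

    Hypothetical helper: quote markers are stripped, and any line that does
    not start with a JobId is treated as the wrapped tail of the line before.
    """
    entries = []
    for raw in status_text.splitlines():
        line = re.sub(r'^\s*(>\s?)+', '', raw).strip()
        if not line:
            continue
        if re.match(r'\d+\s', line):
            entries.append(line)
        elif entries:
            entries[-1] += " " + line

    groups = {}
    for entry in entries:
        m = JOB_RE.match(entry)
        if m:
            groups.setdefault(m.group("state"), []).append(int(m.group("jobid")))
    return groups


# A few lines from the listing above, wrapped the way the mail client left them.
SAMPLE_DIR_STATUS = '''
>    28010 Full    ray.2010-11-13_20.25.01_31 is running
>    28022 Differe  waters.2010-11-13_20.25.01_43 is waiting on Storage Paul02
>    28057 Full    maybelle-P4.2010-11-14_03.15.00_22 is waiting on Storage
> Paul02
>    28202 Increme  blondie.2010-11-15_20.25.00_04 is waiting on max
> Storage jobs
'''

if __name__ == "__main__":
    for state, jobids in group_jobs_by_state(SAMPLE_DIR_STATUS).items():
        print(state, jobids)
```

In the listing above, only one job is running against a storage limit of 6,
which is why the "waiting on Storage" group looks wrong to me.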
>
> Storage {
>     Name = Paul02
>     Address = deimos.savagebeast.com
>     SDPORT = 9103
>     Password = "x"
>     Device = Paul02
>     Media Type = File50
>     Maximum Concurrent Jobs = 6
> }
> Pool {
>     Name = Paul02
>     Storage = Paul02
>     Pool Type = Backup
>     Recycle = no
>     AutoPrune = yes
>     Volume Retention = 37 days
>     Use Volume Once = no
>     Maximum Volume Bytes = 25 GB
>     Label Format = "Paul02-"
>     Action On Purge = Truncate
>     Next Pool = Copy02
> }
>
> <bacula.sd>  --
> Storage {                             # definition of myself
>     Name = deimos-sd
>     SDPort = 9103                  # Director's port
>     WorkingDirectory = "/var/lib/bacula"
>     Pid Directory = "/var/run/bacula"
>     Maximum Concurrent Jobs = 20
> }
> Director {
>     Name = tavern-dir
>     Password = "x"
> }
> Device {
>     Name = Paul02
>     Device Type = File
>     Media Type = File50
>     Archive Device = /archive/PAUL-nufa/Paul02
>     Random Access = yes
>     Label Media = yes
>     Requires Mount = no
>     Removable Media = no
>     Always Open = yes
> }
>
> ---
>
> The only thing I have noted is that the storage status lists the volume
> pool as "*unknown*". I understand that this is what the storage daemon
> reports before the director tells it to write a job, but as you can see,
> the volume is actively being written to by the running job. Are the jobs
> 'waiting on Storage' waiting because they don't see a volume with the
> correct pool mounted? Why isn't the pool assigned properly? When the
> volume fills and the job moves on to the next one, the pool is set
> properly.
>
> Have I missed a step in the configuration, or is this some sort of bug?
>
> Thanks
>
> Chris
>
> -- other configs
> Client {
>     Name = ray
>     Address = ray.savagebeast.com
>     FDPORT = 9102
>     Catalog = MyCatalog
>     Password = "x"
>     File Retention = 30 days
>     Job Retention = 37 days
>     AutoPrune = yes
>     Maximum Concurrent Jobs = 4
> }
>
> Job {
>     Name = ray
>     Client = ray
>     JobDefs = "PandoraClient"
>     Pool = Paul02
>     Schedule = PandoraCycle1
>     FileSet = "General-host"
> }
>
> Client {
>     Name = waters
>     Address = waters.savagebeast.com
>     FDPORT = 9102
>     Catalog = MyCatalog
>     Password = "x"
>     File Retention = 30 days
>     Job Retention = 37 days
>     AutoPrune = yes
>     Maximum Concurrent Jobs = 4
> }
>
> Job {
>     Name = waters
>     Client = waters
>     JobDefs = "PandoraClient"
>     Pool = Paul02
>     Schedule = PandoraCycle1
>     FileSet = "General-host"
> }
>
>
> _______________________________________________
> Bacula-users mailing list
> Bacula-users AT lists.sourceforge DOT net
> https://lists.sourceforge.net/lists/listinfo/bacula-users

