Bacula-users

Re: [Bacula-users] Job is waiting on Storage

2010-08-31 15:03:40
Subject: Re: [Bacula-users] Job is waiting on Storage
From: Marco Lertora <marco.lertora AT infoporto DOT it>
To: Bacula-users AT lists.sourceforge DOT net
Date: Tue, 31 Aug 2010 21:00:08 +0200
Marco Lertora wrote:
  On 31/08/2010 17:27, Bill Arlofski wrote:
  
On 08/31/10 08:44, Marco Lertora wrote:
    
   Hi!

I have the same problem! Has anyone found a solution?

I have 3 concurrent jobs that back up from different FDs to the same
device on the SD.
All jobs use the same pool, and the pool uses "Maximum Volume Bytes" as its
volume splitting policy, as suggested in the docs.
All jobs have the same priority.

Everything starts well, but after some volume changes (because the volumes
reach the maximum volume size) the storage daemon loses the pool information
of the mounted volume.
So the jobs started after that wait on the SD for a mounted volume with
the same pool as the one wanted by the job.

Regards
Marco Lertora
      
Sorry for a "me too" post... But:


I have been noticing the same thing here.  I just have not been able to
monitor it and accurately document it.

Basically it appears to be exactly what you have stated above. I am also using
only disk storage with my "file tapes" configured to be a maximum of 10GB each.

I have seen a "status dir" show me "job xxx waiting on storage" and have
noted that the job(s) waiting are of the same priority as the job(s) currently
running and are configured to use the same device and pool.

I have also noticed exactly what Lukas Kolbe described here where the job
wants one pool, but thinks it has a "null named pool":

    
3608 JobId=308 wants Pool="dp" but have Pool=""
      
and here where the device is mounted, the volume name is known but the pool is
unknown:

    
Device "dp1" (/var/bacula/diskpool/fs1) is mounted with:
      Volume:      Vol0349
      Pool:        *unknown*
      Media type:  File
      Total Bytes=11,726,668,867 Blocks=181,775 Bytes/block=64,512
      Positioned at File=2 Block=3,136,734,274
      
So by all indications the job(s) that are "waiting on storage" should be
running but are instead needlessly waiting.
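
Since the condition is hard to catch while it is happening, one option is to scan saved status or log output for the 3608 message with an empty device-side pool. The following is only a minimal sketch: the function name is hypothetical, and it assumes the message text looks like the line quoted above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of Bacula.
 * Returns 1 when a line looks like a 3608 reservation message whose
 * device-side pool name is empty, and 0 otherwise. */
static int line_has_empty_device_pool(const char *line)
{
   if (line == NULL) {
      return 0;
   }
   /* Skip leading indentation, since saved output may be indented. */
   while (*line == ' ' || *line == '\t') {
      line++;
   }
   /* Only consider lines that start with the 3608 message code. */
   if (strncmp(line, "3608 ", 5) != 0) {
      return 0;
   }
   /* An empty device pool shows up as: but have Pool="" */
   return strstr(line, "but have Pool=\"\"") != NULL;
}
```

Feeding each line of captured output through this function would flag the stuck state described here without matching ordinary pool mismatches.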


Initially, I thought the cause was that I had the Pool in the jobs defined like:

Pool = Default

and the Default pool had no tapes in it. Bacula requires a Pool to be defined
in a Job definition, which is why I used "Default", but I was overriding the
Pool in the Schedule like so:

Schedule {
   Name = WeeklyToOffsiteDisk
   Run = Full              pool=Offsite-eSATA      sun     at 20:30
   Run = Incremental       pool=Offsite-eSATA-Inc  mon-fri at 20:30
   Run = Differential      pool=Offsite-eSATA-Diff sat     at 20:30
}


I have recently reconfigured my system to use one pool "Offsite-eSATA" and
have set:

Pool = Offsite-eSATA

directly in all of the Job definitions instead of using the Schedule
override, but I am still seeing what you both have described.
    
Hi,
I've tried to increase the SD log level with the setdebug option, but no luck.
I've tried to look in the source, but it is quite complex, so no luck there either.

This is the code where the match fails:

  
static int is_pool_ok(DCR *dcr)
{
   DEVICE *dev = dcr->dev;
   JCR *jcr = dcr->jcr;

   /* Now check if we want the same Pool and pool type */
   if (strcmp(dev->pool_name, dcr->pool_name) == 0 &&
       strcmp(dev->pool_type, dcr->pool_type) == 0) {
      /* OK, compatible device */
      Dmsg1(dbglvl, "OK dev: %s num_writers=0, reserved, pool matches\n", dev->print_name());
      return 1;
   } else {
      /* Drive Pool not suitable for us */
      Mmsg(jcr->errmsg, _(
"3608 JobId=%u wants Pool=\"%s\" but have Pool=\"%s\" nreserve=%d on drive %s.\n"),
            (uint32_t)jcr->JobId, dcr->pool_name, dev->pool_name,
            dev->num_reserved(), dev->print_name());
      queue_reserve_message(jcr);
      Dmsg2(dbglvl, "failed: busy num_writers=0, reserved, pool=%s wanted=%s\n",
         dev->pool_name, dcr->pool_name);
   }
   return 0;
}
    
I suppose dev->pool_name was empty. This is confirmed by the code where the
status message is built:

  
         if (dev->is_labeled()) {
            len = Mmsg(msg, _("Device %s is mounted with:\n"
                              "    Volume:      %s\n"
                              "    Pool:        %s\n"
                              "    Media type:  %s\n"),
               dev->print_name(),
               dev->VolHdr.VolumeName,
               dev->pool_name[0]?dev->pool_name:"*unknown*",
               dev->device->media_type);
            sendit(msg, len, sp);
         } else {
    
But I can't find where this property is set.
It happens in some but not all volume changes, and I think it happens when the
storage, or probably a device, finishes all its running jobs.
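
To make the two quoted checks concrete, here is a minimal sketch using simplified stand-in structures. The names `sketch_dev`, `sketch_dcr`, `sketch_pool_ok` and `sketch_pool_label` are hypothetical and not Bacula's own; the real DEVICE and DCR structures are much larger. It only illustrates that an empty device-side pool name makes the strcmp comparison fail for any job with a non-empty pool, and that the same empty name is what the status output renders as "*unknown*".

```c
#include <assert.h>
#include <string.h>

#define SKETCH_NAME_LEN 128

/* Hypothetical stand-in for the DEVICE fields used in the quoted code. */
struct sketch_dev {
   char pool_name[SKETCH_NAME_LEN];
   char pool_type[SKETCH_NAME_LEN];
};

/* Hypothetical stand-in for the DCR fields used in the quoted code. */
struct sketch_dcr {
   char pool_name[SKETCH_NAME_LEN];
   char pool_type[SKETCH_NAME_LEN];
};

/* Same comparison as in is_pool_ok(): pool name and pool type must both
 * match exactly, so an empty device pool name never matches "dp". */
static int sketch_pool_ok(const struct sketch_dev *dev,
                          const struct sketch_dcr *dcr)
{
   assert(dev != NULL && dcr != NULL);
   return strcmp(dev->pool_name, dcr->pool_name) == 0 &&
          strcmp(dev->pool_type, dcr->pool_type) == 0;
}

/* Same fallback as in the status code: an empty name is displayed
 * as "*unknown*". */
static const char *sketch_pool_label(const struct sketch_dev *dev)
{
   assert(dev != NULL);
   return dev->pool_name[0] ? dev->pool_name : "*unknown*";
}
```

With the device pool name set to "" and the job asking for "dp", sketch_pool_ok() returns 0 and sketch_pool_label() returns "*unknown*", which is consistent with both the 3608 message and the status output seen in this thread. It does not explain where the name gets cleared.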

Can any Bacula guru or developer hear us?
  

I forgot:
My Bacula version is 5.0.2.
This issue appears to be related to bug 1541:
http://bugs.bacula.org/view.php?id=1541

Marco

  
--
Bill Arlofski
Reverse Polarity, LLC

------------------------------------------------------------------------------
This SF.net Dev2Dev email is sponsored by:

Show off your parallel programming skills.
Enter the Intel(R) Threading Challenge 2010.
http://p.sf.net/sfu/intel-thread-sfd
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users
    
