Bacula-users

Re: [Bacula-users] Fwd: Migration Jobs - Won't automount tape, hangs "waiting execution"

2008-06-11 04:32:20
Subject: Re: [Bacula-users] Fwd: Migration Jobs - Won't automount tape, hangs "waiting execution"
From: Kern Sibbald <kern AT sibbald DOT com>
To: "Mingus Dew" <shon.stephens AT gmail DOT com>
Date: Wed, 11 Jun 2008 10:32:03 +0200
Hello Shon,

On Tuesday 10 June 2008 18:10:44 Mingus Dew wrote:
> Kern,
>      I am forwarding this message to you because I haven't gotten any
> answer on the list. I don't most of the time. I hope you might be able to
> explain to me what is happening, or at least how to determine the cause.

Unfortunately, I am not sure that upgrading from 2.2.5 to 2.2.8 will help, 
though you can try it since it should be a simple option. I don't recall any 
changes in the area of max storage or tape loading between those two 
versions.

There was a bug with restores indefinitely reserving a drive in several of the 
2.2.x versions and there have been a number of improvements in mounting 
volumes since version 2.2.8.

One thing that you should do is a "llist volume=xxx"  on the Volume that was 
not automatically loaded, because although you could manually mount it, the 
appropriate information may not be in the catalog (defined slot and InChanger 
set) so that it could be automatically loaded.  This in my opinion is the 
most likely problem.

In any case, I would recommend upgrading to 2.4.0, where all these kinds of 
problems *should* be resolved.  

If the problems are not resolved in 2.4.0, about the only way to resolve these 
kinds of "complex" problems is to create a bug report and include all the 
information you provided below (very good), and in addition turn on debug 
level 150 (-d150 on command line or use setdebug) in the storage daemon and 
capture the output for inclusion in the bug report.  That debug output gives 
us detailed information on what happened up to the point where it blocks.

If you rely on Blastwave for binaries, you might ask them to upgrade from 
their 2.2.8 version to 2.4.0 -- they may not be subscribed to bacula-announce 
and thus not realize that it has been released or not yet had the time to 
build it ...

Best regards,

Kern

>
> Thanks,
> Shon
>
> ---------- Forwarded message ----------
> From: Mingus Dew <shon.stephens AT gmail DOT com>
> Date: Tue, Jun 10, 2008 at 9:01 AM
> Subject: Migration Jobs - Won't automount tape, hangs "waiting execution"
> To: bacula-users <bacula-users AT lists.sourceforge DOT net>
>
>
> All,
>     I am using Version: 2.2.5 (09 October 2007) i386-pc-solaris2.8 solaris
> 5.8 (installed from Blastwave) on a Solaris 10 system. My tape drive is an
> Exabyte Magnum 224 LTO-3 w/ autochanger. My disk and tape jobs run
> regularly with little intervention (sometimes disk jobs fail trying to open
> a new volume while another is in use and I haven't been able to fix that
> either).
>
>     I recently began trying to configure Migration jobs again to move my
> disk jobs to tape on a regular basis. Here are my relevant config
> statements Pool {
>   Name = My_Disks
>   Pool Type = Backup
>   Storage = My_Files
>   Recycle = yes
>   AutoPrune = yes
>   Recycle Oldest Volume = yes
>   Volume Retention = 21d
>   Maximum Volumes = 3
>   LabelFormat = "${Pool}-${NumVols}"
>   Next Pool = My_Tapes
> }
>
> Pool {
>   Name = My_Tapes
>   Pool Type = Backup
>   Recycle = yes
>   AutoPrune = yes
>   Volume Retention = 27d
>   Cleaning Prefix = CLN
>   Storage = Exabyte_224
> }
>
> Job {
>   Name = "My_Weekly_Archive"
>   Type = Migrate
>   Client = my.director
>   FileSet = "Generic_Fileset"
>   Selection Type = SQLQuery
>   Selection Pattern = "select distinct Job.JobId as JobId from
> Client,Job,JobMedia,Media,Pool where Pool.Name = 'My_Disks' and
> Client.ClientId = Job.ClientId and JobStatus = 'T' and JobMedia.JobId =
> Job.JobId and JobMedia.MediaId = Media.MediaId and Job.PoolId = Pool.PoolId
> and Job.StartTime >= '2008-05-26 00:00:00' and Job.StartTime <= '2008-06-01
> 23:59:59' order by Job.JobId"
>   Storage = My_Files
>   Pool = My_Disks
>   Messages = Standard
> }
>
> I manually ran this job last night from bconsole. The first issue that I
> encountered was that the Job did not automount the tape, but rather
> required me to mount it manually. I only found this out after checking the
> director status. The tape the job was requesting was in the library, and a
> simple mount command did mount the volume.
>
> This morning, I checked the status in the director and found this
>
> Running Jobs:
>  JobId Level   Name                       Status
> ======================================================================
>   6715 Full    My_Weekly_Archive.2008-06-09_18.36.09 is waiting on max
> Storage jobs
>   6718 Full    My_Weekly_Archive.2008-06-09_18.36.12 is waiting on max
> Storage jobs
>   6719 Increme  mt-ora2_Daily_Disk.2008-06-09_18.36.13 is waiting execution
>   6720 Full    My_Weekly_Archive.2008-06-09_18.36.14 is waiting on max
> Storage jobs
>   6721 Increme  mt-www1_Daily_Disk.2008-06-09_18.36.15 is waiting execution
>   6722 Full    My_Weekly_Archive.2008-06-09_18.36.16 is waiting on max
> Storage jobs
>   6723 Increme  mt-adm1_Daily_Disk.2008-06-09_18.36.17 is waiting execution
>   6724 Full    My_Weekly_Archive.2008-06-09_18.36.18 is waiting on max
> Storage jobs
>   6725 Increme  mt-adm2_Daily_Disk.2008-06-09_18.36.19 is waiting execution
>   6726 Full    My_Weekly_Archive.2008-06-09_18.36.20 is waiting on max
> Storage jobs
>   6727 Increme  mt-ora2_Daily_Disk.2008-06-09_18.36.21 is waiting execution
>   6728 Full    My_Weekly_Archive.2008-06-09_18.36.22 is waiting on max
> Storage jobs
>   6729 Increme  mt-www1_Daily_Disk.2008-06-09_18.36.23 is waiting execution
>   6730 Full    My_Weekly_Archive.2008-06-09_18.36.24 is waiting on max
> Storage jobs
>   6731 Increme  mt-adm1_Daily_Disk.2008-06-09_18.36.25 is waiting execution
>   6732 Full    My_Weekly_Archive.2008-06-09_18.36.26 is waiting on max
> Storage jobs
>   6733 Increme  mt-adm2_Daily_Disk.2008-06-09_18.36.27 is waiting execution
>   6734 Full    My_Weekly_Archive.2008-06-09_18.36.28 is waiting on max
> Storage jobs
>   6735 Increme  mt-ora2_Daily_Disk.2008-06-09_18.36.29 is waiting execution
>   6736 Full    My_Weekly_Archive.2008-06-09_18.36.30 is waiting on max
> Storage jobs
>   6737 Increme  mt-www1_Daily_Disk.2008-06-09_18.36.31 is waiting execution
>   6738 Full    My_Weekly_Archive.2008-06-09_18.36.32 is waiting on max
> Storage jobs
>   6739 Increme  mt-adm1_Daily_Disk.2008-06-09_18.36.33 is waiting execution
>   6740 Full    My_Weekly_Archive.2008-06-09_18.36.34 is waiting on max
> Storage jobs
>   6741 Increme  mt-adm2_Daily_Disk.2008-06-09_18.36.35 is waiting execution
>
> There were no running jobs. I decided to check the storage status.
>
> Select Storage resource (1-7): 7
> Connecting to Storage daemon Exabyte_224 at my.director.:9103
>
> my.storage Version: 2.2.5 (09 October 2007) i386-pc-solaris2.8 solaris 5.8
> Daemon started 03-Jun-08 12:50, 389 Jobs run since started.
>  Heap: heap=3,268,608 smbytes=2,484,196 max_bytes=2,820,635 bufs=211
> max_bufs=396
> Sizes: boffset_t=8 size_t=4 int32_t=4 int64_t=8
>
> Running Jobs:
> No Jobs running.
> ====
>
> Jobs waiting to reserve a drive:
> ====
>
> Terminated Jobs:
>  JobId  Level    Files      Bytes   Status   Finished        Name
> ===================================================================
>   6803  Incr         44    637.3 K  OK       10-Jun-08 05:00
> ad-eweb2_Daily_Disk
>   6802  Incr         46    21.74 M  OK       10-Jun-08 05:00
> ad-eweb1_Daily_Disk
>   6804  Incr      1,220    457.7 M  OK       10-Jun-08 05:00
> ad-web1_Daily_Disk
>   6806  Incr         67    45.22 M  OK       10-Jun-08 05:00
> ad-ora2_Daily_Disk
>   6805  Incr        130    26.80 M  OK       10-Jun-08 05:01
> ad-ora1_Daily_Disk
>   6798  Incr        121    468.2 M  OK       10-Jun-08 05:02
> ad-crmdb1_Daily_Disk
>   6800  Incr         68    709.5 M  OK       10-Jun-08 05:03
> ad-crmweb1_Daily_Disk
>   6807  Incr        195    346.5 M  OK       10-Jun-08 05:04
> ad-pdf1_Daily_Disk
>   6801  Incr         68    789.5 M  OK       10-Jun-08 05:04
> ad-crmweb2_Daily_Disk
>   6808  Incr        850    409.9 M  OK       10-Jun-08 05:04
> ad-pdf2_Daily_Disk
> ====
>
> Device status:
> Autochanger "Magnum_224" with devices:
>    "Ultrium-TD3" (/dev/rmt/0cbn)
> Device "Ultrium-TD3" (/dev/rmt/0cbn) is mounted with:
>     Volume:      A00025
>     Pool:        My_Tapes
>     Media type:  LTO-3
>     Slot 2 is loaded in drive 0.
>     Total Bytes=558,751,334,400 Blocks=8,661,199 Bytes/block=64,512
>     Positioned at File=563 Block=0
> ====
>
> In Use Volume status:
> A00025 on device "Ultrium-TD3" (/dev/rmt/0cbn)
>     Reader=0 writers=0 reserved=0
> ====
>
> Data spooling: 0 active jobs, 0 bytes; 7 total jobs, 1,747,693,853,212 max
> bytes/job.
> Attr spooling: 0 active jobs, 0 bytes; 7 total jobs, 486,702 max bytes.
>
> I'm not sure what else to even consider. I would like to try and figure out
> why it was waiting on max Storage, with no other jobs running. This is the
> 2nd time this has happened while testing Migration Jobs.
> I am not ready to shout "bug", but it could be possible. I was considering
> upgrading to 2.2.8 (latest available on Blastwave) to see if that helped
> the situation or not. Any ideas out there?
>
> -Shon



-------------------------------------------------------------------------
Check out the new SourceForge.net Marketplace.
It's the best place to buy or sell services for
just about anything Open Source.
http://sourceforge.net/services/buy/index.php
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users

<Prev in Thread] Current Thread [Next in Thread>