Bacula-users

Re: [Bacula-users] [Bacula-devel] VirtualFull backup: bacula selects the wrong read device

2009-10-11 13:18:06
Subject: Re: [Bacula-users] [Bacula-devel] VirtualFull backup: bacula selects the wrong read device
From: Kern Sibbald <kern AT sibbald DOT com>
To: Nicolae Mihalache <mache AT abcpages DOT com>
Date: Sun, 11 Oct 2009 19:15:01 +0200
On Sunday 11 October 2009 18:32:12 Nicolae Mihalache wrote:
> Hello, I can only test tomorrow because now I have some long running
> backups.

OK, no problem.

>
> But isn't your change affecting the jcr->dcr which is the writing device?

I think that is the point.  jcr->dcr (should really be named jcr->write_dcr)  
should not be changed when switching the read device, and I think my change 
corrects that problem.

>
> Have you tried running a virtual full or just a restore?

I ran both VirtualFull and restore jobs (the whole regression suite), but for 
the VirtualFull, we do not currently have any regression tests that require 
switching drives, which is why I would like to know if my fix works with your 
specific problem.  

> I guess for a 
> restore it is working because no writing device is being used.

Yes, that is exactly it.  During a restore, the jcr->dcr is not used, so if it 
is "destroyed" (well, changed) there is really no problem.  During a 
VirtualFull, Migration, or Copy, changing jcr->dcr will create problems, 
which is what I believe you are seeing.
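
To make the distinction concrete, here is a minimal sketch of the problem as I understand it. It is a toy model, not the actual Bacula code: the DEVICE, DCR and JCR structs are cut down to the fields discussed in this thread, and the two switch_read_device_*() functions are hypothetical names invented for the illustration.

```c
/* Hypothetical, heavily simplified stand-ins for the Storage daemon
 * structures; only the fields discussed in this thread are modelled. */
typedef struct { const char *name; } DEVICE;
typedef struct { DEVICE *dev; } DCR;
typedef struct {
   DCR *dcr;        /* write side (what would be "write_dcr") */
   DCR *read_dcr;   /* read side */
} JCR;

/* Models the old behaviour: switching the read device also
 * overwrites jcr->dcr, so the job loses its write device. */
DCR *switch_read_device_old(JCR *jcr, DCR *new_read)
{
   jcr->dcr = new_read;        /* write side clobbered */
   jcr->read_dcr = new_read;
   return new_read;
}

/* Models the intended behaviour: only the read side changes,
 * the write side is left alone. */
DCR *switch_read_device_new(JCR *jcr, DCR *new_read)
{
   jcr->read_dcr = new_read;
   return new_read;
}
```

In a restore only read_dcr matters, so both variants behave the same. In a VirtualFull, Migration, or Copy the job is still writing through jcr->dcr, so only the second variant leaves the write side usable after a read device switch.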

By the way, the new_dcr(), for me, is OK as is.  It does indeed create a new 
dcr, but if it is possible to reuse the existing one, it does rather than 
mallocing a new one.  In any case, any change to new_dcr() is a bit more 
disruptive than I would like to do just now.
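
For anyone else confused by the name, the reuse-or-allocate pattern looks roughly like the sketch below. This is an illustration under assumptions, not the real new_dcr(): the function name get_dcr(), its two-argument signature, and the cut-down DEVICE and DCR structs are all made up for the example.

```c
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the real structures. */
typedef struct { const char *name; } DEVICE;
typedef struct { DEVICE *dev; } DCR;

/* Reuse the caller's dcr when one is passed in; allocate a fresh one
 * only when there is nothing to reuse.  Returns NULL if malloc fails. */
DCR *get_dcr(DCR *existing, DEVICE *dev)
{
   DCR *dcr = existing;
   if (dcr == NULL) {
      dcr = (DCR *)malloc(sizeof(*dcr));
      if (dcr == NULL) {
         return NULL;
      }
   }
   dcr->dev = dev;             /* point the dcr at the requested device */
   return dcr;
}
```

The consequence is that the returned pointer can be the very same object that was passed in, now attached to a different device, which is why the choice of which existing dcr to hand in matters.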

Best regards,

Kern

>
>
> nicolae
>
> Kern Sibbald wrote:
> > Hello,
> >
> > That is *very* interesting.  Thanks for looking into this.  I think the
> > code has previously worked for us because we may not have explicitly
> > tested Migration and VirtualFull, but for restores it all works fine.
> >
> > Could you try one change to your fix?  Remove your fix, then change line
> >
> > reserve.c:637 from:
> >
> > >    rctx.jcr->dcr = dcr = new_dcr(rctx.jcr, rctx.jcr->dcr, rctx.device->dev);
> >
> > to
> >
> >    dcr = new_dcr(rctx.jcr, rctx.jcr->dcr, rctx.device->dev);
> >
> > I think this will solve the problem in a slightly cleaner way.
> >
> > If the above works, I'll be happy to integrate it and get it out in 3.0.3
> >
> > Best regards,
> >
> > Kern
> >
> > On Sunday 11 October 2009 12:56:05 Nicolae Mihalache wrote:
> >> I investigated a little more and found that if I replace the line
> >>
> > >>     rctx.jcr->dcr = dcr = new_dcr(rctx.jcr, rctx.jcr->dcr, rctx.device->dev);
> > >>
> > >> with
> > >>    if (rctx.store->append == SD_READ) {
> > >>       rctx.jcr->read_dcr = dcr = new_dcr(rctx.jcr, rctx.jcr->read_dcr, rctx.device->dev);
> > >>    } else {
> > >>       rctx.jcr->dcr = dcr = new_dcr(rctx.jcr, rctx.jcr->dcr, rctx.device->dev);
> > >>    }
> >>
> >> in reserve.c:637, it will work ok.
> >> I was confused by the function new_dcr which despite its name, doesn't
> >> create a new dcr if it already exists ( I think the name should be
> >> changed).
> >>
> >> nicolae
> >>
> >> Kern Sibbald wrote:
> >>> On Friday 09 October 2009 15:06:36 Nicolae Mihalache wrote:
> >>>> Sorry, I'm not quite sure I grasp your message.
> >>>> When you refer to Storage in "Changing Storage for restore and
> >>>> migration", you talk about Storage Daemon or the Storage resource in
> >>>> the director?
> >>>
> >>> I was not very precise.  I meant that changing Storage daemons is not
> >>> supported in any released version.  Changing storage devices is also
> >>> not officially supported in any released version, but there is code
> >>> that seems to work in most cases.
> >>>
> >>>> I have one Storage Daemon with three different devices and three
> >>>> corresponding storage resources in the director. What I want to do is
> >>>> to read from two of those devices and write into the third one.
> >>>> Changing between the two reading ones is not working. I did a bit of
> >>>> debugging, and I found that:
> >>>>
> >>>> In stored/acquire.c
> >>>>
> > >>>>       stat = search_res_for_device(rctx);
> > >>>>       release_reserve_messages(jcr);         /* release queued messages */
> > >>>>       unlock_reservations();
> > >>>>
> > >>>>       if (stat == 1) {
> > >>>>          dev = dcr->dev;                     /* get new device pointer */
> >>>>
> >>>> In stored/reserve.c:
> >>>>
> >>>>  ok = reserve_device_for_read(dcr);
> >>>>       if (ok) {
> >>>>          rctx.jcr->read_dcr = dcr;
> >>>>
> >>>>
> >>>> So basically the jcr->read_dcr is changed to point to a brand new dcr
> >>>> in reserve.c but the code in acquire.c is using the old dcr. Also in
> >>>> acquire.c there is a comment that says dcr pointer shouldn't change.
> >>>
> >>> If indeed the code in reserve.c is setting an entirely new dcr pointer,
> >>> then it is unlikely the code will work.  So that may be the problem you
> >>> are running into.  However, in all our tests, it does seem to work
> >>> providing you have different MediaTypes for each device. Though as I
> >>> say, it is not officially supported, simply because I don't have enough
> >>> confidence in the code. If you have the same MediaType for different
> >>> devices, then switching devices won't work.
> >>>
> >>> Regards,
> >>>
> >>> Kern
> >>>
> >>>> nicolae
> >>>>
> >>>> Kern Sibbald wrote:
> >>>>> Hello,
> >>>>>
> >>>>> You are attempting to use a feature that is not implemented.
> >>>>>
> >>>>> Changing Storage for restore and migration is not implemented in any
> >>>>> released version of Bacula.
> >>>>>
> >>>>> Changing of Storage for restore is implemented in 3.1.x (not yet
> >>>>> released). However, it is unlikely that it works for migration.
> >>>>>
> >>>>> Changing of devices within a given Storage should be implemented, for
> >>>>> reading of any kind, but we do not guarantee that it works -- in your
> >>>>> email, you did not specify much information so I cannot be more
> >>>>> specific.
> >>>>>
> >>>>> Regards,
> >>>>>
> >>>>> Kern
> >>>>>
> >>>>> On Friday 09 October 2009 12:37:50 Nicolae Mihalache wrote:
> >>>>>> Hello,
> >>>>>>
> > >>>>>> I run Full backups on MediaType=LTO-4 and Incremental backups on
> >>>>>> MediaType=File. When running a VirtualFull into a tmp pool
> >>>>>> (MediaType=TmpFile), bacula starts correctly reading the full backup
> >>>>>> from LTO-4. However, when the Incremental starts, it still wants to
> >>>>>> use the LTO-4 device instead of switching to the file device:
> >>>>>>
> > >>>>>> 09-Oct 11:43 bacula-sd JobId 8889: End of Volume at file 46 on device "HP" (/dev/st0), Volume "GROUPS-01"
> > >>>>>> 09-Oct 11:44 bacula-sd JobId 8889: acquire.c:116 Changing read device. Want Media Type="File" have="LTO-4" device="HP" (/dev/st0)
> > >>>>>> 09-Oct 11:44 bacula-sd JobId 8889: Media Type change.  New read device "HP" (/dev/st0) chosen.
> > >>>>>> 09-Oct 11:44 bacula-sd JobId 8889: Invalid slot=0 defined in catalog for Volume "aa-work-0125" on "HP" (/dev/st0). Manual load may be required.
> >>>>>>
> > >>>>>> From these logs it seems it wants to change the read device but for
> > >>>>>> some reason selects the same (wrong) one.
> >>>>>>
> >>>>>>
> >>>>>> Thanks for any hints.
> >>>>>> nicolae
> >>>>
> >>>> ----------------------------------------------------------------------
> >>>>-- --- --- Come build with us! The BlackBerry(R) Developer Conference
> >>>> in SF, CA is the only developer event you need to attend this year.
> >>>> Jumpstart your developing skills, take BlackBerry mobile applications
> >>>> to market and stay ahead of the curve. Join us from November 9 - 12,
> >>>> 2009. Register now! http://p.sf.net/sfu/devconference
> >>>> _______________________________________________
> >>>> Bacula-devel mailing list
> >>>> Bacula-devel AT lists.sourceforge DOT net
> >>>> https://lists.sourceforge.net/lists/listinfo/bacula-devel



_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users