Bacula-users

Re: [Bacula-users] [Bacula-devel] Minor SD Feature Request

2008-05-13 12:41:07
Subject: Re: [Bacula-users] [Bacula-devel] Minor SD Feature Request
From: Kern Sibbald <kern AT sibbald DOT com>
To: "Weber, Philip" <Philip.Weber AT egg DOT com>
Date: Tue, 13 May 2008 17:50:19 +0200
Hello Phil,

Thanks for the feedback and for confirming my concerns.  I don't plan to
change the behavior, but sometime in the future we may add new code along
the lines of what was discussed in the Feature request.  However, if we do
so, it will require the Bacula sysadmin to specify exactly which slots the
SD can consider that it "owns" -- that way, things should keep working for
you as they currently do.

Beyond sites like yours, it is critical to maintain the current behavior
(i.e. Bacula doesn't consider that it owns the whole autochanger) so that
we can phase Bacula into sites that are replacing their current backup
solution over a period of time; sharing an autochanger will be important
for that.

Best regards,

Kern

On Tuesday 13 May 2008 11:26:10 Weber, Philip wrote:
> We have Bacula set up to share a large library with another backup
> system.  I think that if Bacula was changed to assume it had control of
> the whole library or a range of slots, it would cause us problems - no
> doubt we could script around those problems but I hope we don't have to.
> As it is, with both backup systems putting tapes back into the slots
> from which they originally took them, the two systems co-exist quite well.
> We do tend to have problems when we load new sets of tapes (through the
> other backup system), but this is something I expect to solve with a bit
> of scripting outside of both backup systems, which is how I think things
> like this should work.  I'll be looking at the script attached earlier -
> thanks.
>
> cheers, Phil
>
> Phil Weber MBCS CITP
> Storage Technical Services - Senior UNIX Technologist
> Business Technology
>
> Egg Banking plc
>
> -----Original Message-----
> From: bacula-users-bounces AT lists.sourceforge DOT net
> [mailto:bacula-users-bounces AT lists.sourceforge DOT net] On Behalf Of Kern
> Sibbald
> Sent: 11 May 2008 16:30
> To: bacula-devel AT lists.sourceforge DOT net
> Cc: Arno Lehmann; bacula-users AT lists.sourceforge DOT net
> Subject: Re: [Bacula-users] [Bacula-devel] Minor SD Feature Request
>
> On Sunday 11 May 2008 12:45:28 Arno Lehmann wrote:
> > Hi,
> >
> > 11.05.2008 11:37, Kern Sibbald wrote:
> > > Hello,
> > >
> > > > OK, thanks. You have confirmed what I suspected.  In effect, this is
> > > > really a support problem - I suspect you do not fully understand how
> > > > Bacula works and its limitations (explained below).
> >
> > Well, this is something that we discussed on the -users list, and as
> > far as I can tell, Blake pretty well understands the way Bacula works
> > and has implemented procedures to do things right,
>
> Well, I am happy to hear that.
>
> > but the autochanger
> > itself is causing the trouble. (By loading imported volumes to
> > whatever slots are available automatically).
>
> Well, if one follows the full procedure I outlined, the problems mentioned
> would not happen unless I am missing some other problem.
>
> > > First, without some additional design and coding, it is not possible for
> > > Bacula to snoop around on the autochanger for an available slot in which
> > > to unload a volume.
> >
> > Well, with the discussed slots status query by the SD a good part of
> > the design would already exist.
>
> The SD can query slots, but it cannot know what slot to use.  That
> information comes only from the Director, and the Director gets it from
> what the user (via bconsole commands) put into the catalog.  Without the
> appropriate entries in the catalog (or perhaps some express user command),
> the SD by itself will not directly access any slots.  This permits sharing
> of autochangers without having the autochanger physically partitioned,
> which is most often not possible.  It is a fundamental design concept of
> the current Bacula code, and as I mentioned below, it can be changed, but
> would require a design, a feature request, and scheduling implementation.
> It probably involves new directives and new communications between the DIR
> and the SD.
>
> > >  The autochanger may have hundreds of slots, with only a few
> > > available for Bacula, and currently there is no way to tell Bacula that
> > > it "owns" slots n-m  (this could be a future enhancement).  As a
> > > consequence, with the current design, Bacula must always unload a volume
> > > into the slot from which it came.
> >
> > I disagree... note that I'm not talking about shared autochangers
> > (which would best be shared by partitioning a big library, i.e.
> > relying on the library hardware to keep track of which slots belong to
> > which logical autochanger).
> >
> > So I assume Bacula has the autochanger it sees all for itself.
>
> That is not an assumption that Bacula makes, and if we changed it to make
> that assumption, without the project mentioned above, my guess is that it
> would totally break things in some large shops.
>
> Someone used to big shops like David Boyes might be able to give a bit more
> insight here.
>
> > In this case it would be possible to list the slots, look for unused
> > ones, and unload the current tape to one of these, updating that
> > catalog accordingly (the mtx-changer script also supports this, by the
> > way).
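As a rough illustration of the "list the slots, look for unused ones" idea, here is a minimal Python sketch. It parses text in the style of `mtx status` output; the sample text and its exact layout are assumptions (real output differs between mtx versions and libraries), and the function names are hypothetical. It also shows why an empty slot is not automatically a free one: under the current design, the home slot of a tape sitting in a drive looks empty but must stay reserved.

```python
import re

# Sample text in the style of `mtx status`; the layout is an assumption
# and should be checked against the output of the actual library.
SAMPLE_STATUS = """\
  Storage Changer /dev/sg0:1 Drives, 4 Slots ( 0 Import/Export )
Data Transfer Element 0:Full (Storage Element 2 Loaded):VolumeTag = VOL002
      Storage Element 1:Full :VolumeTag=VOL001
      Storage Element 2:Empty
      Storage Element 3:Empty
      Storage Element 4:Full :VolumeTag=VOL004
"""

SLOT_RE = re.compile(r"^\s*Storage Element (\d+):(Full|Empty)")
LOADED_RE = re.compile(r"\(Storage Element (\d+) Loaded\)")


def empty_slots(status_text):
    """Return the slot numbers reported as Empty."""
    slots = []
    for line in status_text.splitlines():
        match = SLOT_RE.match(line)
        if match and match.group(2) == "Empty":
            slots.append(int(match.group(1)))
    return slots


def home_slots_in_use(status_text):
    """Return the slots whose volume is currently loaded in a drive."""
    return [int(n) for n in LOADED_RE.findall(status_text)]


def free_slots(status_text):
    """Empty slots that are not the home slot of a loaded volume."""
    reserved = set(home_slots_in_use(status_text))
    return [s for s in empty_slots(status_text) if s not in reserved]


print(empty_slots(SAMPLE_STATUS))  # slots 2 and 3 are reported Empty
print(free_slots(SAMPLE_STATUS))   # only slot 3 is actually free
```

In a shared autochanger this check alone would not be enough, because the other backup system's tapes and reserved slots are not visible in this output.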
>
> That is possible, and I mentioned it, but it falls into a new design needing
> new code, and hence is not something that can be simply "patched" in -- i.e.
> it is not a bug fix.
>
> > > Second, from the above you should have gathered that if you manually load
> > > a volume into a slot where Bacula has loaded a volume from that slot into
> > > a drive, at some point everything is going to fail as you are seeing.
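The failure described here can be reproduced with a toy model. This is only a sketch of the "return the volume to the slot it came from" rule, not Bacula's actual code; the class, exception and volume labels are all hypothetical.

```python
class SlotFull(Exception):
    """Raised when a volume cannot be returned to its home slot."""


class Changer:
    """Toy autochanger: slots map slot number -> volume label or None."""

    def __init__(self, slots):
        self.slots = dict(slots)
        self.drive = None  # (label, home_slot) while a volume is loaded

    def load(self, slot):
        # A load implies unloading whatever is in the drive first.
        if self.drive is not None:
            self.unload()
        label = self.slots[slot]
        if label is None:
            raise ValueError(f"slot {slot} is empty")
        self.slots[slot] = None
        self.drive = (label, slot)

    def unload(self):
        # The volume may only go back to the slot it came from.
        label, home = self.drive
        if self.slots[home] is not None:
            raise SlotFull(
                f"cannot return {label}: slot {home} holds {self.slots[home]}"
            )
        self.slots[home] = label
        self.drive = None


changer = Changer({1: "FULL-01", 2: "INCR-01", 3: None})
changer.load(2)                  # INCR-01 goes from slot 2 into the drive
changer.slots[2] = "OFFSITE-07"  # operator puts a new tape into slot 2

try:
    changer.load(1)              # needs to unload INCR-01 first
    outcome = "loaded"
except SlotFull as exc:
    outcome = str(exc)

print(outcome)
```

Note that the error surfaces during the next load request rather than at the moment the operator fills the slot, which is why the problem tends to appear later, when a drive first has to swap tapes.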
>
> > Yup, though the term "manually" is misleading in this scenario...
>
> I meant that it was something done by human intervention rather than by
> Bacula.  In some autochangers like mine, new volumes are introduced manually
> directly into the slot.  With bigger autochangers, there are mail slots and
> such through which the operator can manually enter new volumes that are then
> loaded into appropriate slots by the autochanger. There are probably even
> other schemes ...
>
> > > When you change something in the autochanger the preferred way of doing
> > > so is:
> > >
> > > -- first unmount all drives that Bacula has mounted
> > > -- change the autochanger volumes
> > > -- do an update slots
> > > -- finally remount the drives with Bacula.
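The four steps above can be sketched as a small Python helper that only builds the console command lines and does not run anything. The `storage=` and `drive=` keyword spellings are how I recall the bconsole syntax and should be checked against the console manual for the installed version; the storage name "Autochanger" and the function names are hypothetical.

```python
def before_swap(storage, drives):
    """Commands to issue before touching the tapes: unmount every drive."""
    return [f"unmount storage={storage} drive={d}" for d in drives]


def after_swap(storage, drives):
    """Commands to issue after the tapes are changed.

    First refresh the catalog's view of the slots, then remount the drives.
    """
    cmds = [f"update slots storage={storage} drive={drives[0]}"]
    cmds += [f"mount storage={storage} drive={d}" for d in drives]
    return cmds


# Example for a two-drive autochanger (name assumed for illustration).
for line in before_swap("Autochanger", [0, 1]):
    print(line)
print("(operator changes the volumes in the autochanger)")
for line in after_swap("Autochanger", [0, 1]):
    print(line)
```

Keeping the two phases as separate functions makes it harder to skip the unmount step, which is the one the rest of the procedure depends on.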
> >
> > I think Blake knows that.
>
> I believe that if he were doing at least the first step his major complaint
> in the Feature request would be completely resolved.  This would not resolve
> the additional problems he is seeing if the procedure for doing the update
> slots fails.
>
> > > It is possible to rearrange the volumes in the autochanger without
> > > unloading all the drives providing that Bacula doesn't want to
> > > load/unload any volume while you are changing things in the
> > > autochanger.  I strongly recommend against doing this, but it is possible
> > > in a situation where Bacula is running a job.  Doing so is not without
> > > risks though.
> > >
> > > If you don't follow these simple rules, Bacula will sooner or later fail,
> > > and probably the worst case is if you load a volume into a slot whose
> > > own volume is currently in one of the drives.
> > >
> > > I do believe that we could improve how Bacula handles Volumes found in
> > > Slots where they are not expected, and I will look at that, but for the
> > > moment, having Bacula unload a volume into a different slot than from
> > > where it came is a much bigger project that if well designed and accepted
> > > would be a feature after the next major release (3.0.0).
> >
> > Well, I won't argue here, but I believe the design work needed is not
> > that complex.
>
> I agree with you it is not that complex. Clarification: "complex" is not a
> word I used or meant to imply.
>
> Best regards,
>
> Kern
>
> > > Summary:
> > > - I cannot accept your Feature Request as formulated without additional
> > > design work so that it won't break shared autochangers.
> > >
> > > - You can resolve your problems by implementing improved sysadmin
> > > procedures.
> >
> > Perhaps... ok, attached is a starting point. This is a script I use to
> > help managing autoloaders, especially unloading full volumes and
> > loading new ones.
> >
> > I recommend that you very carefully test it - it's more a hack that
> > grew into a rather large program (at least for my coding skills...)
> > and I'm quite sure it can be improved a lot.
> >
> > If I had the time I know that I could rework much of it to become more
> > generally usable and better structured.
> >
> > This script is known to work in production environments, but still - no
> > warranties, you are all on your own, and so on.
> >
> > Arno
> >
> > > Regards,
> > >
> > > Kern
> > >
> > > PS: When unmounting, you do specify an Autochanger, but since
> > > autochangers may have multiple drives, you must specify which drive of
> > > the autochanger.  If you have only one drive, entering a return at the
> > > question is all that is necessary to do the right thing.
> > >
> > > On Saturday 10 May 2008 19:22:47 Blake Dunlap wrote:
> > >>> Hello Blake,
> > >>>
> > >>> One part of Bacula that I would like to improve just a bit (not too
> > >>> much coding for the moment) for the next release is the information
> > >>> returned for Autochangers.  Currently, it seems to me that the sysadmin
> > >>> has very little information about the actual state of the autochanger
> > >>> via the console interface.  Although your suggestion seems to be a bit
> > >>> more than simple reporting of the status, I am interested in it.  The
> > >>> problem is that I don't understand what you are asking for well enough
> > >>> to possibly implement something.
> > >>>
> > >>> Could you be much more explicit with what you want, perhaps giving an
> > >>> explicit example of what happens now and what you would like to see
> > >>> happen.
> > >>>
> > >>> Don't forget that at the current time, Bacula has no concept of
> > >>> changing the slot -- for example, when a Volume is loaded by Bacula
> > >>> from Slot 2 into the drive, it *must* be returned to the same Slot.
> > >>> Changing this behavior is a project that would require significant
> > >>> design and thought and is probably not something we would want to
> > >>> implement in the near future.
> > >>>
> > >>> On the other hand, I think there is a lot of need and possibility for
> > >>> making Bacula much smarter at automatically recognizing that a Volume
> > >>> is in a different Slot from what is written in the database.  Currently
> > >>> such volumes are marked in error (if I remember right), but we could
> > >>> consider simply correcting the info in the database.
> > >>>
> > >>> Best regards,
> > >>>
> > >>> Kern
> > >>
> > >> It is the last paragraph that I am mostly looking at dealing with. Let
> > >> me give our situation in depth and I think that will explain what I am
> > >> looking for.
> > >>
> > >> We have a 2 drive auto-changer and run 4 pools of backups (Incremental,
> > >> OnSiteFull, OffsiteFull, and OnsiteMonthly). We run two sets of backups
> > >> for clients, an offsite backup that runs every Friday night (due to the
> > >> lack of copy pools etc), and the OnSite backups which occur every night
> > >> incremental, except Saturday night which is a full (the pool is
> > >> overridden to Monthly the first sat of a month). Anyway we rotate the
> > >> Offsite tapes every Tuesday, and supposedly there is an update slots run
> > >> with all drives released at the conclusion of the procedure which should
> > >> update the database as to the current state of the auto-changer.
> > >>
> > >> Now that the back story is established, what has been extremely
> > >> frustrating is that a decent percentage of the time, something occurs
> > >> which places the tapes out of sync, and come Saturday night (the first
> > >> night a drive would have to swap) the auto-changer fails to load a new
> > >> tape it is looking for in the OnsiteFull pool, due to the tape that was
> > >> in the drive failing to unload due to a slot full condition. Bacula now
> > >> requests user intervention loading the tape, and the drive is marked
> > >> unloaded (because the error didn't occur during an unload event, but a
> > >> load event, which makes it a pain to determine what tape is actually
> > >> loaded in the drive currently). To fix this, one must run an update
> > >> slots, then look back in the logs to figure out what tape failed to
> > >> unload, then "load" that tape into the drive, and Bacula will then
> > >> realize the drive is usable again, and then proceed as normal. Of course
> > >> due to the times we run backups, this has to occur in the middle of the
> > >> night, or potentially the next day which impacts backups, and the
> > >> general network.
> > >>
> > >> I believe this is an error condition that could reasonably be dealt with
> > >> programmatically instead of requiring user intervention (An automatic
> > >> slot refresh before unloading tapes / loading tapes (with an assumed
> > >> lifetime validity of say 10 minutes to reduce occurrences) would be one
> > >> solution).
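The "slot refresh with an assumed lifetime validity" suggestion amounts to a time-limited cache of the slot inventory. A minimal Python sketch follows, under the assumption that some callable can query the autochanger; the class name, the `query` callable and the injected clock are hypothetical and are not part of Bacula.

```python
import time


class SlotInventory:
    """Cache a slot listing and re-query it once it is older than max_age."""

    def __init__(self, query, max_age=600.0, clock=time.monotonic):
        self._query = query      # callable returning {slot: label or None}
        self._max_age = max_age  # validity in seconds (10 minutes here)
        self._clock = clock
        self._slots = None
        self._stamp = None

    def get(self):
        now = self._clock()
        stale = self._slots is None or now - self._stamp > self._max_age
        if stale:
            self._slots = self._query()
            self._stamp = now
        return self._slots


# Demonstration with a fake query and a fake clock.
calls = []


def fake_query():
    calls.append(1)
    return {1: "VOL001", 2: None}


now = [0.0]
inventory = SlotInventory(fake_query, max_age=600.0, clock=lambda: now[0])

inventory.get()   # first call queries the changer
now[0] = 300.0
inventory.get()   # still within 10 minutes: served from the cache
now[0] = 901.0
inventory.get()   # older than 10 minutes: queried again

print(len(calls))
```

A load or unload would call `get()` first, so a slot filled by an operator more than ten minutes earlier is noticed before the move is attempted. A change made inside the validity window would still be missed, which is the trade-off the lifetime is meant to bound.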
> > >>
> > >> Let me know if I need to add anything further, as I tried to be as
> > >> detailed as possible in this response, as compared to the quick summary
> > >> of the actual feature request. From a user perspective, I do agree that
> > >> auto-changer support feels more tacked on than anything (for example,
> > >> the requirement to specify a drive instead of an auto-changer when doing
> > >> an update slots command) and would love to see improvements in that
> > >> regard.
> > >>
> > >> -Blake



_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users
