Bacula-users

Re: [Bacula-users] Vchanger attempts to mount 5 volumes before stopping and requesting magazine change

2013-04-12 04:56:53
From: Leon White <leon.white AT greenpeace DOT org>
To: Bill Arlofski <waa-bacula AT revpol DOT com>
Date: Fri, 12 Apr 2013 16:47:52 +0800
Hi guys,

sorry for the delay in replying.

To clarify what I mean by 'without typing anything in the terminal', I was hoping to try out the web interface in the future to see if it can be used to start jobs. If the benablemag and update slots commands could be automated in some way, this would make the system sufficiently user friendly that I could just document the mechanism behind it and move on to other projects. Hopefully a few practice runs would be all someone needs to understand the process of switching the drives. Ideally, a message would pop up in the web UI saying 'drive full, insert next drive' or something similarly user friendly and idiot proof.

That said, I am sticking to manually issuing the commands for now until I at least have a full backup completed. I decided not to use Bill's new script just yet, mainly because it takes a long time to cycle through all the non-mounted drives and disable them, even if they are already disabled. We have around 18 magazines (2TB drives have 2 magazines), each with 186 volumes, so it just took too long watching text scroll on the screen for what should be a quick drive switch operation. Instead, I issue something like 'benablemag 5 0' followed by 'benablemag 6 1', and set all drives to disabled immediately after formatting and adding the UUID to vchanger config.
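
As a rough illustration of the swap step described above, the two benablemag calls could be wrapped in a small helper. This is only a sketch: it assumes the benablemag script is on the PATH and takes the magazine number followed by 0 (disable) or 1 (enable), as in the commands quoted above. The function names swap_commands and run_swap are made up for this example.

```python
import subprocess
from typing import List


def swap_commands(old_mag: int, new_mag: int) -> List[List[str]]:
    """Return the two benablemag invocations for one drive swap."""
    if old_mag == new_mag:
        raise ValueError("old and new magazine numbers must differ")
    return [
        ["benablemag", str(old_mag), "0"],  # disable volumes on the outgoing magazine
        ["benablemag", str(new_mag), "1"],  # enable volumes on the incoming magazine
    ]


def run_swap(old_mag: int, new_mag: int, dry_run: bool = True) -> None:
    """Print the commands (dry run) or execute them in order."""
    for argv in swap_commands(old_mag, new_mag):
        if dry_run:
            print(" ".join(argv))
        else:
            # Assumption: benablemag exits non-zero on failure.
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run_swap(5, 6)  # dry run: only prints the two commands
```

Calling run_swap(5, 6, dry_run=False) would execute both commands, touching only the two magazines involved instead of cycling through all of them.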

This has created a new problem: during the full backup which I have been running over the past days, if a drive sits in the changer for a few hours before the next drive in sequence is loaded, the bconsole messages indicate that bacula is automatically "Pruning oldest volume "gpea-backup-0010-0001"" or something similar. This is obviously because I haven't played with the volume recycling settings yet. Does this mean that the backup set is incomplete? Have the volumes been erased in some way, or just marked as next in line for recycling, but not overwritten?

Thanks for the help, this is quite an adventure! 

Cheers,
Leon


On Wed, Apr 3, 2013 at 11:00 PM, Bill Arlofski <waa-bacula AT revpol DOT com> wrote:
On 04/03/13 04:59, Leon White wrote:
> Hi Bill and Josh,
>
> I just want to say again how helpful your support and development skills have
> been, not just for our system here but also helping a relative novice like me
> get my head around scripting approaches!

Happy to help out. :)


> Bill, your latest script serves to check which magazine is currently inserted
> and then runs benablemag with the appropriate variable for that magazine, correct?

Yes.   BUT - more importantly - it also calls Josh's benablemag script to
DISABLE all the volumes on any magazines that are not available at the time
you run my script. That is important to remember.


> Our use case here involves very infrequent backups (6 month cycle) of very
> large amounts of data (30 TB). As such, we don't have much use for timed jobs
> because we plan to run it manually once the hard drives have been retrieved
> from safe storage every 6 months.

Ahhh. I see.


> The main issue is the ability for a luddite
> to be able to switch drives in the autochanger "when the light stops flashing"
> without typing anything into any terminal anywhere.
> Preparing the drives for
> first use and then disabling all volumes works fine with my clunky scripts,
> but I wonder how to integrate the "benablemag $thismag 1" command with udev so
> that a newly inserted drive is enabled by the system (rather than a bacula
> job), followed by "bcommand update slots" or something similar (as this
> command currently also fails). This should only occur for drives passing
> through the /etc/auto.vchanger rule, or whatever the equivalent is in udev
> (I'm still very fuzzy on this point). Then after the customary 3 minute wait,
> backup would continue with no user interaction. Rinse and repeat until the job
> is done.


Leon, I am confused as to when or how you intend to start a job. You do
mention "we plan to run it manually" but you also say "without typing anything
into any terminal anywhere"

Those two things are somewhat mutually exclusive.

Even if it has to be "just unplug drive 1, plug in drive 2 and walk away"
simple, you still need a way to start a job.


If someone needs to manually start a job, then I would still say use a RunScript.

But instead of my previous recommendation of a scheduled "Admin" job to update
the database of available volumes (using my script and Josh's), and a bconsole
update slots command, I would just include those two in the actual backup job
as a RunScript.

So, then it goes like this:   LEDs stop flashing, swap drive(s) out, in
bconsole type:   run Job=Our_30TB_Job  <enter>

Walk away as db gets updated with currently available volumes, update slots is
run, and then the job starts backing up data.

Leon, in my opinion even that is too simplistic. I mean that just because
there are no LEDs flashing, it does not necessarily mean a job is not
currently running. Bacula could be moving volumes in/out of a drive, waiting
on some type of operator intervention etc.

So, I'd say that no matter what, someone should be running a bare minimum of a
few commands at the console before swapping drives out:

* messages (to clear the console message log and quickly see if there are any
"obvious" issues)

* status dir  (to see if all jobs have completed and no jobs are
queued/waiting etc, see if drive is blocked etc)

* cancel (if necessary clean up and cancel any jobs that have an issue)

Otherwise, if a job is just requiring another magazine change, swap out the
magazine, then run the Admin job to enable/disable volumes, and update
slots. Bacula should continue the job after that.

If all jobs are done, swap out magazines and:
* run Job=Our_30TB_Job  (to start the next job)
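
The console routine above could be scripted roughly as follows. This is a minimal sketch, assuming bconsole is on the PATH and accepts commands on standard input; the job name is the example one used above, and the helper names (bconsole_input, run_bconsole, start_job_command) are hypothetical. The cancel step is left out on purpose, since it needs a human to decide which job to cancel.

```python
import subprocess

# Checks to review before touching the drives (taken from the list above).
PRE_SWAP_CHECKS = ["messages", "status dir"]


def bconsole_input(commands):
    """Build the text piped to bconsole: one command per line, then quit."""
    return "\n".join(list(commands) + ["quit"]) + "\n"


def run_bconsole(commands, bconsole="bconsole"):
    """Send the commands to bconsole on stdin and return what it printed."""
    result = subprocess.run(
        [bconsole],
        input=bconsole_input(commands),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def start_job_command(job_name):
    """Console command that starts a job; 'yes' skips the confirmation prompt."""
    return "run job={} yes".format(job_name)


if __name__ == "__main__":
    # Dry run: show what would be sent. Use run_bconsole(...) to execute it.
    print(bconsole_input(PRE_SWAP_CHECKS))
    print(bconsole_input([start_job_command("Our_30TB_Job")]))
```

The output of run_bconsole(PRE_SWAP_CHECKS) still has to be read by a person before any drive is pulled; the sketch only saves the typing.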

Of course, if you are emailing messages to an admin-type person, then most
issues would be pre-known, and possibly solved remotely before the worker is
on-site to swap magazines but... you know...

> Sorry this is a little long! Takes me a while to think through it all. If
> either of you wants to pick this up as a supported job, we do have some
> limited funding available :)

I might be interested, you may contact me off-list if we can't get you up and
running via the list. But I would be happy to help in here if we can get past
what I think is just a little confusion - maybe a little on your part and a
little on mine.  :)

--
Bill Arlofski
Reverse Polarity, LLC



--
陆智诚 | Leon White 
绿色和平 | Greenpeace East Asia
+86 186 0692 9781 | Skype: strophy
行动,带来改变。 Positive change through action.