Re: [Bacula-users] [Fwd: Almost empty full tapes]

2008-04-23 16:27:31
Subject: Re: [Bacula-users] [Fwd: Almost empty full tapes]
From: Arno Lehmann <al AT its-lehmann DOT de>
To: bacula-users AT lists.sourceforge DOT net
Date: Wed, 23 Apr 2008 22:26:55 +0200
Hi,

23.04.2008 18:31, Quinton Jansen wrote:
> On my PX506 loaded with LTO3 SCSI tape drives and bacula 2.2.9-b6, I've
> been experiencing a number of occasions where the tapes are not actually
> filled.
> 
> No error messages in dmesg.

That's a good start... though it makes things more difficult :-)

> btape won't write to the tape anymore..  read-only

Ok... that would be because the tapes are labeled, I hope.

If that's not the case, it would help to know the exact status of 
btape and the tape drive at that time.

> mt will write just fine

Well, mt doesn't know about bacula labels and so on, so that's not 
really telling us much.

> Updating volstatus to append and remounting did not work.

In which way did it not work? I suppose it didn't work in the way that 
Bacula immediately assumed the tape to be full, right?

> Does anybody have a hint or solution to solve the problem?

Yup... first make sure that this behaviour also happens when you run a 
non-beta version or the latest CVS code. If it doesn't happen with 
the released version, report your observations to -devel. It might 
also help to run only one job on one drive at a time, i.e. disable job 
concurrency. If the error vanishes then, run one job per drive, but 
use all drives. Also, run several jobs to one drive and use only a 
single drive. After that, it might already be easy to see whether the 
error is related to the multi-drive multi-job logic, which seems to 
have been under heavy development recently...

Then, if you can observe the problem with the released version, verify 
what *really* is on the tapes, and what happens when Bacula decides 
they are full.

The latter is rather simple - start with the job report, which should 
state why a tape is changed. Probably Bacula will tell you the end of 
the medium was encountered. Then run the SD with debug output enabled 
and see if that reports anything funny. Note that, at a usable level, 
that will produce *lots* of text, so it might be better to run only 
one job at a time for this test...
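
Since the debug output gets large, a small filter can help to find the 
interesting spots. The following is only a rough sketch: the keywords 
are my guesses and not actual Bacula message strings, so adjust them 
after a look at what your SD really prints.

```python
# Sketch: pick lines out of a large SD debug log that hint at
# end-of-tape handling. The keywords are assumptions, not real
# Bacula message texts.
import re

KEYWORDS = ("end of medium", "end of tape", "eot", "eom")  # assumed


def interesting_lines(lines, keywords=KEYWORDS):
    """Return (line_number, text) for every line containing a keyword."""
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b",
        re.IGNORECASE,
    )
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if pattern.search(line)
    ]


if __name__ == "__main__":
    import sys

    # Usage: python filter_sd_log.py <saved debug output>
    with open(sys.argv[1], errors="replace") as log:
        for number, text in interesting_lines(log):
            print(f"{number}: {text}")
```

The line numbers make it easy to jump back into the full log and read 
what happened right before the tape was marked full.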

If there is nothing wrong from the device-end, i.e. you don't see any 
end-of-tape thingies where there shouldn't be any, it's time to dig 
into the source code depths... which is better done with a bug report 
at bugs.bacula.org.

If you find that the tape drive reports tapes as used, it's time to 
look at why that happens. First, make sure that no other software 
accesses the autochanger or tapes while Bacula is using them. For 
example, if Bacula has a tape open but doesn't write to it, it will 
silently assume no other process or machine uses it. If, in such a 
case, another machine writes data to the tape, Bacula won't know about 
that and the volumes will be a) mostly unusable by Bacula, and 
b) reported to hold much less data than really is stored on them.

You can check the tape's contents from Bacula's point of view using 
bls. If bls reports any funny blocks of data or files on a volume, 
it's time to look for their source.
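
bls has to be run per volume, so it helps to narrow down the 
candidates first. As a rough sketch, the catalog listing you quoted 
below can be filtered for volumes that are marked Full but hold only a 
small part of a tape. Assumptions on my side: the rows are 
pipe-separated as in your listing, 400 GB is the LTO-3 native 
capacity, and the 10% threshold is arbitrary.

```python
# Sketch: flag volumes marked Full that hold far less than one tape.
# Assumptions: pipe-separated rows as in the quoted listing,
# 400 GB LTO-3 native capacity, arbitrary 10% threshold.

LTO3_NATIVE_BYTES = 400 * 10**9


def suspicious_volumes(listing, capacity=LTO3_NATIVE_BYTES, fraction=0.10):
    """Return (volumename, volbytes) for Full volumes below the threshold."""
    flagged = []
    for line in listing.splitlines():
        # Drop mail quote markers, then split into columns.
        fields = [f.strip() for f in line.lstrip("> ").split("|")]
        # Expected columns: volumename, volstatus, slot, inchanger,
        # lastwritten, volbytes, volfiles, name. Skip everything else.
        if len(fields) < 8 or not fields[5].isdigit():
            continue
        name, status, volbytes = fields[0], fields[1], int(fields[5])
        if status == "Full" and volbytes < capacity * fraction:
            flagged.append((name, volbytes))
    return flagged


if __name__ == "__main__":
    sample = """\
> BJB767L3 | Full | 21 | 1 | 2008-04-19 04:14:35 | 942971904 | 1 | PX506-Full
> BJB768L3 | Full | 20 | 1 | 2008-04-19 11:43:48 | 129189150720 | 130 | PX506-Full
"""
    for name, volbytes in suspicious_volumes(sample):
        print(f"{name}: only {volbytes / 10**9:.2f} GB written")
```

Whatever this flags is only a list of candidates to look at with bls; 
it says nothing about why the volume was marked Full.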

Hope this gives you some ideas how to investigate,

Arno


> Quinton
> 
> volumename | volstatus | slot | inchanger | lastwritten         | volbytes     | volfiles | name
> -----------+-----------+------+-----------+---------------------+--------------+----------+-----------
> BJB767L3   | Full      |   21 |         1 | 2008-04-19 04:14:35 |    942971904 |        1 | PX506-Full
> BJB765L3   | Full      |   23 |         1 | 2008-04-19 04:16:39 |    270434304 |        1 | PX506-Full
> BJB764L3   | Full      |   24 |         1 | 2008-04-19 04:18:38 |    509515776 |        1 | PX506-Full
> BJB763L3   | Full      |   25 |         1 | 2008-04-19 04:21:14 |    844655616 |        1 | PX506-Full
> BJB762L3   | Full      |   26 |         1 | 2008-04-19 04:23:00 |     73866240 |        1 | PX506-Full
> BJB761L3   | Full      |   27 |         1 | 2008-04-19 04:25:06 |    445777920 |        1 | PX506-Full
> BJB760L3   | Full      |   28 |         1 | 2008-04-19 04:27:04 |     24321024 |        1 | PX506-Full
> BJB799L3   | Full      |   29 |         1 | 2008-04-19 04:29:04 |    302819328 |        1 | PX506-Full
> BJB798L3   | Full      |   30 |         1 | 2008-04-19 04:31:09 |    464615424 |        1 | PX506-Full
> AAA043L3   | Full      |   57 |         1 | 2008-04-19 04:33:41 |    765434880 |        1 | PX506-Full
> AAA023L3   | Full      |   81 |         1 | 2008-04-19 04:35:38 |    232630272 |        1 | PX506-Full
> LLE733L3   | Full      |   51 |         1 | 2008-04-19 04:37:46 |    318044160 |        1 | PX506-Full
> BJB768L3   | Full      |   20 |         1 | 2008-04-19 11:43:48 | 129189150720 |      130 | PX506-Full
> BJB795L3   | Full      |   14 |         1 | 2008-04-19 11:48:26 | 170491216896 |      174 | PX506-Full
> BJB769L3   | Full      |   19 |         1 | 2008-04-19 20:02:39 | 579621288960 |      580 | PX506-Full
> BJB766L3   | Full      |   22 |         1 | 2008-04-20 00:07:00 | 593693614080 |      594 | PX506-Full
> 
> snippet from bacula-sd:
> PX506-A is used for full backups
> PX506-B is used for daily diff backups
> 
> Autochanger {
>  Name = PX506-A
>  Device = "PX506-0"
>  Device = "PX506-1"
>  Device = "PX506-2"
>  Device = "PX506-3"
>  Device = "PX506-4"
>  Changer Command = "/etc/bacula/mtx-changer-px506 %c %o %S %a %d"
>  Changer Device = /dev/sg0
> }
> 
> Autochanger {
>  Name = PX506-B
>  Device = "PX506-5"
>  #Device = "PX506-4"
>  Changer Command = "/etc/bacula/mtx-changer-px506 %c %o %S %a %d"
>  Changer Device = /dev/sg0
> }
> 
> Device {
>  Name = "PX506-0"
>  Device Type = Tape
>  Drive Index = 0       # SCSI ID 1
>  Media Type = LTO-3
>  Archive Device = /dev/nst0
>  AutomaticMount = yes;               # when device opened, read it
>  AlwaysOpen = yes;
>  RemovableMedia = yes;
>  RandomAccess = no;
>  AutoChanger = yes
> #  # Enable the Alert command only if you have the mtx package loaded
>  Alert Command = "sh -c 'tapeinfo -f %c |grep TapeAlert|cat'"
>  Spool Directory = /var/bacula/spool/px506-0
>  Maximum Network Buffer Size = 65536
>  Maximum Spool Size = 1024G
>  Maximum Job Spool Size = 30G
>  Maximum Changer Wait = 600
>  Maximum Open Wait = 600
>  LabelMedia = yes
> }
> 
> 
> PX506-[1-5] configured the same (different device and spool location).
> 
> snippet from bacula-dir.conf:
> Storage {
>  Name = PX506-Econet-Full
>  Address = bottom.pyr.ec.gc.ca
>  SDPort = 9103
>  Password = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
>  Device = PX506-A
>  Media Type = LTO-3
>  Maximum Concurrent Jobs = 25
>  Autochanger = yes
> }
> 
> Storage {
>  Name = PX506-Econet-Diff
>  Address = bottom.pyr.ec.gc.ca
>  SDPort = 9103
>  Password = "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"
>  Device = PX506-B
>  Media Type = LTO-3
>  Maximum Concurrent Jobs = 10
>  Autochanger = yes
> }
> 
> 
> 
> 
> -------------------------------------------------------------------------
> This SF.net email is sponsored by the 2008 JavaOne(SM) Conference 
> Don't miss this year's exciting event. There's still time to save $100. 
> Use priority code J8TL2D2. 
> http://ad.doubleclick.net/clk;198757673;13503038;p?http://java.sun.com/javaone
> _______________________________________________
> Bacula-users mailing list
> Bacula-users AT lists.sourceforge DOT net
> https://lists.sourceforge.net/lists/listinfo/bacula-users
> 

-- 
Arno Lehmann
IT-Service Lehmann
www.its-lehmann.de
