Bacula-users

Re: [Bacula-users] Problems with disk-based backup

2011-04-28 01:13:15
Subject: Re: [Bacula-users] Problems with disk-based backup
From: Philip Yarra <pyarra AT radoncvic.com DOT au>
To: bacula-users AT lists.sourceforge DOT net
Date: Thu, 28 Apr 2011 15:10:04 +1000
An update: after leaving it to run last night, it appears that incremental jobs are trying to use volume Incr-0017, which is already used (and not appendable). So why is bacula trying to re-use volumes it is not allowed to, and how can I stop it from doing so?

I've pasted the debug output from the SD on rm-nas-1 below, you'll see it tries to use Incr-0017 for rg-server, fails, then tries to use it for fr-server. The Incr-0017 volume has already been used for wd-server - I can see it on disk.

Here's the Storage Daemon debug output:

28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: vol_mgr.c:352-295 enter reserve_volume=Incr-0017 drive="rm-nas-1-rg-server" (/c/bacula/rg-server)
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: vol_mgr.c:268-295 new Vol=Incr-0017 at 87048 dev="rm-nas-1-rg-server" (/c/bacula/rg-server)
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: vol_mgr.c:470-295 === set in_use. vol=Incr-0017 dev="rm-nas-1-rg-server" (/c/bacula/rg-server)
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: vol_mgr.c:211-295 List end new volume: Incr-0017 in_use=1 on device "rm-nas-1-rg-server" (/c/bacula/rg-server)
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: askdir.c:293-295 dir_find_next_appendable_volume return true. vol=Incr-0017
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: mount.c:126-295 After find_next_append. Vol=Incr-0017 Slot=0 Parts=0
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: autochanger.c:120-295 Device "rm-nas-1-rg-server" (/c/bacula/rg-server) is not an autochanger
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: mount.c:150-295 autoload_dev returns 0
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: mount.c:181-295 want vol=Incr-0017 devvol= dev="rm-nas-1-rg-server" (/c/bacula/rg-server)
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: dev.c:360-295 open dev: type=1 dev_name="rm-nas-1-rg-server" (/c/bacula/rg-server) vol=Incr-0017 mode=OPEN_READ_WRITE
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: dev.c:369-295 call open_file_device mode=OPEN_READ_WRITE
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: dev.c:2089-295 Enter mount
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: dev.c:542-295 open disk: mode=OPEN_READ_WRITE open(/c/bacula/rg-server/Incr-0017, 0x2, 0640)
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: dev.c:549-295 open failed: dev.c:548 Could not open: /c/bacula/rg-server/Incr-0017, ERR=No such file or directory
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: dev.c:557-295 open dev: disk fd=-1 opened, part=0/0, part_size=0
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: dev.c:373-295 preserve=0x0 fd=-1
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: dev.c:360-295 open dev: type=1 dev_name="rm-nas-1-rg-server" (/c/bacula/rg-server) vol=Incr-0017 mode=OPEN_READ_WRITE
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: dev.c:369-295 call open_file_device mode=OPEN_READ_WRITE
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: dev.c:2089-295 Enter mount
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: dev.c:542-295 open disk: mode=OPEN_READ_WRITE open(/c/bacula/rg-server/Incr-0017, 0x2, 0640)
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: dev.c:549-295 open failed: dev.c:548 Could not open: /c/bacula/rg-server/Incr-0017, ERR=No such file or directory
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: dev.c:557-295 open dev: disk fd=-1 opened, part=0/0, part_size=0
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: dev.c:373-295 preserve=0x0 fd=-1
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: mount.c:199-295 open_device failed: ERR=dev.c:548 Could not open: /c/bacula/rg-server/Incr-0017, ERR=No such file or directory

28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: mount.c:222-295 set_unload
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: mount.c:225-295 goto mount_next_vol
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: mount.c:89-295 mount_next_vol retry=71
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: vol_mgr.c:594-295 === set not reserved vol=Incr-0017 num_writers=0 dev_reserved=1 dev="rm-nas-1-rg-server" (/c/bacula/rg-server)
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: vol_mgr.c:595-295 === clear in_use vol=Incr-0017
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: vol_mgr.c:623-295 === clear in_use vol=Incr-0017
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: vol_mgr.c:626-295 === remove volume Incr-0017 dev="rm-nas-1-rg-server" (/c/bacula/rg-server)
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: askdir.c:640-297 Back from wait_for_sysop stat=3
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: mount.c:101-297 Continue after dir_ask_sysop_to_mount. must_load=0
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: mount.c:346-297 Before dir_find_next_appendable_volume.
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: askdir.c:260-297 dir_find_next_appendable_volume: reserved=1 Vol=Incr-0017
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: askdir.c:277-297 >dird CatReq Job=fr-server-user-shares.2011-04-27_22.00.01_29 FindMedia=1 pool_name=Incremental media_type=File1
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: askdir.c:182-297 <dird 1000 OK VolName=Incr-0017 VolJobs=0 VolFiles=0 VolBlocks=0 VolBytes=1 VolMounts=1 VolErrors=0 VolWrites=2 MaxVolBytes=0 V
olCapacityBytes=0 VolStatus=Recycle Slot=0 MaxVolJobs=0 MaxVolFiles=0 InChanger=0 VolReadTime=0 VolWriteTime=1483 EndFile=0 EndBlock=22917 VolParts=0 LabelType=0 MediaId=17
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: askdir.c:206-297 do_reqest_vol_info return true slot=0 Volume=Incr-0017
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: vol_mgr.c:545-297 find_read_vol: read_vol_list empty.
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: vol_mgr.c:725-297 Vol=Incr-0017 not in use.
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: askdir.c:286-297 Call reserve_volume. Vol=Incr-0017
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: vol_mgr.c:352-297 enter reserve_volume=Incr-0017 drive="rm-nas-1-fr-server" (/c/bacula/fr-server)
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: vol_mgr.c:268-297 new Vol=Incr-0017 at 87048 dev="rm-nas-1-fr-server" (/c/bacula/fr-server)
28-Apr-2011 03:34:18 rm-nas-1.radoncvic.com.au: vol_mgr.c:470-297 === set in_use. vol=Incr-0017 dev="rm-nas-1-fr-server" (/c/bacula/fr-server)

Any hints?

Regards, Philip.



On 27/04/11 17:05, Philip Yarra wrote:
Hi, I'm having some troubles with bacula failing to perform automatic backups. The issue seems to be that the storage daemon is not automatically labelling a new volume. The setup looks like this:

Director lives on an Intel server running Ubuntu 10.04.2 LTS, the installed version of bacula is from the packages for this release: 5.0.1

Storage daemons live on a pair of Netgear ReadyNAS NV+ (sparc-based NAS boxes) using 5.0.1 compiled natively on a ReadyNAS box. The NV+ is running Linux 2.6.17.14ReadyNAS and I believe this is a re-packaged Debian Sarge

We have File daemons (version 5.0.3) deployed on a range of Windows 2003, 2000 and Ubuntu server boxes.

We do a full backup each Saturday morning, then each weeknight we write an incremental. The director config has has custom schedules so that alternate weeks are written to different storage daemons.

The problems seemed to start occurring when the volume names rolled over from Incr-0099 to Incr-0100 (that is the first instance I can find of this issue, anyway). I don't know if this is coincidence or not. When the problem occurred, we got this in the director's logs:

12-Apr 22:00 rm-bac-1-dir JobId 173: Start Backup JobId 173, Job=rov-impac-1-tshome.2011-04-12_22.00.00_53
12-Apr 22:00 rm-bac-1-dir JobId 173: Using Device "rm-nas-1-rov-impac-1"
12-Apr 21:53 rov-impac-1-fd JobId 173: DIR and FD clocks differ by -381 seconds, FD automatically compensating.
12-Apr 21:51 rm-nas-1.radoncvic.com.au JobId 173: Job rov-impac-1-tshome.2011-04-12_22.00.00_53 is waiting. Cannot find any appendable volumes.
Please use the "label" command to create a new Volume for:
    Storage:      "rm-nas-1-rov-impac-1" (/c/bacula/rov-impac-1)
    Pool:         Incremental
    Media type:   File1
12-Apr 22:05 rm-bac-1-dir JobId 173: Created new Volume "Incr-0101" in catalogue.
12-Apr 21:56 rm-nas-1.radoncvic.com.au JobId 173: Warning: mount.c:221 Open device "rm-nas-1-rov-impac-1" (/c/bacula/rov-impac-1) Volume "Incr-0101" failed: ERR=dev.c:548 Could not open: /c/bacula/rov-impac-1/Incr-0101, ERR=No such file or directory

There don't seem to be any permission or related issues preventing the volume being created. After re-starting the SD it is able to create new volumes (for a while, anyhow).

I can either go and manually issue a label command in bconsole, or I can go restart the SD and DIR and it'll all work again for a few days. However neither of those is really a proper solution.

If anyone can help, I would very much appreciate it.

I have attached config files for the director, and one of the storage daemons. Please let me know if there is more detail I can provide.

I will note that I have allowed up to 10 simultaneous jobs in the director - I recall reading that this was not recommended, however I was unsure if that recommendation was current, and frankly, the ability to run multiple jobs concurrently is quite desirable to use our network efficiently. If anyone thinks this might be the issue, I can try running with concurrency set to 1.

If you're interested, there's a lot more detail about the project here:
http://pyarra.blogspot.com/2011/03/disk-based-backup-solution-bacula.html
http://pyarra.blogspot.com/2011/03/bacula-mis-matched-director-and-storage.html
http://pyarra.blogspot.com/2011/03/problems-cross-compiling-bacula-501-for.html
http://pyarra.blogspot.com/2011/03/cross-compiling-i-admit-defeat.html
http://pyarra.blogspot.com/2011/03/compiling-bacula-natively-on-readynas.html

Any other comments on what I have set up would be appreciated - I think I've set up something reasonable, but independent sanity checks would be great.

My suspicion is that the issue is on the storage daemon. I've now set them to create debug level output (using-v -dt -d 200) which I have redirected to a file, so I hope to have more info in future.

Regards, Philip.


------------------------------------------------------------------------------ WhatsUp Gold - Download Free Network Management Software The most intuitive, comprehensive, and cost-effective network management toolset available today. Delivers lowest initial acquisition cost and overall TCO of any competing solution. http://p.sf.net/sfu/whatsupgold-sd
_______________________________________________ Bacula-users mailing list Bacula-users AT lists.sourceforge DOT net https://lists.sourceforge.net/lists/listinfo/bacula-users


-- 
Philip Yarra
System Administrator
Radiation Oncology Victoria
phone: 0447 502 176
email: pyarra AT radoncvic.com DOT au
------------------------------------------------------------------------------
WhatsUp Gold - Download Free Network Management Software
The most intuitive, comprehensive, and cost-effective network 
management toolset available today.  Delivers lowest initial 
acquisition cost and overall TCO of any competing solution.
http://p.sf.net/sfu/whatsupgold-sd
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users
<Prev in Thread] Current Thread [Next in Thread>