Bacula-users

Re: [Bacula-users] Questions regarding migration job failure

2011-05-12 12:58:22
Subject: Re: [Bacula-users] Questions regarding migration job failure
From: Jerry Lowry <jlowry AT edt DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Thu, 12 May 2011 09:58:14 -0700
Thanks for the help.  Looks like I have some digging to do to figure out what is actually happening.  I know that at one time I had some problems with the RAID controller.  I have since gotten that resolved.

If the volume has been recycled, will the corruption remain with the volume, or will it go by the wayside once the volume recycles?  Just curious as to whether I should drop the corrupt volumes (files) and create new ones.

On 5/12/2011 12:31 AM, Graham Keeling wrote:
On Wed, May 11, 2011 at 02:06:44PM -0700, Jerry Lowry wrote:
Another mistake on my part.  You have to give bls the correct spelling
of the volume (sometimes I wonder).

Once I corrected the volume name, these are the results I get:

Volume Record: File:blk=0:206 Sessid=16 SessTime=1303843290 Jobid=3 DataLen=171
11-May 13:42 bls JobId 0: Error: block.c:318 Volume data error at 0:206!
Block checksum mismatch in block=6010112 len=64512: calc=c6a6912d blk=50a7d773
Well, that's the problem right there.
Your migration cannot work when the volumes being read are corrupted.
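
For readers unfamiliar with the message: calc= is the checksum bls computed from the block it just read, and blk= is the checksum stored in the block when it was written. A minimal Python sketch of that kind of comparison follows. It uses zlib.crc32 purely for illustration; the function name and the sample payload are hypothetical, and this is not a description of Bacula's on-disk block format.

```python
import zlib


def block_is_intact(payload: bytes, stored_checksum: int) -> bool:
    """Recompute a CRC-32 over the payload and compare it with the stored value."""
    calc = zlib.crc32(payload) & 0xFFFFFFFF  # plays the role of "calc="
    return calc == stored_checksum           # stored value plays the role of "blk="


# Hypothetical block contents, checksummed at write time.
payload = b"example block payload"
stored = zlib.crc32(payload) & 0xFFFFFFFF

# Simulate on-disk damage by flipping one bit in the first byte.
damaged = bytes([payload[0] ^ 0x01]) + payload[1:]

print(block_is_intact(payload, stored))   # undamaged data matches
print(block_is_intact(damaged, stored))   # damaged data does not
```

A mismatch only says that the bytes read back differ from the bytes written; it does not say whether the disk, the controller, the file system, or the software changed them.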

As to how your volumes got corrupted, that's a much harder question.

If it were me, I would start everything from scratch, and after every backup
run your 'bls' command on any volume that changed. This will let you catch
the problem just after it happened, and you might be able to spot anything
strange that happened before that.

(assuming that it is a Bacula bug, rather than you having a disk or a file
system problem)
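
The per-volume check suggested above could be scripted roughly as follows. This is only a sketch: it assumes Python 3.7 or later on the storage daemon host, reuses the -j, -V and -c options from the bls command posted in this thread, and matches the error text exactly as it appears in that output. The function names and the default paths are assumptions.

```python
import re
import subprocess

# Matches the error text shown in the bls output quoted in this thread.
# \s+ also tolerates the line wrap that mail clients insert before "blk=".
MISMATCH_RE = re.compile(
    r"Block checksum mismatch in block=(\d+) len=(\d+): "
    r"calc=([0-9a-f]+)\s+blk=([0-9a-f]+)"
)


def find_checksum_errors(bls_output):
    """Return one dict per 'Block checksum mismatch' message in bls output."""
    return [
        {
            "block": int(m.group(1)),
            "len": int(m.group(2)),
            "calc": m.group(3),
            "blk": m.group(4),
        }
        for m in MISMATCH_RE.finditer(bls_output)
    ]


def bls_command(volume, device,
                config="/etc/bacula/bacula-sd.conf", bls="./bls"):
    """Build the bls argument list (same options as in the thread, no debug)."""
    return [bls, "-j", "-V", volume, "-c", config, device]


def check_volume(volume, device):
    """Run bls on one volume and return any checksum errors it reports."""
    result = subprocess.run(
        bls_command(volume, device), capture_output=True, text=True
    )
    # Scan both streams, since the error may go to either one.
    return find_checksum_errors(result.stdout + result.stderr)
```

Calling check_volume("home-0006", "/Home") after a backup would return an empty list for a volume that reads cleanly and one entry per reported mismatch otherwise, so a wrapper can raise an alert as soon as the list is non-empty.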

I ran this again with debug at level 200. I have attached the file with  
the output.

thanks for all your help!

On 5/11/2011 12:11 PM, Jerry Lowry wrote:
Hi,

No, the migration job is occurring on the same storage daemon.  This
storage daemon has 6 RAID devices set up as JBOD: 3 are for daily use
and 3 are set up as hotswap devices for off-site backups.  The problem
is that when I run bls on the storage daemon where the disks are
located, I get a message asking me to mount the disk, even though the
director reports it as mounted and the OS has it mounted as well.



On 5/11/2011 11:26 AM, Phil Stracchino wrote:
On 05/11/11 13:48, Jerry Lowry wrote:
Ok, I am trying to run bls on the specified volume file that is
associated with this job. But the problem I am having is that bls is
failing trying to stat the device.

I have one director and two storage daemons.  The volume I am trying
to run against is on the second SD.  Do I run bls on the system where
the 'director' is or on the system that's running the standalone 'sd'
where the volume is located?
Jerry,
If I'm understanding you correctly, you have two storage daemons, and
you're trying to do a migration from a device on one SD to a device on
the other.  Is this correct?

If this understanding is correct, sorry, it won't work.  Copy and
migration can currently only be done between devices controlled by the
same SD.  (This is in large part a result of there being no current
capability for direct communication between one storage daemon and another.)


-- 

---------------------------------------------------------------------------
Jerold Lowry
IT Manager / Software Engineer
Engineering Design Team (EDT), Inc. a HEICO company
1400 NW Compton Drive, Suite 315
Beaverton, Oregon 97006 (U.S.A.)
Phone: 503-690-1234 / 800-435-4320
Fax: 503-690-1243
Web: www.edt.com <http://www.edt.com/>



------------------------------------------------------------------------------
Achieve unprecedented app performance and reliability
What every C/C++ and Fortran developer should know.
Learn how Intel has extended the reach of its next-generation tools
to help boost performance applications - including clusters.
http://p.sf.net/sfu/intel-dev2devmay


_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users

[jlowry@distress-sd bin]$ ./bls -d 200 -j -v -v -V home-0006 -c /etc/bacula/bacula-sd.conf /Home
bls: stored_conf.c:698-0 Inserting director res: distress-mon
bls: stored_conf.c:698-0 Inserting device res: DBB
bls: stored_conf.c:698-0 Inserting device res: Hardware
bls: stored_conf.c:698-0 Inserting device res: Swift
bls: stored_conf.c:698-0 Inserting device res: Home
bls: stored_conf.c:698-0 Inserting device res: Workstations
bls: stored_conf.c:698-0 Inserting device res: TopSwap
bls: stored_conf.c:698-0 Inserting device res: MidSwap
bls: stored_conf.c:698-0 Inserting device res: BottomSwap
bls: stored_conf.c:698-0 Inserting device res: FileStorage
bls: stored_conf.c:698-0 Inserting device res: FileStorage1
bls: stored_conf.c:698-0 Inserting device res: Drive-1
bls: match.c:250-0 add_fname_to_include prefix=0 gzip=0 fname=/
bls: butil.c:281 Using device: "/Home" for reading.
bls: dev.c:284-0 init_dev: tape=0 dev_name=/Home
bls: vol_mgr.c:162-0 add read_vol=home-0006 JobId=0
bls: butil.c:186-0 Acquire device for read
bls: acquire.c:95-0 Want Vol=home-0006 Slot=0
bls: acquire.c:109-0 MediaType dcr= dev=File
bls: acquire.c:189-0 dir_get_volume_info vol=home-0006
bls: bls.c:486-0 Fake dir_get_volume_info
bls: mount.c:546-0 Must load "Home" (/Home)
bls: autochanger.c:120-0 Device "Home" (/Home) is not an autochanger
bls: acquire.c:220-0 bstored: open vol=home-0006
bls: dev.c:360-0 open dev: type=1 dev_name="Home" (/Home) vol=home-0006 mode=OPEN_READ_ONLY
bls: dev.c:369-0 call open_file_device mode=OPEN_READ_ONLY
bls: dev.c:2089-0 Enter mount
bls: dev.c:542-0 open disk: mode=OPEN_READ_ONLY open(/Home/home-0006, 0x0, 0640)
bls: dev.c:557-0 open dev: disk fd=3 opened, part=0/0, part_size=0
bls: dev.c:373-0 preserve=0x0 fd=3
bls: acquire.c:228-0 opened dev "Home" (/Home) OK
bls: acquire.c:231-0 calling read-vol-label
bls: label.c:81-0 Enter read_volume_label res=0 device="Home" (/Home) vol=home-0006 dev_Vol=*NULL*
bls: label.c:130-0 Big if statement in read_volume_label
bls: label.c:820-0 unser_vol_label

Volume Label:
Id                : Bacula 1.0 immortal
VerNo             : 11
VolName           : home-0006
PrevVolName       : 
VolFile           : 0
LabelType         : VOL_LABEL
LabelSize         : 171
PoolName          : HomePool
MediaType         : File
PoolType          : Backup
HostName          : distress-sd
Date label written: 01-May-2011 14:50
bls: label.c:202-0 Compare Vol names: VolName=home-0006 hdr=home-0006

Volume Label:
Id                : Bacula 1.0 immortal
VerNo             : 11
VolName           : home-0006
PrevVolName       : 
VolFile           : 0
LabelType         : VOL_LABEL
LabelSize         : 171
PoolName          : HomePool
MediaType         : File
PoolType          : Backup
HostName          : distress-sd
Date label written: 01-May-2011 14:50
bls: label.c:223-0 Leave read_volume_label() VOL_OK
bls: label.c:236-0 Call reserve_volume=home-0006
bls: vol_mgr.c:352-0 enter reserve_volume=home-0006 drive="Home" (/Home)
bls: vol_mgr.c:268-0 new Vol=home-0006 at ae0bc8 dev="Home" (/Home)
bls: vol_mgr.c:470-0 === set in_use. vol=home-0006 dev="Home" (/Home)
bls: vol_mgr.c:211-0 List end new volume: home-0006 in_use=1 on device "Home" (/Home)
bls: acquire.c:235-0 Got correct volume.
11-May 13:54 bls JobId 0: Ready to read from volume "home-0006" on device "Home" (/Home).
bls: label.c:820-0 unser_vol_label

Volume Label:
Id                : Bacula 1.0 immortal
VerNo             : 11
VolName           : home-0006
PrevVolName       : 
VolFile           : 0
LabelType         : VOL_LABEL
LabelSize         : 171
PoolName          : HomePool
MediaType         : File
PoolType          : Backup
HostName          : distress-sd
Date label written: 01-May-2011 14:50

Volume Label:
Id                : Bacula 1.0 immortal
VerNo             : 11
VolName           : home-0006
PrevVolName       : 
VolFile           : 0
LabelType         : VOL_LABEL
LabelSize         : 171
PoolName          : HomePool
MediaType         : File
PoolType          : Backup
HostName          : distress-sd
Date label written: 01-May-2011 14:50
11-May 13:54 bls JobId 0: Error: block.c:318 Volume data error at 0:206!
Block checksum mismatch in block=6010112 len=64512: calc=c6a6912d blk=50a7d773
bls: butil.c:298-0 Device status: 84
bls: acquire.c:457-0 release_device device "Home" (/Home) is disk
bls: acquire.c:466-0 dir_update_vol_info. label=64 Vol=home-0006
bls: vol_mgr.c:179-0 remove_read_vol=home-0006 JobId=0 found=1
bls: vol_mgr.c:211-0 List remove_read_volume: home-0006 in_use=1 on device "Home" (/Home)
bls: vol_mgr.c:594-0 === set not reserved vol=home-0006 num_writers=0 dev_reserved=0 dev="Home" (/Home)
bls: vol_mgr.c:595-0 === clear in_use vol=home-0006
bls: vol_mgr.c:623-0 === clear in_use vol=home-0006
bls: vol_mgr.c:626-0 === remove volume home-0006 dev="Home" (/Home)
bls: acquire.c:514-0 0 writers, 0 reserve, dev="Home" (/Home)
bls: dev.c:1924-0 close_dev "Home" (/Home)
bls: dev.c:2123-0 Enter unmount
bls: dev.c:1913-0 Clear volhdr vol=home-0006
bls: vol_mgr.c:616-0 No vol on dev "Home" (/Home)
bls: acquire.c:551-0 JobId=0 broadcast wait_device_release at 11-May-2011 13:54:55
bls: acquire.c:561-0 ===== Device "Home" (/Home) released by JobId=0
bls: mem_pool.c:370-0 garbage collect memory pool
bls: dev.c:1924-0 close_dev "Home" (/Home)
bls: dev.c:1931-0 device "Home" (/Home) already closed vol=
