[Bacula-users] Need help debugging SD crash

2010-04-04 15:53:48
Subject: [Bacula-users] Need help debugging SD crash
From: Robert LeBlanc <robert AT leblancnet DOT us>
To: "bacula-users (anglais)" <bacula-users AT lists.sourceforge DOT net>
Date: Sun, 4 Apr 2010 13:20:49 -0600
I'm having problems with our SD, and tapes are occasionally left locked in the drive. At first I thought this might be a problem with our tape library, but then I saw the errors below in syslog. I swapped out the QLogic FC adapter, thinking that it might be losing all paths to the drive, but I'm still getting the errors, so I'm not sure where the hang-up is. I can't tell whether it's a bug in the kernel module, mt, or Bacula. Can someone give me some pointers on narrowing this down? This has been happening for over a year, across several kernel and Bacula versions.

This is Debian Squeeze, with the following kernel, Bacula, and mt versions:

Linux lsddomainsd 2.6.32-trunk-686 #1 SMP Sun Jan 10 06:32:16 UTC 2010 i686 GNU/Linux

bacula-sd: invalid option -- 'V'
Copyright (C) 2000-2010 Free Software Foundation Europe e.V.

Version: 5.0.1 (24 February 2010)

mt-st v. 1.1


Apr  4 07:08:23 lsddomainsd kernel: [137640.964059] INFO: task bacula-sd:12439 blocked for more than 120 seconds.
Apr  4 07:08:23 lsddomainsd kernel: [137640.980153] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr  4 07:08:23 lsddomainsd kernel: [137640.996879] bacula-sd     D f7e47b88     0 12439      1 0x00000000
Apr  4 07:08:23 lsddomainsd kernel: [137640.996889]  f551aec0 00200086 00000020 f7e47b88 00200246 c13f4000 c13f4000 c13ef604
Apr  4 07:08:23 lsddomainsd kernel: [137640.996901]  f551b07c c3808000 00000000 f7fc856f f570c060 f8024000 f6d72820 00000000
Apr  4 07:08:23 lsddomainsd kernel: [137640.996911]  c3803604 f551b07c 020b2e87 00000001 f6d72800 00000000 00000000 00000000
Apr  4 07:08:23 lsddomainsd kernel: [137640.996919] Call Trace:
Apr  4 07:08:23 lsddomainsd kernel: [137640.996957]  [<f7fc856f>] ? qla2x00_start_scsi+0x29b/0x2cc [qla2xxx]
Apr  4 07:08:23 lsddomainsd kernel: [137640.996969]  [<c1259f49>] ? schedule_timeout+0x20/0xb0
Apr  4 07:08:23 lsddomainsd kernel: [137640.996976]  [<c1122934>] ? blk_peek_request+0x135/0x143
Apr  4 07:08:23 lsddomainsd kernel: [137640.996988]  [<f7e32987>] ? scsi_dispatch_cmd+0x185/0x1e5 [scsi_mod]
Apr  4 07:08:23 lsddomainsd kernel: [137640.997000]  [<f7e37382>] ? scsi_request_fn+0x3c1/0x47a [scsi_mod]
Apr  4 07:08:23 lsddomainsd kernel: [137640.997006]  [<c1259e52>] ? wait_for_common+0xa4/0x100
Apr  4 07:08:23 lsddomainsd kernel: [137640.997014]  [<c102da50>] ? default_wake_function+0x0/0x8
Apr  4 07:08:23 lsddomainsd kernel: [137640.997019]  [<f92fe756>] ? st_scsi_execute_end+0x0/0x45 [st]
Apr  4 07:08:23 lsddomainsd kernel: [137640.997024]  [<f92fee28>] ? st_do_scsi+0x28d/0x2b5 [st]
Apr  4 07:08:23 lsddomainsd kernel: [137640.997028]  [<f92ffb81>] ? st_int_ioctl+0x624/0xa68 [st]
Apr  4 07:08:23 lsddomainsd kernel: [137640.997034]  [<c11be12a>] ? release_sock+0xf/0x7f
Apr  4 07:08:23 lsddomainsd kernel: [137640.997040]  [<c11f0c2a>] ? tcp_sendmsg+0x69d/0x77a
Apr  4 07:08:23 lsddomainsd kernel: [137640.997044]  [<f92ff92e>] ? st_int_ioctl+0x3d1/0xa68 [st]
Apr  4 07:08:23 lsddomainsd kernel: [137640.997050]  [<c11bbb11>] ? __sock_sendmsg+0x43/0x4a
Apr  4 07:08:23 lsddomainsd kernel: [137640.997055]  [<f930198a>] ? st_ioctl+0xb1b/0xe62 [st]
Apr  4 07:08:23 lsddomainsd kernel: [137640.997059]  [<c1259e5d>] ? wait_for_common+0xaf/0x100
Apr  4 07:08:23 lsddomainsd kernel: [137640.997065]  [<c10b1aa2>] ? do_sync_write+0xc0/0x107
Apr  4 07:08:23 lsddomainsd kernel: [137640.997070]  [<f9300e6f>] ? st_ioctl+0x0/0xe62 [st]
Apr  4 07:08:23 lsddomainsd kernel: [137640.997075]  [<c10bc220>] ? vfs_ioctl+0x1c/0x5f
Apr  4 07:08:23 lsddomainsd kernel: [137640.997079]  [<c10bc7b4>] ? do_vfs_ioctl+0x4aa/0x4e5
Apr  4 07:08:23 lsddomainsd kernel: [137640.997083]  [<c10b17ee>] ? fsnotify_modify+0x5a/0x61
Apr  4 07:08:23 lsddomainsd kernel: [137640.997087]  [<c10b23ee>] ? vfs_write+0x9e/0xd6
Apr  4 07:08:23 lsddomainsd kernel: [137640.997091]  [<c10bc830>] ? sys_ioctl+0x41/0x58
Apr  4 07:08:23 lsddomainsd kernel: [137640.997097]  [<c10030fb>] ? sysenter_do_call+0x12/0x28
Apr  4 07:10:23 lsddomainsd kernel: [137760.996059] INFO: task bacula-sd:12439 blocked for more than 120 seconds.
Apr  4 07:10:23 lsddomainsd kernel: [137761.012949] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Apr  4 07:10:23 lsddomainsd kernel: [137761.030339] bacula-sd     D f7e47b88     0 12439      1 0x00000000
Apr  4 07:10:23 lsddomainsd kernel: [137761.030346]  f551aec0 00200086 00000020 f7e47b88 00200246 c13f4000 c13f4000 c13ef604
Apr  4 07:10:23 lsddomainsd kernel: [137761.030355]  f551b07c c3808000 00000000 f7fc856f f570c060 f8024000 f6d72820 00000000
Apr  4 07:10:23 lsddomainsd kernel: [137761.030363]  c3803604 f551b07c 020b2e87 00000001 f6d72800 00000000 00000000 00000000
Apr  4 07:10:23 lsddomainsd kernel: [137761.030371] Call Trace:
Apr  4 07:10:23 lsddomainsd kernel: [137761.030409]  [<f7fc856f>] ? qla2x00_start_scsi+0x29b/0x2cc [qla2xxx]
Apr  4 07:10:23 lsddomainsd kernel: [137761.030421]  [<c1259f49>] ? schedule_timeout+0x20/0xb0
Apr  4 07:10:23 lsddomainsd kernel: [137761.030428]  [<c1122934>] ? blk_peek_request+0x135/0x143
Apr  4 07:10:23 lsddomainsd kernel: [137761.030439]  [<f7e32987>] ? scsi_dispatch_cmd+0x185/0x1e5 [scsi_mod]
Apr  4 07:10:23 lsddomainsd kernel: [137761.030449]  [<f7e37382>] ? scsi_request_fn+0x3c1/0x47a [scsi_mod]
Apr  4 07:10:23 lsddomainsd kernel: [137761.030454]  [<c1259e52>] ? wait_for_common+0xa4/0x100
Apr  4 07:10:23 lsddomainsd kernel: [137761.030460]  [<c102da50>] ? default_wake_function+0x0/0x8
Apr  4 07:10:23 lsddomainsd kernel: [137761.030466]  [<f92fe756>] ? st_scsi_execute_end+0x0/0x45 [st]
Apr  4 07:10:23 lsddomainsd kernel: [137761.030470]  [<f92fee28>] ? st_do_scsi+0x28d/0x2b5 [st]
Apr  4 07:10:23 lsddomainsd kernel: [137761.030474]  [<f92ffb81>] ? st_int_ioctl+0x624/0xa68 [st]
Apr  4 07:10:23 lsddomainsd kernel: [137761.030480]  [<c11be12a>] ? release_sock+0xf/0x7f
Apr  4 07:10:23 lsddomainsd kernel: [137761.030486]  [<c11f0c2a>] ? tcp_sendmsg+0x69d/0x77a
Apr  4 07:10:23 lsddomainsd kernel: [137761.030490]  [<f92ff92e>] ? st_int_ioctl+0x3d1/0xa68 [st]
Apr  4 07:10:23 lsddomainsd kernel: [137761.030496]  [<c11bbb11>] ? __sock_sendmsg+0x43/0x4a
Apr  4 07:10:23 lsddomainsd kernel: [137761.030501]  [<f930198a>] ? st_ioctl+0xb1b/0xe62 [st]
Apr  4 07:10:23 lsddomainsd kernel: [137761.030504]  [<c1259e5d>] ? wait_for_common+0xaf/0x100
Apr  4 07:10:23 lsddomainsd kernel: [137761.030511]  [<c10b1aa2>] ? do_sync_write+0xc0/0x107
Apr  4 07:10:23 lsddomainsd kernel: [137761.030515]  [<f9300e6f>] ? st_ioctl+0x0/0xe62 [st]
Apr  4 07:10:23 lsddomainsd kernel: [137761.030521]  [<c10bc220>] ? vfs_ioctl+0x1c/0x5f
Apr  4 07:10:23 lsddomainsd kernel: [137761.030525]  [<c10bc7b4>] ? do_vfs_ioctl+0x4aa/0x4e5
Apr  4 07:10:23 lsddomainsd kernel: [137761.030529]  [<c10b17ee>] ? fsnotify_modify+0x5a/0x61
Apr  4 07:10:23 lsddomainsd kernel: [137761.030533]  [<c10b23ee>] ? vfs_write+0x9e/0xd6
Apr  4 07:10:23 lsddomainsd kernel: [137761.030537]  [<c10bc830>] ? sys_ioctl+0x41/0x58
Apr  4 07:10:23 lsddomainsd kernel: [137761.030543]  [<c10030fb>] ? sysenter_do_call+0x12/0x28
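
For comparing traces like the two above across occurrences, a minimal Python sketch along the following lines could group the hung-task reports from a syslog file and count which kernel modules show up in each trace. The function names (`summarize_hung_tasks`, `module_counts`) and the regular expressions are illustrative assumptions based only on the log format shown here; other kernel versions may format these lines differently.

```python
import re
from collections import Counter

# Header line of a hung-task report, e.g.
# "INFO: task bacula-sd:12439 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)

# One call-trace frame, e.g.
# "[<f92fee28>] ? st_do_scsi+0x28d/0x2b5 [st]"
# The trailing "[module]" is absent for symbols built into the kernel.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+(?:\?\s+)?(?P<sym>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:\s+\[(?P<mod>\w+)\])?"
)


def summarize_hung_tasks(lines):
    """Return one dict per hung-task report found in the given log lines."""
    reports = []
    current = None
    for line in lines:
        header = HUNG_RE.search(line)
        if header:
            current = {
                "task": header.group("comm"),
                "pid": int(header.group("pid")),
                "timeout": int(header.group("secs")),
                "frames": [],
            }
            reports.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            # Label built-in symbols "kernel" (a naming choice of this sketch).
            current["frames"].append((frame.group("sym"), frame.group("mod") or "kernel"))
    return reports


def module_counts(report):
    """Count how many frames in one report belong to each module."""
    return Counter(mod for _, mod in report["frames"])


if __name__ == "__main__":
    import sys

    # Usage (hypothetical path): python hung_tasks.py /var/log/syslog
    with open(sys.argv[1], errors="replace") as fh:
        for rep in summarize_hung_tasks(fh):
            print(rep["task"], rep["pid"], dict(module_counts(rep)))
```

The frames in these traces are all prefixed with "?", which the kernel uses for addresses it found on the stack but could not confirm as part of the active call chain, so the counts are a rough guide to which drivers were involved (here qla2xxx, scsi_mod, and st) rather than proof of where the fault lies.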

Thanks,

Robert LeBlanc
Life Sciences & Undergraduate Education Computer Support
Brigham Young University