Bacula-users

Re: [Bacula-users] Max Wait Time sometimes crash Storage Daemon

2008-05-19 02:52:10
Subject: Re: [Bacula-users] Max Wait Time sometimes crash Storage Daemon
From: Adam Cécile <adam.cecile AT linbox DOT com>
To: Kern Sibbald <kern AT sibbald DOT com>
Date: Mon, 19 May 2008 08:52:07 +0200

Hi,

Could you please tell me more about how to get a useful traceback?

Thanks in advance,

Regards, Adam.

Kern Sibbald wrote:
> On Friday 16 May 2008 03:09:26 Adam Cécile wrote:
>   
>> Reported as #1087:
>> http://bugs.bacula.org/view.php?id=1087
>>     
>
> OK, thanks.  If you haven't already done so, please attach a traceback when it
> crashes, as well as your bacula-dir.conf and bacula-sd.conf files.
>
> Thanks,
>
> Kern
>
>   
>> Best regards, Adam.
>>
> >> Kern Sibbald wrote:
>>     
>>> Hello Adam,
>>>
>>> If the SD is crashing, then there is definitely a bug and you should open
>>> a bug report.  It would be preferable if you move up to version 2.2.8 as
>>> it simplifies things for me in debugging and finding the problems.
>>>
>>> Best regards,
>>>
>>> Kern
>>>
>>> On Thursday 15 May 2008 09:05:02 Adam Cécile wrote:
>>>       
>>>> Hello,
>>>>
>>>> I use Max Wait Time to cancel jobs that are left in the queue because no
>>>> tapes are available.
>>>> This is useful when our customers forget to load a new set of tapes into
>>>> the changer.
>>>>
>>>> The problem is that the SD crashes in this case. Here is a sample of the logs:
>>>> 03-May 12:01 pdc1.it-lyon-sd JobId 1580: Please mount Volume "Daily-005"
>>>> or label a new one for:
>>>> Job: pdc1.it-lyon.2008-05-02_21.00.26
>>>> Storage: "Dell-LTO2" (/dev/nst0)
>>>> Pool: Friday
>>>> Media type: LTO2
>>>>
>>>> Then:
>>>> 02-May 21:00 pdc1.it-lyon-dir JobId 1582: Start Backup JobId 1582,
>>>> Job=intox1.it-lyon.2008-05-02_21.00.28
>>>> 02-May 21:01 pdc1.it-lyon-dir JobId 1582: Using Device "Dell-LTO2"
>>>> 06-May 11:10 intox1.it-lyon-fd: intox1.it-lyon.2008-05-02_21.00.28 Fatal
>>>> error: job.c:1808 Comm error with SD. bad response to Append Data.
>>>> ERR=Aucune donnée disponible
>>>> 06-May 11:11 pdc1.it-lyon-dir JobId 1582: Error: Bacula pdc1.it-lyon-dir
>>>> 2.2.5 (09Oct07): 06-May-2008 11:11:01
>>>>
>>>> The bacula-sd process sometimes dies; at other times it keeps running but
>>>> stops working until we restart it.
>>>>
>>>> Another log example:
>>>>
>>>> 06-mai 12:23 localhost-sd JobId 235: Please mount Volume "000027L3" or
>>>> label a new one for:
>>>> Job: atp-data.2008-05-02_22.00.44
>>>> Storage: "Drive-1" (/dev/nst0)
>>>> Pool: Weekly
>>>> Media type: LTO3
>>>> 07-mai 12:23 localhost-sd JobId 235: Please mount Volume "000027L3" or
>>>> label a new one for:
>>>> Job: atp-data.2008-05-02_22.00.44
>>>> Storage: "Drive-1" (/dev/nst0)
>>>> Pool: Weekly
>>>> Media type: LTO3
>>>> 08-mai 12:23 localhost-sd JobId 235: Fatal error: Max time exceeded
>>>> waiting to mount Storage Device "Drive-1" (/dev/nst0) for Job
>>>> atp-data.2008-05-02_22.00.44
>>>> 08-mai 12:23 localhost-sd JobId 235: Job write elapsed time = 134:15:41,
>>>> Transfer rate = 3.350 M bytes/second
>>>> 08-mai 12:23 localhost-fd JobId 235: Fatal error: backup.c:892 Network
>>>> send error to SD. ERR=Broken pipe
>>>> 08-mai 12:23 localhost-dir JobId 235: Error: Bacula localhost-dir 2.2.8
>>>> (26Jan08): 08-mai-2008 12:23:41
>>>>
>>>> This is a serious issue, as Max Wait Time can't be used (it always crashes).
>>>>
>>>> Could you please tell me whether this is a known issue? If not, a
>>>> customer has agreed to "forget to change the tape" so that I can provide
>>>> debugging backtraces if needed.
>>>>
>>>> Thanks in advance,
>>>>
>>>> Best regards, Adam.
>>>>         
>
>
>   

