Bacula-users

Re: [Bacula-users] Max Wait Time sometimes crash Storage Daemon

2008-05-15 15:02:26
Subject: Re: [Bacula-users] Max Wait Time sometimes crash Storage Daemon
From: Kern Sibbald <kern AT sibbald DOT com>
To: Adam Cécile <adam.cecile AT linbox DOT com>
Date: Thu, 15 May 2008 12:52:27 -0400
Hello Adam,

If the SD is crashing, then there is definitely a bug and you should open a 
bug report.  It would be preferable to move up to version 2.2.8 first, as 
that makes it easier for me to debug and find the problem.

Best regards,

Kern

On Thursday 15 May 2008 09:05:02 Adam Cécile wrote:
> Hello,
>
> I use Max Wait Time to cancel jobs that are left in the queue because no
> tapes are available.
> This is useful when our customers forget to load a new set of tapes into
> the changer.
>
> The problem is that the SD crashes in this case. Here is a sample of the logs:
> 03-May 12:01 pdc1.it-lyon-sd JobId 1580: Please mount Volume "Daily-005"
> or label a new one for:
> Job: pdc1.it-lyon.2008-05-02_21.00.26
> Storage: "Dell-LTO2" (/dev/nst0)
> Pool: Friday
> Media type: LTO2
>
> Then:
> 02-May 21:00 pdc1.it-lyon-dir JobId 1582: Start Backup JobId 1582,
> Job=intox1.it-lyon.2008-05-02_21.00.28
> 02-May 21:01 pdc1.it-lyon-dir JobId 1582: Using Device "Dell-LTO2"
> 06-May 11:10 intox1.it-lyon-fd: intox1.it-lyon.2008-05-02_21.00.28 Fatal
> error: job.c:1808 Comm error with SD. bad response to Append Data.
> ERR=Aucune donnée disponible
> 06-May 11:11 pdc1.it-lyon-dir JobId 1582: Error: Bacula pdc1.it-lyon-dir
> 2.2.5 (09Oct07): 06-May-2008 11:11:01
>
> The bacula-sd process sometimes dies; sometimes it keeps running but
> no longer works until we restart it.
>
> Another log example:
>
> 06-mai 12:23 localhost-sd JobId 235: Please mount Volume "000027L3" or
> label a new one for:
> Job: atp-data.2008-05-02_22.00.44
> Storage: "Drive-1" (/dev/nst0)
> Pool: Weekly
> Media type: LTO3
> 07-mai 12:23 localhost-sd JobId 235: Please mount Volume "000027L3" or
> label a new one for:
> Job: atp-data.2008-05-02_22.00.44
> Storage: "Drive-1" (/dev/nst0)
> Pool: Weekly
> Media type: LTO3
> 08-mai 12:23 localhost-sd JobId 235: Fatal error: Max time exceeded
> waiting to mount Storage Device "Drive-1" (/dev/nst0) for Job
> atp-data.2008-05-02_22.00.44
> 08-mai 12:23 localhost-sd JobId 235: Job write elapsed time = 134:15:41,
> Transfer rate = 3.350 M bytes/second
> 08-mai 12:23 localhost-fd JobId 235: Fatal error: backup.c:892 Network
> send error to SD. ERR=Broken pipe
> 08-mai 12:23 localhost-dir JobId 235: Error: Bacula localhost-dir 2.2.8
> (26Jan08): 08-mai-2008 12:23:41
>
> This is a serious issue, as Max Wait Time can't be used (it always crashes).
>
> Could you please tell me whether this is a known issue? If not, a
> customer is willing to "forget to change the tape" so that I can provide
> you with some debugging backtraces if needed.
>
> Thanks in advance,
>
> Best regards, Adam.
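
For reference, a minimal sketch of where the Max Wait Time directive discussed
above would normally be set: in a Job (or JobDefs) resource in the Director
configuration. The job name, the JobDefs name and the two-day value are all
assumptions made for illustration; the original message does not show the
actual configuration.

```
# bacula-dir.conf (fragment, illustrative only)
Job {
  # Hypothetical job name, not taken from the logs above.
  Name = "example-backup"
  # Hypothetical JobDefs assumed to supply Type, Client, FileSet,
  # Storage, Pool and Messages for this job.
  JobDefs = "DefaultJob"
  # Assumed value: cancel the job if it stays blocked waiting for a
  # resource (for example a tape mount) for longer than this.
  Max Wait Time = 2 days
}
```

With a setting like this, the expected behaviour is that the job ends with a
"Max time exceeded" fatal error once the limit passes, as in the second log
above, while the Storage daemon itself keeps running.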


