Re: [Bacula-users] Bacula stalls after 2.5 TB

2010-12-02 17:13:08
Subject: Re: [Bacula-users] Bacula stalls after 2.5 TB
From: John Acar <JAcar AT gsoc.treas DOT gov>
To: Wolfgang Denk <wd AT denx DOT de>
Date: Thu, 2 Dec 2010 16:46:05 -0500
I have a job running right now.  It seems to be in the "stalled" state.  

status storage

Device "Drive-1" (/dev/nst1) is mounted with:
    Volume:      037178L4
    Pool:        Default
    Media type:  LTO-4
    Slot 49 is loaded in drive 0.
    Total Bytes=1,354,752 Blocks=20 Bytes/block=67,737
    Positioned at File=0 Block=21

status client

The data counters do not change between runs:

   Backup Job started: 02-Dec-10 09:32
    Files=1,215 Bytes=1,040,342,163,456 Bytes/sec=39,869,018 Errors=0
    Files Examined=1,215
    Processing file: /firewall/mysql/archive/sdw_sw/sw_20090916.tar
    SDReadSeqNo=5 fd=5

Bytes/sec is going down, though.
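
A falling rate with a frozen byte count is what one would expect if Bytes/sec is a running average (total bytes divided by time since the job started) rather than an instantaneous figure. That is an assumption about how the counter is computed, not something confirmed in this thread. A minimal Python sketch of the arithmetic, using the numbers from the status output above (the function name `average_rate` is made up for illustration):

```python
# Assumption: Bytes/sec = total bytes / seconds since the job started.
total_bytes = 1_040_342_163_456   # Bytes= from "status client"
avg_rate = 39_869_018             # Bytes/sec= from "status client"

# Elapsed time implied by the two counters.
elapsed_s = total_bytes // avg_rate
hours, rem = divmod(elapsed_s, 3600)
minutes, seconds = divmod(rem, 60)
print(f"implied elapsed time: {hours}h {minutes:02d}m {seconds:02d}s")


def average_rate(total_bytes: int, elapsed_s: int) -> float:
    """Running average; it keeps shrinking while total_bytes is frozen."""
    return total_bytes / elapsed_s
```

The implied elapsed time is a little over seven hours, which is roughly the gap between the 09:32 job start and the time of this message. So the rate would keep dropping for as long as no new bytes are written.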

The file does not seem to be open, so I am at a loss as to what is going on
with some of these files.


John Acar

________________________________________
From: Wolfgang Denk [wd AT denx DOT de]
Sent: Wednesday, December 01, 2010 4:01 PM
To: John Acar
Cc: bacula-users AT lists.sourceforge DOT net
Subject: Re: [Bacula-users] Bacula stalls after 2.5 TB

Dear John Acar,

In message <[email protected]> 
you wrote:
>
> I am running Bacula 3.0.2 (MySQL) on CentOS 4.8.  I have a Spectra T380
> changer with 50 tape slots.  I need to archive about 7 TB of data.  The first
> time I attempted to back it up, the job stalled at 2.559 TB and errored out.
> I figured it might be the drive I used, since I have had trouble with that
> drive, so I used drive-2.  The job stopped at precisely the same spot, but I
> did not get any errors.  Bacula still thinks the job is running.

Are you sure it really stops? How long did you wait?

> Running Jobs:
> JobId 440 Job TNET.2010-11-30_09.39.39_08 is running.
>     Backup Job started: 30-Nov-10 09:39
>     Files=1,229 Bytes=2,559,801,067,520 Bytes/sec=24,701,351 Errors=0
>     Files Examined=1,229
>     Processing file: /firewall/mysql/archive/sdw_netflow/netflow_20100430.tar
>     SDReadSeqNo=5 fd=5
> Director connected at: 01-Dec-10 14:26

Repeat the "status dir" command a few times. Are any of the counters
(Files, Bytes, Files Examined) or the file name changing?
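
Comparing two status snapshots by eye is error-prone with numbers this long, so here is a minimal Python sketch that does the comparison. The helper names `parse_counters` and `changed_counters` are made up for illustration, the sample text is copied from the output quoted above, and how the status text is captured is left out:

```python
import re

# Matches "Files=", "Bytes=" and "Files Examined=", but not "Bytes/sec=".
COUNTER_RE = re.compile(r"(Files Examined|Files|Bytes)=([\d,]+)")

SNAPSHOT = """\
JobId 440 Job TNET.2010-11-30_09.39.39_08 is running.
    Backup Job started: 30-Nov-10 09:39
    Files=1,229 Bytes=2,559,801,067,520 Bytes/sec=24,701,351 Errors=0
    Files Examined=1,229
    Processing file: /firewall/mysql/archive/sdw_netflow/netflow_20100430.tar
"""


def parse_counters(status_text: str) -> dict:
    """Extract the counters and the current file name from one snapshot."""
    counters = {}
    for name, value in COUNTER_RE.findall(status_text):
        counters[name] = int(value.replace(",", ""))
    match = re.search(r"Processing file:\s*(.+)", status_text)
    counters["Processing file"] = match.group(1).strip() if match else None
    return counters


def changed_counters(before: dict, after: dict) -> list:
    """Names of the fields whose value differs between two snapshots."""
    return [name for name in before if before[name] != after.get(name)]


if __name__ == "__main__":
    first = parse_counters(SNAPSHOT)
    second = parse_counters(SNAPSHOT)  # stand-in for a later snapshot
    print(changed_counters(first, second) or "nothing changed")
```

An empty list across several snapshots taken a few minutes apart would suggest the job is really stalled and not just slow.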

What does "status storage" report?

Can you see or hear if the tape drive is active?

What does "ls -l /firewall/mysql/archive/sdw_netflow/netflow_20100430.tar"
show on that system?  Is there any chance that this is a sparse file
with (big) holes (possibly even unintentionally, for example after some
error in a RAID)?
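
"ls -l" alone shows only the apparent size, so a hole would not be visible there. A minimal Python sketch that compares the apparent size with the space actually allocated on disk; the function name `sparse_report` is made up for illustration, and it assumes a Linux-like system where `st_blocks` counts 512-byte units:

```python
import os


def sparse_report(path: str) -> dict:
    """Compare a file's apparent size with the space allocated for it."""
    st = os.stat(path)
    # Assumption: st_blocks is in 512-byte units, as on Linux.
    allocated = st.st_blocks * 512
    return {
        "apparent_bytes": st.st_size,
        "allocated_bytes": allocated,
        # Far fewer allocated bytes than apparent bytes hints at holes.
        "looks_sparse": allocated < st.st_size,
    }


if __name__ == "__main__":
    import sys

    for name in sys.argv[1:]:
        print(name, sparse_report(name))
```

Treat the result as a hint only: a filesystem with transparent compression can also report fewer allocated bytes than the apparent size.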

Best regards,

Wolfgang Denk

--
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd AT denx DOT de
"One planet is all you get."

_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users