Bacula-users

Re: [Bacula-users] how to debug a job

2015-01-23 10:08:19
Subject: Re: [Bacula-users] how to debug a job
From: Josh Fisher <jfisher AT pvct DOT com>
To: bacula-users AT lists.sourceforge DOT net
Date: Fri, 23 Jan 2015 10:03:05 -0500

On 1/22/2015 12:47 PM, Dimitri Maziuk wrote:
> On 01/22/2015 10:07 AM, Josh Fisher wrote:
>> There is likely no reason to have SpoolData=yes for disk volumes, and it
>> could actually slow things down.
>
> On 01/22/2015 07:32 AM, Clark, Patricia A. wrote:
>> Spooling is for the benefit of tape drives and databases.  What is the
>> benefit of spooling data for virtual tapes that are really disks?
>
> 1. Job interleaving.
> 2. "If you are running multiple simultaneous jobs, Bacula will continue
> spooling other jobs while one is despooling to tape, provided there is
> sufficient spool file space." (TFM - "Data Spooling" - "Other points".)
>
> 10 clients, (up to) 10 concurrent jobs, vchanger with one "device" and
> plenty of pre-labeled 50GB file volumes, 50GB max spool size/job.

I use another approach. I define the vchanger to have multiple virtual drives, then put MaximumConcurrentJobs=1 in the bacula-sd.conf Device stanza for each drive. This forces each volume file to be written by only one job at a time, so 10 concurrent jobs need 10 virtual drives and therefore 10 open volume files.

This way is not necessarily better; it depends on several things. For example, if the spool storage is much faster than the volume file storage, then the way you are doing it might get clients finished faster. On the other hand, if they are close in performance, then the way I'm doing it might be better, since 10 separate volumes are being written and no job has to wait on another except for attribute de-spooling.
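
A minimal sketch of such a multi-drive setup in bacula-sd.conf, shown with two drives for brevity. The resource names, the vchanger binary and config paths, and the archive device directories are illustrative assumptions, not taken from a real installation; adjust them to the local vchanger layout.

```
# bacula-sd.conf fragment (sketch only; all names and paths are assumptions)
Autochanger {
  Name = vchanger-1
  Device = vchanger-1-drive-0, vchanger-1-drive-1
  Changer Command = "/usr/local/bin/vchanger %c %o %S %a %d"
  Changer Device = "/etc/vchanger/vchanger-1.conf"
}

Device {
  Name = vchanger-1-drive-0
  Drive Index = 0
  Autochanger = yes
  Device Type = File
  Media Type = File
  Archive Device = /var/spool/vchanger/vchanger-1/0
  Removable Media = no
  Random Access = yes
  Label Media = no
  Maximum Concurrent Jobs = 1   # one job at a time writes to this drive's volume
}

Device {
  Name = vchanger-1-drive-1
  Drive Index = 1
  Autochanger = yes
  Device Type = File
  Media Type = File
  Archive Device = /var/spool/vchanger/vchanger-1/1
  Removable Media = no
  Random Access = yes
  Label Media = no
  Maximum Concurrent Jobs = 1   # one job at a time writes to this drive's volume
}
```

For 10 concurrent jobs the same Device stanza would be repeated up to Drive Index = 9, with each drive listed in the Autochanger's Device line.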

Also, with 10 concurrently open volume files, the vchanger can make use of multiple physical drives: by using pools and dividing the jobs among them, concurrent jobs write to volume files on different spindles.

With either method, attribute spooling is used. The SD always spools attributes for all jobs to the Bacula work directory, and de-spooling attributes is single-threaded. Single-thread performance of the work directory storage is therefore important and, of course, random write performance of the DB storage is crucial, even with attribute spooling.

Another reason for using the multiple open volume file approach is that in some cases it is better NOT to spool attributes. This would be when the DB storage is on a fast SSD and direct DB I/O is simply faster than spooling to and then de-spooling from spinning disk. In this case, data spooling must be disabled, because Bacula forces attribute spooling whenever data spooling is enabled.
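
As a rough sketch of that last case, both kinds of spooling could be switched off explicitly on the Director side. The JobDefs name below is hypothetical, and the other directives a real Job or JobDefs needs (Client, FileSet, Storage, Pool and so on) are left out.

```
# bacula-dir.conf fragment (sketch only; other required directives omitted)
JobDefs {
  Name = "DiskNoSpoolDefaults"   # hypothetical name
  Type = Backup
  Spool Data = no                # data spooling off, so attribute spooling is not forced
  Spool Attributes = no          # send attributes to the catalog as the job runs
}
```

Whether this actually helps depends on the catalog storage being fast enough to absorb the attribute inserts while jobs are running.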



>> Is one of the jobs running concurrently with the failing job backing up
>> the machine the SD is running on?
> Yesno: it's not backing up any of the usual suspects (the spool disk,
> /var/log, /var/lib/pgdata, ...)
>
>> Certainly, the spool directory should not be on the same disk drives
>> that the database is on.
> Right. Spool is on a separate TLER drive.
>
> It does look like the spool disk might have been the bottleneck indeed,
> and the same client failing over and over again was just a coincidence
> -- perhaps it has more data than the others of its kind. I won't know
> until more jobs get to run, but so far the manually started full backup
> looks promising: 100+ MB/s write speed on the spool. I might switch it
> to xfs with barriers off and/or get an SSD for it.




_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users
