Bacula-users

Re: [Bacula-users] slow performance copy/migrate disk to tape

2008-11-06 09:58:47
Subject: Re: [Bacula-users] slow performance copy/migrate disk to tape
From: Ulrich Leodolter <ulrich.leodolter AT obvsg DOT at>
To: Sebastian Lehmann <slehmann AT proficom-ag DOT de>
Date: Thu, 06 Nov 2008 15:54:46 +0100
Hi,

On Thu, 2008-11-06 at 14:57 +0100, Sebastian Lehmann wrote:
> Hi,
> 
> Am Di 04.11.2008 21:02 schrieb Ulrich Leodolter
> <ulrich.leodolter AT obvsg DOT at>:
> 
> > Hi,
> > 
> > Problem:  Migrate/Copy jobs from disk pool (DiskBackup)
> > to tape (DiskCopy) only get overall speed of 10-20MB/s.
> > 
> > full backup job size varies from 10-50GB.
> > 
> > Pool setup is simple, just one pool for full and incremental backups
> > to disk (automatic recycling works well)
> > 
> > Pool {
> >   Name = DiskBackup
> >   Pool Type = Backup
> >   Recycle = yes
> >   RecyclePool = DiskBackup
> >   AutoPrune = yes
> >   Volume Retention = 15 days
> >   Volume Use Duration = 6 days
> >   Maximum Volume Bytes = 4G
> >   Label Format = Backup-
> >   Next Pool = DiskCopy
> > }
> > 
> > DiskCopy pool goes to LTO4 tape device.
> > DiskBackup goes to SATA external raid (6T ext3).
> > 
> > Concurrency for DiskBackup jobs is 15, so jobs are spread
> > over DiskBackup Volumes (maybe that's the main problem).
> > 
> > I can read/write continuously on both devices at about 70MB/s
> > (SATA is not fast).
> > 
> > I tried spooling to local SAS raid, but overall speed is lower
> > than direct writing to tape.
> > Despooling from SAS raid to tape runs at the tape's maximum speed.
> > 
> > I need some Performance tuning tips, maybe:
> > 
> 
> We use Bacula in the same way, but with version 2.4.3, so we only use
> migration instead of copy jobs.
> 
> Maybe you have the same problem as we have. Performance looks OK if I
> use tools other than Bacula to test it.
> 
> We assume that seeking on disk-based volumes does not really work.
> 
> 2008-11-06 06:00:03 dss-bacula-dir JobId 39252: The following 1 JobIds
> were chosen to be migrated: 38638
> 2008-11-06 06:00:03 dss-bacula-dir JobId 39252: Migration using
> JobId=38638 Job=TCB-PROJECT_DB-WEB.2008-11-03_22.00.14
> 2008-11-06 06:00:03 dss-bacula-dir JobId 39252: Bootstrap records
> written to /var/lib/bacula/dss-bacula-dir.restore.28.bsr
> 2008-11-06 06:00:03 dss-bacula-dir JobId 39252: 
> 2008-11-06 06:00:03 dss-bacula-dir JobId 39252: The job will require the
> following
>    Volume(s)                 Storage(s)                SD Device(s)
> 
> ===========================================================================
> 2008-11-06 06:00:03 dss-bacula-dir JobId 39252:    
> 2008-11-06 06:00:03 dss-bacula-dir JobId 39252: 00132 File
> FileAutoChanger
> 2008-11-06 06:00:03 dss-bacula-dir JobId 39252: 
> 2008-11-06 13:25:54 dss-bacula-dir JobId 39252: Start Migration JobId
> 39252, Job=MigrateToTape.2008-11-06_06.00.48
> 2008-11-06 13:25:54 dss-bacula-dir JobId 39252: Using Device
> "SL500-1-Drive-1"
> 2008-11-06 13:25:54 dss-bacula-sd JobId 39252: Ready to read from volume
> "00132" on device "FileStorage" (/stage0/drive0).
> 2008-11-06 13:25:54 dss-bacula-sd JobId 39252: Forward spacing Volume
> "00132" to file:block 21:974164913.
> 2008-11-06 13:37:49 dss-bacula-sd JobId 39252: End of Volume at file 45
> on device "FileStorage" (/stage0/drive0), Volume "00132"
> 2008-11-06 13:37:49 dss-bacula-sd JobId 39252: End of all volumes.
> 2008-11-06 13:37:52 dss-bacula-dir JobId 39252: Bacula dss-bacula-dir
> 2.4.2 (26Jul08): 06-Nov-2008 13:37:52
>   Build OS:               x86_64-pc-linux-gnu debian 4.0
>   Prev Backup JobId:      38638
>   New Backup JobId:       39253
>   Migration JobId:        39252
>   Migration Job:          MigrateToTape.2008-11-06_06.00.48
>   Backup Level:           Full
>   Client:                 BACULA-DIR
>   FileSet:                "LinuxFullSet" 2008-04-23 22:00:00
>   Read Pool:              "Disk" (From Job resource)
>   Read Storage:           "File" (From Pool resource)
>   Write Pool:             "MONTH-TAPE" (From Job Pool's NextPool resource)
>   Write Storage:          "SL500-1" (From Storage from Pool's NextPool resource)
>   Start time:             06-Nov-2008 13:25:54
>   End time:               06-Nov-2008 13:37:52
>   Elapsed time:           11 mins 58 secs
>   Priority:               10
>   SD Files Written:       51
>   SD Bytes Written:       37,420,967 (37.42 MB)
>   Rate:                   52.1 KB/s
>   Volume name(s):         000211
>   Volume Session Id:      3
>   Volume Session Time:    1225972440
>   Last Volume Bytes:      31,316,447,232 (31.31 GB)
>   SD Errors:              0
>   SD termination status:  OK
>   Termination:            Migration OK
> 
> 2008-11-06 13:37:52 dss-bacula-dir JobId 39252: Begin pruning Jobs.
> 2008-11-06 13:37:52 dss-bacula-dir JobId 39252: Pruned 2 Jobs for client
> BACULA-DIR from catalog.
> 2008-11-06 13:37:52 dss-bacula-dir JobId 39252: Begin pruning Files.
> 2008-11-06 13:37:52 dss-bacula-dir JobId 39252: No Files found to prune.
> 2008-11-06 13:37:52 dss-bacula-dir JobId 39252: End auto prune.
> 
> As you can see, Bacula needs 12 minutes to migrate 37MB. That's because
> it does not "jump" to the correct file position in the disk volume. It
> reads the entire volume from the beginning to the end if the job was the
> last one on it. You can see this with stat sd. Bacula reads the volume but
> does not write to the tape most of the time.
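
As a quick sanity check on the job report above, the "Rate" line can be reproduced from "SD Bytes Written" and "Elapsed time". This is only a sketch of the arithmetic; the function names are made up for illustration and are not part of Bacula.

```c
#include <stdio.h>

/* Convert an elapsed time printed as "M mins S secs" into seconds. */
long elapsed_seconds(long minutes, long seconds)
{
    return minutes * 60 + seconds;
}

/* Average throughput in bytes per second.
 * Returns 0.0 for a non-positive duration to avoid dividing by zero. */
double average_rate(long long bytes, long seconds)
{
    if (seconds <= 0)
        return 0.0;
    return (double)bytes / (double)seconds;
}

#ifdef DEMO_MAIN
int main(void)
{
    /* Values copied from the job report: 37,420,967 bytes in 11 mins 58 secs. */
    long secs = elapsed_seconds(11, 58);
    double rate = average_rate(37420967LL, secs);
    printf("%ld s, %.1f KB/s\n", secs, rate / 1000.0);
    return 0;
}
#endif
```

With the values from the report this comes to 718 seconds and roughly 52.1 KB/s (1 KB = 1000 bytes), matching the "Rate" line. So the rate is simply bytes over wall-clock time, and the job spent nearly all of those 12 minutes doing something other than writing to tape.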

stat sd ???

"Forward spacing Volume" is done by lseek() on File volumes,
so I don't think Bacula does its positioning by reading through the disk volume.
You can time my little lseek program (see attachment)
on large disk volumes using different offset values.
For me, large offsets (>1GB) make no difference in timing.
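
The attachment itself is not reproduced in this archive, so here is a minimal sketch of what such a timing test could look like. It is not the original lseek.c; the function name time_seek, the DEMO_MAIN guard and the output format are assumptions made for illustration. It seeks to a given offset in a file, reads one byte there, and reports the elapsed time.

```c
#define _POSIX_C_SOURCE 200809L
#define _FILE_OFFSET_BITS 64   /* 64-bit off_t, so offsets above 2GB work on 32-bit builds */

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Open `path`, seek to `offset`, read one byte there, and return the
 * elapsed time for seek + read in seconds. Returns -1.0 on any error. */
double time_seek(const char *path, off_t offset)
{
    struct timespec t0, t1;
    char byte;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1.0;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    if (lseek(fd, offset, SEEK_SET) == (off_t)-1 || read(fd, &byte, 1) < 0) {
        close(fd);
        return -1.0;
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    close(fd);

    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}

#ifdef DEMO_MAIN
int main(int argc, char **argv)
{
    if (argc < 3) {
        fprintf(stderr, "usage: %s <volume-file> <offset>...\n", argv[0]);
        return 1;
    }
    /* Time each offset given on the command line against the same file. */
    for (int i = 2; i < argc; i++) {
        off_t offset = (off_t)strtoll(argv[i], NULL, 10);
        double secs = time_seek(argv[1], offset);
        if (secs < 0.0) {
            fprintf(stderr, "%s: seek/read at offset %lld failed\n",
                    argv[1], (long long)offset);
            return 1;
        }
        printf("offset %lld: %.6f s\n", (long long)offset, secs);
    }
    return 0;
}
#endif
```

Built with -DDEMO_MAIN, it takes a volume file and one or more byte offsets. If the time stays roughly flat as the offset grows, positioning in the file is not what costs the time. Keep in mind that a second run on the same offset may be served from the page cache.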

> 
> We assume it's a configuration problem but cannot find one.

Maybe also check your tape Device in bacula-sd.conf:

Device {
  ...
  Always Open = yes
  ...
}

If you don't have "Always Open = yes", the tape is wound back and forth
(rewound and repositioned), because Bacula opens and closes the device
at least once for each migration job.

BR
Ulrich

> 
> Greetings
> Sebastian
> 
> > Limit jobs per Volume in DiskBackup pools?
> > Split DiskBackup into DiskFull and DiskIncr pools?
> > 
> > Pool {
> >   Name = DiskFull
> >   Pool Type = Backup
> >   Recycle = yes
> >   RecyclePool = DiskFull
> >   AutoPrune = yes
> >   Volume Retention = 15 days
> >   Volume Use Duration = 6 days
> >   Maximum Volume Jobs = 1
> >   Maximum Volume Bytes = 10G
> >   Label Format = Full-
> >   Next Pool = DiskCopy
> > }
> > 
> > 
> > Thx
> > Ulrich
> > -- 
> > Ulrich Leodolter <ulrich.leodolter AT obvsg DOT at>
> > 
> 

-- 
Ulrich Leodolter <ulrich.leodolter AT obvsg DOT at>
OBVSG

Attachment: lseek.c
Description: Text Data
