Bacula-users

Re: [Bacula-users] slow performance copy/migrate disk to tape

2008-11-06 10:26:48
Subject: Re: [Bacula-users] slow performance copy/migrate disk to tape
From: Sebastian Lehmann <slehmann AT proficom-ag DOT de>
To: Ulrich Leodolter <ulrich.leodolter AT obvsg DOT at>
Date: Thu, 6 Nov 2008 16:18:38 +0100 (CET)
Hi,

Am Do 06.11.2008 15:54 schrieb Ulrich Leodolter
<ulrich.leodolter AT obvsg DOT at>:

> Hi,
> 
> On Thu, 2008-11-06 at 14:57 +0100, Sebastian Lehmann wrote:
> > Hi,
> > 
> > Am Di 04.11.2008 21:02 schrieb Ulrich Leodolter
> > <ulrich.leodolter AT obvsg DOT at>:
> > 
> > > Hi,
> > > 
> > > Problem:  Migrate/Copy jobs from disk pool (DiskBackup)
> > > to tape (DiskCopy) only get overall speed of 10-20MB/s.
> > > 
> > > full backup job size varies from 10-50GB.
> > > 
> > > Pool setup is simple, just one pool for full an incremental
> > > backups
> > > to disk (automatic recycle works good)
> > > 
> > > Pool {
> > >   Name = DiskBackup
> > >   Pool Type = Backup
> > >   Recycle = yes
> > >   RecyclePool = DiskBackup
> > >   AutoPrune = yes
> > >   Volume Retention = 15 days
> > >   Volume Use Duration = 6 days
> > >   Maximum Volume Bytes = 4G
> > >   Label Format = Backup-
> > >   Next Pool = DiskCopy
> > > }
> > > 
> > > DiskCopy pool goes to LTO4 tape device.
> > > DiskBackup goes to SATA external raid (6T ext3).
> > > 
> > > concurrency for DiskBackup jobs is 15,  jobs are spread
> > > over DiskBackup Valumes (maybe thats the main problem)
> > > 
> > > i can read write/continuous on both devices at about 70MB/s
> > > (SATA is not fast)
> > > 
> > > i tried spooling to local SAS raid, but overall speed is lower
> > > than direct writing to tape.
> > > despooling from SAS raid to Tape runs at Tape maximum spped.
> > > 
> > > I need some Performance tuning tips, maybe:
> > > 
> > 
> > we use bacula in the same way, but with version 2.4.3, so we only
> > use
> > migration instead of copy job.
> > 
> > Maybe you have the same problem as we have. Performance looks ok, if
> > i
> > use tools other then bacula to test it.
> > 
> > We assume, that the seek process on disk based volumes not really
> > works.
> > 
> > 2008-11-06 06:00:03 dss-bacula-dir JobId 39252: The following 1
> > JobIds
> > were chosen to be migrated: 38638
> > 2008-11-06 06:00:03 dss-bacula-dir JobId 39252: Migration using
> > JobId=38638 Job=TCB-PROJECT_DB-WEB.2008-11-03_22.00.14
> > 2008-11-06 06:00:03 dss-bacula-dir JobId 39252: Bootstrap records
> > written to /var/lib/bacula/dss-bacula-dir.restore.28.bsr
> > 2008-11-06 06:00:03 dss-bacula-dir JobId 39252: 
> > 2008-11-06 06:00:03 dss-bacula-dir JobId 39252: The job will require
> > the
> > following
> >    Volume(s)                 Storage(s)                SD Device(s)
> > 
> >
> > ===========================================================================
> > 2008-11-06 06:00:03 dss-bacula-dir JobId 39252:    
> > 2008-11-06 06:00:03 dss-bacula-dir JobId 39252: 00132 File
> > FileAutoChanger
> > 2008-11-06 06:00:03 dss-bacula-dir JobId 39252: 
> > 2008-11-06 13:25:54 dss-bacula-dir JobId 39252: Start Migration
> > JobId
> > 39252, Job=MigrateToTape.2008-11-06_06.00.48
> > 2008-11-06 13:25:54 dss-bacula-dir JobId 39252: Using Device
> > "SL500-1-Drive-1"
> > 2008-11-06 13:25:54 dss-bacula-sd JobId 39252: Ready to read from
> > volume
> > "00132" on device "FileStorage" (/stage0/drive0).
> > 2008-11-06 13:25:54 dss-bacula-sd JobId 39252: Forward spacing
> > Volume
> > "00132" to file:block 21:974164913.
> > 2008-11-06 13:37:49 dss-bacula-sd JobId 39252: End of Volume at file
> > 45
> > on device "FileStorage" (/stage0/drive0), Volume "00132"
> > 2008-11-06 13:37:49 dss-bacula-sd JobId 39252: End of all volumes.
> > 2008-11-06 13:37:52 dss-bacula-dir JobId 39252: Bacula
> > dss-bacula-dir
> > 2.4.2 (26Jul08): 06-Nov-2008 13:37:52
> >   Build OS:               x86_64-pc-linux-gnu debian 4.0
> >   Prev Backup JobId:      38638
> >   New Backup JobId:       39253
> >   Migration JobId:        39252
> >   Migration Job:          MigrateToTape.2008-11-06_06.00.48
> >   Backup Level:           Full
> >   Client:                 BACULA-DIR
> >   FileSet:                "LinuxFullSet" 2008-04-23 22:00:00
> >   Read Pool:              "Disk" (From Job resource)
> >   Read Storage:           "File" (From Pool resource)
> > Write Pool: "MONTH-TAPE" (From Job Pool's NextPool resource)
> > Write Storage: "SL500-1" (From Storage from Pool's NextPool
> > resource)
> >   Start time:             06-Nov-2008 13:25:54
> >   End time:               06-Nov-2008 13:37:52
> >   Elapsed time:           11 mins 58 secs
> >   Priority:               10
> >   SD Files Written:       51
> >   SD Bytes Written:       37,420,967 (37.42 MB)
> >   Rate:                   52.1 KB/s
> >   Volume name(s):         000211
> >   Volume Session Id:      3
> >   Volume Session Time:    1225972440
> >   Last Volume Bytes:      31,316,447,232 (31.31 GB)
> >   SD Errors:              0
> >   SD termination status:  OK
> >   Termination:            Migration OK
> > 
> > 2008-11-06 13:37:52 dss-bacula-dir JobId 39252: Begin pruning Jobs.
> > 2008-11-06 13:37:52 dss-bacula-dir JobId 39252: Pruned 2 Jobs for
> > client
> > BACULA-DIR from catalog.
> > 2008-11-06 13:37:52 dss-bacula-dir JobId 39252: Begin pruning Files.
> > 2008-11-06 13:37:52 dss-bacula-dir JobId 39252: No Files found to
> > prune.
> > 2008-11-06 13:37:52 dss-bacula-dir JobId 39252: End auto prune.
> > 
> > As you can see, bacula needs 12 minutes to migrate 37MB. Thats
> > becuase
> > is does not "jump" to the correct file position in the disk volume.
> > It
> > reads the entire volume from the begining to the end if the job was
> > the
> > last on it. You can see this with stat sd. Bacula reads the volume
> > but
> > does not write to the tape the most time.
> 
> stat sd ???
> 
At the bconsole "stat sd" then select your storage and device.
->
Connecting to Director 10.92.79.168:9101
1000 OK: dss-bacula-dir Version: 2.4.2 (26 July 2008)
Enter a period to cancel a command.
*stat sd
The defined Storage resources are:
     1: File
     2: SL500-1
Select Storage resource (1-2): 1
Connecting to Storage daemon File at dss-bacula:9103

dss-bacula-sd Version: 2.4.2 (26 July 2008) x86_64-pc-linux-gnu debian
4.0
Daemon started 06-Nov-08 14:28, 10 Jobs run since started.
Heap: heap=7,319,552 smbytes=6,878,613 max_bytes=6,938,811 bufs=366
 max_bufs=545
Sizes: boffset_t=8 size_t=8 int32_t=4 int64_t=8
<snip>
Device parameters:
Archive name: /dev/nst0 Device name: SL500-1-Drive-1
File=35 block=4725
Min block=0 Max block=0
    Total Bytes=28,337,799,168 Blocks=439,263 Bytes/block=64,512
    Positioned at File=35 Block=4,725

Her you can see the total bytes bacula has read from the current volume.

> "Forward spacing Volume" is done by lseek() on File volumes,
> dont think bacula does positioning by reading the disk volume.
> You can time my little lseek program (see attachment)
> on large disk volumes using different offset values.
> For me large offsets (>1GB) make no difference in timing.
> 

Ok, forgot to tell that we use a virtual changer with disk based
volumes, so Media type is "File".

> > 
> > We assume its a configuration problem but do not find any.
> 
> Maybe also check your tape Device in bacula-sd.conf
> 
> Device {
>  ...
>  Always Open = yes; 
>  ,,,
> }
> 
> If you don't have "Always Open = yes", tape devices are spooled back
> and
> forth, because bacula does open/close at least for each migration job.
>
As the documentation said this option will be ignored for media type
file.

Ok, what i see is, that lseek at the begining of the job seems to work:

> dss-bacula-sd: askdir.c:368-0 >dird CatReq
> Job=MigrateToTape.2008-11-06_06.00.27 UpdateMedia VolName=000212
> VolJobs=13 VolFiles=38 VolBlocks=459721 VolBytes=29657585664
> VolMounts=3 VolErrors=0 VolWrites=6764239 MaxVolBytes=0
> EndTime=1225984266 VolStatus=Append Slot=13 relabel=0 InChanger=1
> VolReadTime=0 VolWriteTime=2478623570 VolFirstWritten=0 VolParts=0
dss-bacula-sd: askdir.c:182-0 <dird 1000 OK VolName=000212 VolJobs=13
VolFiles=38 VolBlocks=459721 VolBytes=29657585664 VolMounts=3
VolErrors=0 VolWrites=6764239 MaxVolBytes=0 VolCapacityBytes=0
VolStatus=Append Slot=13 MaxVolJobs=0 MaxVolFiles=0 InChanger=1
VolReadTime=0 VolWriteTime=2478623570 EndFile=37 EndBlock=4956
VolParts=0 LabelType=0 MediaId=35
dss-bacula-sd: askdir.c:205-0 do_reqest_vol_info return true slot=13
Volume=000212
dss-bacula-sd: reserve.c:490-0 Dec reserve=0 dev="SL500-1-Drive-1"
(/dev/nst0)
dss-bacula-sd: dev.c:1580-0 ===== lseek to -1686243990

Bacula starts with high transfer rates, then only reads the source
volume up to his end.

courious, i don't understand this.

Sebastian


-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users