On Thu, 2008-11-06 at 16:18 +0100, Sebastian Lehmann wrote:
> Hi,
>
> Am Do 06.11.2008 15:54 schrieb Ulrich Leodolter
> <ulrich.leodolter AT obvsg DOT at>:
>
> > Hi,
> >
> > On Thu, 2008-11-06 at 14:57 +0100, Sebastian Lehmann wrote:
> > > Hi,
> > >
> > > Am Di 04.11.2008 21:02 schrieb Ulrich Leodolter
> > > <ulrich.leodolter AT obvsg DOT at>:
> > >
> > > > Hi,
> > > >
> > > > Problem: Migrate/Copy jobs from disk pool (DiskBackup)
> > > > to tape (DiskCopy) only get overall speed of 10-20MB/s.
> > > >
> > > > full backup job size varies from 10-50GB.
> > > >
> > > > Pool setup is simple, just one pool for full an incremental
> > > > backups
> > > > to disk (automatic recycle works good)
> > > >
> > > > Pool {
> > > > Name = DiskBackup
> > > > Pool Type = Backup
> > > > Recycle = yes
> > > > RecyclePool = DiskBackup
> > > > AutoPrune = yes
> > > > Volume Retention = 15 days
> > > > Volume Use Duration = 6 days
> > > > Maximum Volume Bytes = 4G
> > > > Label Format = Backup-
> > > > Next Pool = DiskCopy
> > > > }
> > > >
> > > > DiskCopy pool goes to LTO4 tape device.
> > > > DiskBackup goes to SATA external raid (6T ext3).
> > > >
> > > > concurrency for DiskBackup jobs is 15, jobs are spread
> > > > over DiskBackup Valumes (maybe thats the main problem)
> > > >
> > > > i can read write/continuous on both devices at about 70MB/s
> > > > (SATA is not fast)
> > > >
> > > > i tried spooling to local SAS raid, but overall speed is lower
> > > > than direct writing to tape.
> > > > despooling from SAS raid to Tape runs at Tape maximum spped.
> > > >
> > > > I need some Performance tuning tips, maybe:
> > > >
> > >
> > > we use bacula in the same way, but with version 2.4.3, so we only
> > > use
> > > migration instead of copy job.
> > >
> > > Maybe you have the same problem as we have. Performance looks ok, if
> > > i
> > > use tools other then bacula to test it.
> > >
> > > We assume, that the seek process on disk based volumes not really
> > > works.
> > >
> > > 2008-11-06 06:00:03 dss-bacula-dir JobId 39252: The following 1
> > > JobIds
> > > were chosen to be migrated: 38638
> > > 2008-11-06 06:00:03 dss-bacula-dir JobId 39252: Migration using
> > > JobId=38638 Job=TCB-PROJECT_DB-WEB.2008-11-03_22.00.14
> > > 2008-11-06 06:00:03 dss-bacula-dir JobId 39252: Bootstrap records
> > > written to /var/lib/bacula/dss-bacula-dir.restore.28.bsr
> > > 2008-11-06 06:00:03 dss-bacula-dir JobId 39252:
> > > 2008-11-06 06:00:03 dss-bacula-dir JobId 39252: The job will require
> > > the
> > > following
> > > Volume(s) Storage(s) SD Device(s)
> > >
> > >
> > > ===========================================================================
> > > 2008-11-06 06:00:03 dss-bacula-dir JobId 39252:
> > > 2008-11-06 06:00:03 dss-bacula-dir JobId 39252: 00132 File
> > > FileAutoChanger
> > > 2008-11-06 06:00:03 dss-bacula-dir JobId 39252:
> > > 2008-11-06 13:25:54 dss-bacula-dir JobId 39252: Start Migration
> > > JobId
> > > 39252, Job=MigrateToTape.2008-11-06_06.00.48
> > > 2008-11-06 13:25:54 dss-bacula-dir JobId 39252: Using Device
> > > "SL500-1-Drive-1"
> > > 2008-11-06 13:25:54 dss-bacula-sd JobId 39252: Ready to read from
> > > volume
> > > "00132" on device "FileStorage" (/stage0/drive0).
> > > 2008-11-06 13:25:54 dss-bacula-sd JobId 39252: Forward spacing
> > > Volume
> > > "00132" to file:block 21:974164913.
> > > 2008-11-06 13:37:49 dss-bacula-sd JobId 39252: End of Volume at file
> > > 45
> > > on device "FileStorage" (/stage0/drive0), Volume "00132"
> > > 2008-11-06 13:37:49 dss-bacula-sd JobId 39252: End of all volumes.
> > > 2008-11-06 13:37:52 dss-bacula-dir JobId 39252: Bacula
> > > dss-bacula-dir
> > > 2.4.2 (26Jul08): 06-Nov-2008 13:37:52
> > > Build OS: x86_64-pc-linux-gnu debian 4.0
> > > Prev Backup JobId: 38638
> > > New Backup JobId: 39253
> > > Migration JobId: 39252
> > > Migration Job: MigrateToTape.2008-11-06_06.00.48
> > > Backup Level: Full
> > > Client: BACULA-DIR
> > > FileSet: "LinuxFullSet" 2008-04-23 22:00:00
> > > Read Pool: "Disk" (From Job resource)
> > > Read Storage: "File" (From Pool resource)
> > > Write Pool: "MONTH-TAPE" (From Job Pool's NextPool resource)
> > > Write Storage: "SL500-1" (From Storage from Pool's NextPool
> > > resource)
> > > Start time: 06-Nov-2008 13:25:54
> > > End time: 06-Nov-2008 13:37:52
> > > Elapsed time: 11 mins 58 secs
> > > Priority: 10
> > > SD Files Written: 51
> > > SD Bytes Written: 37,420,967 (37.42 MB)
> > > Rate: 52.1 KB/s
> > > Volume name(s): 000211
> > > Volume Session Id: 3
> > > Volume Session Time: 1225972440
> > > Last Volume Bytes: 31,316,447,232 (31.31 GB)
> > > SD Errors: 0
> > > SD termination status: OK
> > > Termination: Migration OK
> > >
> > > 2008-11-06 13:37:52 dss-bacula-dir JobId 39252: Begin pruning Jobs.
> > > 2008-11-06 13:37:52 dss-bacula-dir JobId 39252: Pruned 2 Jobs for
> > > client
> > > BACULA-DIR from catalog.
> > > 2008-11-06 13:37:52 dss-bacula-dir JobId 39252: Begin pruning Files.
> > > 2008-11-06 13:37:52 dss-bacula-dir JobId 39252: No Files found to
> > > prune.
> > > 2008-11-06 13:37:52 dss-bacula-dir JobId 39252: End auto prune.
> > >
> > > As you can see, bacula needs 12 minutes to migrate 37MB. Thats
> > > becuase
> > > is does not "jump" to the correct file position in the disk volume.
> > > It
> > > reads the entire volume from the begining to the end if the job was
> > > the
> > > last on it. You can see this with stat sd. Bacula reads the volume
> > > but
> > > does not write to the tape the most time.
> >
> > stat sd ???
> >
> At the bconsole "stat sd" then select your storage and device.
> ->
> Connecting to Director 10.92.79.168:9101
> 1000 OK: dss-bacula-dir Version: 2.4.2 (26 July 2008)
> Enter a period to cancel a command.
> *stat sd
> The defined Storage resources are:
> 1: File
> 2: SL500-1
> Select Storage resource (1-2): 1
> Connecting to Storage daemon File at dss-bacula:9103
>
> dss-bacula-sd Version: 2.4.2 (26 July 2008) x86_64-pc-linux-gnu debian
> 4.0
> Daemon started 06-Nov-08 14:28, 10 Jobs run since started.
> Heap: heap=7,319,552 smbytes=6,878,613 max_bytes=6,938,811 bufs=366
> max_bufs=545
> Sizes: boffset_t=8 size_t=8 int32_t=4 int64_t=8
> <snip>
> Device parameters:
> Archive name: /dev/nst0 Device name: SL500-1-Drive-1
> File=35 block=4725
> Min block=0 Max block=0
> Total Bytes=28,337,799,168 Blocks=439,263 Bytes/block=64,512
> Positioned at File=35 Block=4,725
>
> Her you can see the total bytes bacula has read from the current volume.
>
> > "Forward spacing Volume" is done by lseek() on File volumes,
> > dont think bacula does positioning by reading the disk volume.
> > You can time my little lseek program (see attachment)
> > on large disk volumes using different offset values.
> > For me large offsets (>1GB) make no difference in timing.
> >
>
> Ok, forgot to tell that we use a virtual changer with disk based
> volumes, so Media type is "File".
>
> > >
> > > We assume its a configuration problem but do not find any.
> >
> > Maybe also check your tape Device in bacula-sd.conf
> >
> > Device {
> > ...
> > Always Open = yes;
> > ,,,
> > }
> >
> > If you don't have "Always Open = yes", tape devices are spooled back
> > and
> > forth, because bacula does open/close at least for each migration job.
> >
> As the documentation said this option will be ignored for media type
> file.
>
> Ok, what i see is, that lseek at the begining of the job seems to work:
>
> > dss-bacula-sd: askdir.c:368-0 >dird CatReq
> > Job=MigrateToTape.2008-11-06_06.00.27 UpdateMedia VolName=000212
> > VolJobs=13 VolFiles=38 VolBlocks=459721 VolBytes=29657585664
> > VolMounts=3 VolErrors=0 VolWrites=6764239 MaxVolBytes=0
> > EndTime=1225984266 VolStatus=Append Slot=13 relabel=0 InChanger=1
> > VolReadTime=0 VolWriteTime=2478623570 VolFirstWritten=0 VolParts=0
> dss-bacula-sd: askdir.c:182-0 <dird 1000 OK VolName=000212 VolJobs=13
> VolFiles=38 VolBlocks=459721 VolBytes=29657585664 VolMounts=3
> VolErrors=0 VolWrites=6764239 MaxVolBytes=0 VolCapacityBytes=0
> VolStatus=Append Slot=13 MaxVolJobs=0 MaxVolFiles=0 InChanger=1
> VolReadTime=0 VolWriteTime=2478623570 EndFile=37 EndBlock=4956
> VolParts=0 LabelType=0 MediaId=35
> dss-bacula-sd: askdir.c:205-0 do_reqest_vol_info return true slot=13
> Volume=000212
> dss-bacula-sd: reserve.c:490-0 Dec reserve=0 dev="SL500-1-Drive-1"
> (/dev/nst0)
> dss-bacula-sd: dev.c:1580-0 ===== lseek to -1686243990
>
> Bacula starts with high transfer rates, then only reads the source
> volume up to his end.
but where is the unnecessary read in source code???
>
> courious, i don't understand this.
me too :-)
Hi
i also noticed that bacula reads much more as needed.
sar output below is recorded during Copy disk to tape
while spooling to disk is enabled.
i would expect bread/s ~ 2 * bwrtn/s
but as u see its up to 10 times
12:00:01 AM tps rtps wtps bread/s bwrtn/s
....
09:30:01 AM 2078.93 224.37 1854.56 40730.89 29267.22
09:40:02 AM 4043.36 1540.13 2503.23 275342.38 38271.24
09:50:01 AM 3595.00 1603.98 1991.02 306152.36 30414.74
10:00:01 AM 2383.88 1508.10 875.78 282534.59 13318.83
10:10:01 AM 5514.69 1502.04 4012.65 247868.19 61412.10
10:20:02 AM 6106.74 1577.26 4529.48 263784.18 69347.99
10:30:02 AM 4080.24 1637.17 2443.06 301743.85 37310.45
10:40:01 AM 1547.23 363.86 1183.37 57264.02 18028.25
Ulrich
>
> Sebastian
>
--
Ulrich Leodolter <ulrich.leodolter AT obvsg DOT at>
OBVSG
-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
_______________________________________________
Bacula-users mailing list
Bacula-users AT lists.sourceforge DOT net
https://lists.sourceforge.net/lists/listinfo/bacula-users
|